Spectre-v1 safety: How to prevent merging of asm volatile

andyhhp commented 2 years ago

Upstream patch/discussion: https://lore.kernel.org/xen-devel/20220425175603.21086-1-andrew.cooper3@citrix.com/ Godbolt example: https://godbolt.org/z/a3c94e61W

Xen has a construct called evaluate_nospec() for Spectre v1 safety, which does its hardest to insert lfence instructions at the head of both basic blocks following a conditional jump. This is attempted with:

static always_inline bool barrier_nospec_true(void)
{
    asm volatile ("lfence" ::: "memory");
    return true;
}
static always_inline bool eval_nospec(bool cond)
{
    return cond ? barrier_nospec_true() : !barrier_nospec_true();
}

which is intended to be used with code such as:

void test(int x)
{
    if ( eval_nospec(x) )
        asm volatile ("nop #nospec-if");
    else
        asm volatile ("nop #nospec-else");
}

to generate code such as:

test:                              # @test_good
        testl   %edi, %edi
        je      .LBB1_2
        lfence
        nop
        retq
.LBB1_2:
        lfence
        nop
        retq

However, this doesn't compile as expected under Clang. Clang chooses to merge both sides of the ternary in eval_nospec() and hoist the the lfence up ahead of the je, resulting in:

test:                               # @test_bad
        lfence
        testl   %edi, %edi
        je      .LBB0_2
        nop
        retq
.LBB0_2:
        nop 
        retq

which is no longer safe to Spectre v1.

After some experimentation, it turns out that putting different comments in the asm volatile()'s prevents the two sides being merged, which has intended safety property. But this is also very fragile.

Is there a supported way of ensuring the two asm volatile()'s don't get merged, which isn't as hacky as depending on magic comments? If not, can we discuss how to create a supported way of adding Spectre v1 safety which the optimiser won't interfere with?

efriedma-quic commented 2 years ago

In general, it's hard to preserve Spectre-related properties of code without the compiler's cooperation. From a compiler engineering perspective, it's lot easier to just describe the properties you want the generated code to have, and let the compiler figure out how to satisfy them. For example, compiler flags like -mspeculative-load-hardening, and -mseses . (Although neither of those quite do what you want here, I think?)

If you want a particular sequence of test+branch+lfence, without telling the compiler what you're actually trying to do, the reliable way to do that is to put the entire sequence inside an inline asm block. (You can use "asm goto" to jump out afterwards.) Granted, even then, I'm not sure that preserves all the properties of the code you care about.

nickdesaulniers commented 2 years ago

Does this not work?

#include <stdbool.h>

#define CONFIG_SPECULATIVE_HARDEN_BRANCH

bool evaluate_nospec(bool condition) {
#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
    asm goto (
        "test %0, %0\n\t"
        "je %l1\n\t"
        "lfence\n\t"
        "nop"
        :: "r"(condition) :: fail);
    return true;
fail:
    asm("lfence\n\t"
        "nop");
    return false;
#else
    return condition;
#endif
}

https://godbolt.org/z/e93xY968e

andyhhp commented 2 years ago

In general, it's hard to preserve Spectre-related properties of code without the compiler's cooperation.

:slightly_smiling_face: Tell me about it. Almost 5 years, and the mess is still going strong...

I'm aware of Speculative Load Hardening, and it is (to my knowledge) still not safe for kernel mode (needs to invert it's use of 0's and -1's with respect to pushing a speculatively-out-of-bounds pointer to a safe address).

I wasn't aware of SESES, but it's buggy. It makes exactly the same code generation mistake as this example with Spectre v1. https://godbolt.org/z/jGPn116YM

seses_broken:                           # @seses_broken
        testl   %edi, %edi
        lfence                          # <--- This lfence is misplaced.  There should be two,
        je      .LBB0_2
                                        # <--- one here, and
        nop
        popq    %rax
        lfence
        jmpq    *%rax
.LBB0_2:
                                        # <--- one here.
        nop
        popq    %rax
        lfence
        jmpq    *%rax

As for the code, we specifically do not want to hardcode test+branch+lfence. We want something that can be written as above, where the compiler can choose the appropriate compare&branch sequence.

In fact, I discovered this code generation problem when trying to optimise eval_nospec() using __builtin_constant_p() to avoid emitting an lfence when the if condition does collapse into a constant. Another case which is safe is if the expression can be converted down into a cmov, but I haven't thought of any way (other than a compiler extension) to cover that case.

llvm / llvm-project

Spectre-v1 safety: How to prevent merging of asm volatile #55084