bitcoin-core / secp256k1

Optimized C library for EC operations on curve secp256k1
MIT License
2.07k stars 1k forks source link

secp256k1_scalar_cmov not constant-time on ICC #777

Open real-or-random opened 4 years ago

real-or-random commented 4 years ago

AMD64 ICC 19.1.2.254

==407113== Conditional jump or move depends on uninitialised value(s)
==407113==    at 0x485FB93: secp256k1_ec_pubkey_create (secp256k1.c:568)
==407113==    by 0x401490: main (valgrind_ctime_test.c:56)
==407113== 
==407113== Conditional jump or move depends on uninitialised value(s)
==407113==    at 0x48674AA: secp256k1_scalar_is_zero (secp256k1.c:521)
==407113==    by 0x48674AA: secp256k1_scalar_set_b32_seckey (scalar_impl.h:61)
==407113==    by 0x48674AA: secp256k1_scalar_cmov (secp256k1.c:496)
==407113==    by 0x48674AA: secp256k1_ecdsa_sign_inner (secp256k1.c:488)
==407113==    by 0x48670BE: secp256k1_ecdsa_sign_recoverable (main_impl.h:132)
==407113==    by 0x401821: main (valgrind_ctime_test.c:81)

Your volatile tricks do not fool ICC.

Originally posted by @gmaxwell in https://github.com/bitcoin-core/secp256k1/pull/772#issuecomment-664812966

real-or-random commented 4 years ago

The first two options are ok of this is ok as long as compilers don't start to play around with the cmov functions that we really use in the arithmetic routines.

At the moment, I tend to believe that we should keep the three functions, i.e., add the volatile to secp256k1_scalar_cmov. As a second step, we could try to benchmark https://github.com/bitcoin-core/secp256k1/pull/457#issuecomment-604446374 for the other functions. If it's ok to change this in the performance-critical functions (I doubt it), then we should also do it here. Otherwise, ASM cmovs are interesting.

gmaxwell commented 4 years ago

The only reservation I have about the volatile trick is that there is a long history of compilers emitting broken code in the presence of volatile because almost nothing uses it. This has improved in recent times (like post GCC 7-ish) Otherwise it seems pretty reasonable and clean too. I didn't raise a concern about this because the particular usage where the volatile variable is assigned once then read from with nothing too interesting going on (no complex control flow, no scopes, etc.) is probably a usage that is least likely to expose bugs. It's my unverified belief that the volatile should have ~no performance impact, or at least we should be able to make a version that does.

MISRA 2012 Rule 13.2 "The value of an expression and its persistent side effects shall be the same under all permitted evaluation orders". It demands that you make at most one read and at most one write. The existing workaround doesn't violate the spirit of the rule, but the issue could be avoided by making a volatile sandwich-- copy the input into a volatile and back out again-- if there is a performance loss from doing the work around (e.g. causing a memory read for each moved word in the cmov), I bet the volatile sandwich would fix it.

I dubious of other approaches. If a compiler can't be trusted to leave simple bit operations alone all bets are off. E.g. cmovs using multiplies are at greater risk from compilers aggressively trying to eliminate multiplies (which I think was a factor in the ecdh u_last issue).

x86/x86_64 assembly is pretty maintainable, but as you've probably noticed w/ arm ... it's not always so easy. I think we ought to try for something in C if we're at all able. ASM would let us be more sure and maybe faster-- where its supported.

real-or-random commented 4 years ago

MISRA 2012 Rule 13.2 "The value of an expression and its persistent side effects shall be the same under all permitted evaluation orders". It demands that you make at most one read and at most one write. The existing workaround doesn't violate the spirit of the rule, but the issue could be avoided by making a volatile sandwich-- copy the input into a volatile and back out again-- if there is a performance loss from doing the work around (e.g. causing a memory read for each moved word in the cmov), I bet the volatile sandwich would fix it.

Sorry I can't follow. How is the existing workaround related to this rule?

How would a volatile sandwich avoid memory accesses? Wouldn't it even introduce a second memory access?

I agree with all you say otherwise.

gmaxwell commented 4 years ago

fun(int x) volatile int tmp1 = x; int tmp2 = tmp1;

then use tmp2. So there is only one read from tmp1 and it can make a register copy of it.

real-or-random commented 4 years ago

fun(int x) volatile int tmp1 = x; int tmp2 = tmp1;

then use tmp2. So there is only one read from tmp1 and it can make a register copy of it.

We talked somewhere else and figured out that @gmaxwell was under the wrong impression that the current volatile code is not of this form. But it is of this form. So this is not an issue.