supranational / blst

Multilingual BLS12-381 signature library
Apache License 2.0
463 stars 178 forks source link

Windows: bool is 4-bytes by default #189

Closed michaelsproul closed 1 year ago

michaelsproul commented 1 year ago

On Windows, the __STDC_VERSION__ value isn't defined by default, and requires a specific compiler flag to be added (e.g. /std:c11). As a result, blst's fallback logic to #define bool int triggers:

https://github.com/supranational/blst/blob/78fee18b25e16975e27b2d0314f6a323a23e6e83/bindings/blst.h#L28-L31

This creates issues for downstream dependencies that #include "blst.h" and don't expect it to re-define bool to int and for blst's own language bindings, which might assume that bool is 1-byte. E.g. in the Rust bindings, bool appears in both return value and argument position:

https://github.com/supranational/blst/blob/78fee18b25e16975e27b2d0314f6a323a23e6e83/bindings/rust/src/bindings.rs#L255-L257

https://github.com/supranational/blst/blob/78fee18b25e16975e27b2d0314f6a323a23e6e83/bindings/rust/src/bindings.rs#L279-L281

Rust expects bool to always be 1-byte, so this is liable to crash at runtime due to misaligned reads/writes. The developers of the c-kzg-4844 library spent several weeks debugging mysterious segfaults on Windows before tracing it back to this #define bool int (see https://github.com/ethereum/c-kzg-4844/pull/354). Granted, this is also an issue with Rust's bindgen assuming that C's bool matches its own.

In terms of fixes, I think there are a few options:

dot-asm commented 1 year ago

Rust expects bool to always be 1-byte, so this is liable to crash at runtime due to misaligned reads/writes.

  1. What Rust does internally is formally irrelevant in the context, because it's obliged to interface with blst by C rules, due to extern "C" declaration.
  2. The misalignment comes into picture only if you actually store the value and misinterpret its type. While what we are looking at here is argument passing. One could discuss misalignment only if the values were passed by reference, which is not the case here.
  3. Essential to note that argument passing is not part of the language standard and is left to implementation, specification of which is referred to as calling convention.
  4. All supported C calling conventions, including Microsoft, are known to widen narrow arguments being passed to and from subroutine to at least 32 bits. ["At least" refers to the fact that on supported 64-bit platforms widening implies zeroing the upper half of the corresponding argument register.]
  5. Relying on this is not a result of an oversight, but a conscious choice driven by security requirements, constant-time-ness to be specific.

This is not to say that the definition in question can't possibly cause problems elsewhere, only that the referred argument is effectively moot in the specifica context of this query. BTW, with 1. in mind, if the problem emerged with a Rust update, one can wonder if Rust broke the contract and should be held responsible. Though I still fail to imagine how would it be a problem on argument passing...

As for what to do. One has to recognize that blst.h is not used when the C part or Rust bindings are compiled. So that passing /std:c11 in blst build script won't make any difference. The only thing one can do is to see if bool is defined and act accordingly. I'll make a suggestion later on...

michaelsproul commented 1 year ago

Thanks for the context and corrections! I see now that blst.h and the Rust bindings are a safe interface to blst if used without additional C code. The problem in c-kzg-4844 was that it included blst.h and then defined several functions with * bool arguments:

https://github.com/ethereum/c-kzg-4844/blob/d35b0f3854ab114b48daa9b504f6ee085c61508a/src/c_kzg_4844.h#L210-L217

This made it possible for unaligned and out-of-bounds accesses to occur when the caller passed a pointer to a single byte (with byte alignment). I think it broke when we updated Rust/LLVM because we were relying on undefined behaviour -- the compiler was within its rights to do whatever it wanted. I also agree that this is a deficiency of Rust's bindgen, which ideally would have run the C preprocessor and realised that bool was 4 bytes.

I'm not an expert in blst's code nor the details of C, so I don't mind if you ignore my suggested fixes. I think you're better positioned to come up with something that fits with the rest of the codebase and your design goals. Thanks :)

dot-asm commented 1 year ago

bool *

Yes, misinterpreting type would be a problem in this case.

deficiency of Rust's bindgen, which ideally would have run the C preprocessor...

It actually does that. And then it maps some of the C types directly into corresponding Rust types. For example C _Bool to bool, size_t to usize, uint32_t to u32, etc. It's very much appreciated, because it allows you to avoid excessive as casts on the Rust side. Just in case, yes, you have to put the C preprocessor in position to expose specifically _Bool to get the desired mapping. Which by the way is why it's a macro in blst.h and not a typedef. (And as already mentioned, it works reliably for by-value arguments even if there is a type mismatch, thanks to the way calling conventions are specified.)

As for committed fix. One could extend __STDC_VERSION__>=199901 with || _MSC_VER >=1928, but I've chosen to not play the Microsoft mind games:-)

Thanks!