mratsim / constantine

Constantine: modular, high-performance, zero-dependency cryptography stack for verifiable computation, proof systems and blockchain protocols.

Fuzzing #54

Closed mratsim closed 1 year ago

mratsim commented 4 years ago

See https://github.com/status-im/nim-blscurve/pull/53/files for fuzzing with libFuzzer and AFL.

Regarding corpus creation, KLEE might be worth looking into: https://klee.github.io/, https://srg.doc.ic.ac.uk/klee18/talks/Zmyslowski-Feeding-the-Fuzzers-with-KLEE.pdf

Also, OSS-Fuzz is running ecc-diff-fuzzer: https://github.com/google/oss-fuzz/pull/3408, https://github.com/catenacyber/elliptic-curve-differential-fuzzer. However, there doesn't seem to be any differential fuzzer for pairing-based cryptography.

mratsim commented 4 years ago

Easy initial fuzzing targets:

Note: Coverage-guided fuzzers like libFuzzer try to trigger all code paths based on the branches in the code. Constantine's code is branch-free, which makes fuzzing harder. I still have to find the article/paper that mentioned this; apparently, for fuzzing, they reintroduced branches (how?) to help the fuzzer.

https://blog.fuzzing-project.org/31-Fuzzing-Math-miscalculations-in-OpenSSLs-BN_mod_exp-CVE-2015-3193.html

Fuzzing versus branch-free code

After reporting the bug I was asked by the OpenSSL developers if I could do a similar test on their HMAC implementation. I did that and the result is interesting. At first I was confused: A while after the fuzzing started, american fuzzy lop was only reporting two code paths. Usually it finds dozens of code paths within seconds.

This happens because cryptographic code is often implemented in a branch-free way. That means that there are no if-blocks that will execute different parts of the code depending on the input. The reason this is done is to protect against all sorts of side-channel attacks. This conflicts with the way modern fuzzers like american fuzzy lop or libFuzzer work. They use the detection of new code paths as a way to be smart about their inputs.

Pascal Cuoq on Friday, December 4. 2015: You can re-introduce, for the purpose of fuzzing, the if-then-elses that, for the purpose of avoiding timing attacks, have been made into constant-time selections with a patch similar to the one shown here for an old version of OpenSSL:

http://pastebin.com/rdLyQRVU
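To illustrate the idea of re-introducing branches for fuzzing, here is a minimal sketch in C. It is not the pastebin patch and not Constantine code: the function names (select_ct, select_branchy, select_u64) are hypothetical, and the only outside convention assumed is the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION macro that the libFuzzer documentation suggests defining for fuzzing builds.

```c
#include <stdint.h>

/* Constant-time select: returns a when ctl == 1, b when ctl == 0.
 * ctl must be exactly 0 or 1. There is no branch on ctl, so a
 * coverage-guided fuzzer sees a single code path here. */
static uint64_t select_ct(uint64_t ctl, uint64_t a, uint64_t b) {
    uint64_t mask = (uint64_t)0 - ctl; /* all ones if ctl == 1, else 0 */
    return b ^ (mask & (a ^ b));
}

/* Same result computed with an explicit branch, so that the fuzzer
 * records two distinct paths. Not safe against timing attacks. */
static uint64_t select_branchy(uint64_t ctl, uint64_t a, uint64_t b) {
    if (ctl) {
        return a;
    }
    return b;
}

/* Hypothetical dispatch point: fuzzing builds get the branchy version,
 * production builds keep the constant-time one. */
static uint64_t select_u64(uint64_t ctl, uint64_t a, uint64_t b) {
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    return select_branchy(ctl, a, b);
#else
    return select_ct(ctl, a, b);
#endif
}
```

Because both variants must agree on every input, the branchy version only changes what the fuzzer can observe, not what the code computes. The trade-off is that the fuzzed binary is no longer the production binary.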

paulmillr commented 3 years ago

cc @guidovranken who is a world-class fuzzer - he may find this library and fuzzing ideas interesting

guidovranken commented 3 years ago

Thanks for pinging me @paulmillr !

@mratsim My project supports a wide range of operations, including pairing cryptography and bignum operations, and it has found hundreds of bugs in major libraries. Let me know if you'd like to integrate a module for Constantine.

mratsim commented 2 years ago

Some progress towards fuzzing.

There is a new bindings generator, invoked with "nimble bindings", which generates a DLL for BLS12_381 and the Pasta curves along with the accompanying headers.

For now, serialization is restricted to the field elements Fp and Fr, and the DLL hasn't been tested at all.

Before running the actual code, the "NimMain" function (for example ctt_bls12381_NimMain) should be called. It populates CPU runtime detection (for now that's the only runtime state). On that note, I might add a pure C compilation target so that it can be fuzzed as well.

Example bindings: https://github.com/mratsim/constantine/blob/37354e9/bindings/generated/constantine_bls12_381.h

Example C code that loads the library and either property-based tests it or differential-fuzzes it against GMP in CI will be added in the future.
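As a rough idea of what such a differential harness could look like, here is a self-contained libFuzzer-style sketch. Everything in it is an assumption made for illustration: toy_addmod_ct stands in for the library under test, toy_addmod_ref stands in for a reference such as GMP, and TOY_P is a toy modulus. None of these names come from Constantine's generated headers. Only the LLVMFuzzerInitialize and LLVMFuzzerTestOneInput entry points are real libFuzzer conventions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_P UINT32_C(2147483647) /* toy modulus 2^31 - 1 (assumption) */

/* Stand-in for the code under test: branch-free (a + b) mod TOY_P.
 * Requires a, b < TOY_P, so a + b cannot overflow 32 bits. */
static uint32_t toy_addmod_ct(uint32_t a, uint32_t b) {
    uint32_t s = a + b;
    uint32_t t = s - TOY_P;                  /* wraps around when s < TOY_P */
    uint32_t mask = (uint32_t)0 - (t >> 31); /* all ones when it wrapped */
    return (s & mask) | (t & ~mask);
}

/* Stand-in for the reference implementation (GMP in a real setup). */
static uint32_t toy_addmod_ref(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a + (uint64_t)b) % TOY_P);
}

/* One-time setup hook. A real harness would call the library's
 * "NimMain" initializer here before any other library function. */
int LLVMFuzzerInitialize(int *argc, char ***argv) {
    (void)argc;
    (void)argv;
    return 0;
}

/* Called by libFuzzer for every generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    uint32_t a, b;
    if (size < 8) {
        return 0; /* not enough bytes for two operands */
    }
    memcpy(&a, data, 4);
    memcpy(&b, data + 4, 4);
    a %= TOY_P; /* reduce into the valid input range */
    b %= TOY_P;
    if (toy_addmod_ct(a, b) != toy_addmod_ref(a, b)) {
        abort(); /* mismatch: the fuzzer reports this as a crash */
    }
    return 0;
}
```

With clang, a harness like this is typically built with -fsanitize=fuzzer, which supplies main. The same comparison function can also be driven from a plain loop over random inputs for property-based testing in CI.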

mratsim commented 1 year ago

Constantine has been integrated into OSS-Fuzz in https://github.com/google/oss-fuzz/pull/10710 through https://github.com/guidovranken/cryptofuzz