guidovranken / cryptofuzz

Fuzzing cryptographic libraries. Magic bug printer go brrrr.
https://guidovranken.com/2019/05/14/differential-fuzzing-of-cryptographic-libraries/
GNU General Public License v3.0
673 stars 79 forks source link

Order of evaluation issues #81

Closed dergoegge closed 2 months ago

dergoegge commented 2 months ago

In C++: Order of evaluation of any part of any expression, including order of evaluation of function arguments is unspecified.

Cryptofuzz assumes a fixed order of evaluation which is fine when only using one compiler, but as soon as results are compared across compilers, differences will manifest due to different evaluation order choices of the used compilers.

For example, looking at the openssl module: https://github.com/guidovranken/cryptofuzz/blob/292701cc5e0aa0ae61e4b41ee5bead024ebb218e/modules/openssl/module.cpp#L3452

group, pub and prv all own a reference to the same Datasource and consume from it when GetPtr is called, but the order in which GetPtr will be called on each of the three is up to the compiler. Given the same fuzz input, this leads to a different EC_POINT_mul result between e.g. gcc and clang.

I'm not sure what the best approach would be to find and fix all instances of this (there were some suggestion on a Bitcoin Core PR (https://github.com/bitcoin/bitcoin/pull/29043) dealing with the same issue: using -Wsequence-point or a clang-tidy query, but they all have their limits).

So far, I've only observed this in the openssl module for OpECC_Point_Mul and OpECC_PrivateToPublic but I could see how there are others (probably most of the *ssl modules, they seem to have very similar code).

I'm happy to send patches for any occurrences I come across but a more comprehensive approach would obviously be some kind of linter to find all current and future occurrences.

guidovranken commented 2 months ago

Thank you for raising this issue. The code base is filled with instances of this. I asked an LLM and it came up with a few solutions:

Have you explored any of these?

dergoegge commented 2 months ago

All the approaches the LLM suggested don't actually work, except for "immediately invoked lambdas" (see https://godbolt.org/z/9K5rYsahP). That approach works because it pulls the function calls out into individual assignments.

I've drafted a branch that introduces a macro CF_ORDERED_EVAL that avoids having to manually type out the individual assignments (see https://godbolt.org/z/3qrhGvvaG). With the macro, the refactoring that is needed should usually be contained to one line:

CF_CHECK_NE(EC_POINT_mul(group->GetPtr(), res->GetPtr(), nullptr, a->GetPtr(), b.GetPtr(), nullptr), 0);
// Using CF_ORDERED_EVAL this would turn into:
CF_CHECK_NE(std::apply(EC_POINT_mul, CF_ORDERED_EVAL(group->GetPtr(), res->GetPtr(), nullptr, a->GetPtr(), b.GetPtr(), nullptr)), 0);

What do you think of this approach?

guidovranken commented 2 months ago

Upon second thought the worst that can happen with a discrepancy in order of evaluation is that one instance returns a result, while the other returns std::nullopt. Cryptofuzz' internal comparison code ignores all std::nullopt results. Does Semsan have some kind of mechanism to optionally ignore a result? If it does, that would solve it, without making all kinds of changes to the Cryptofuzz harnesses.

dergoegge commented 2 months ago

Does Semsan have some kind of mechanism to optionally ignore a result?

This isn't currently implemented but it would be possible and it sounds like the best option for now, so I'm gonna close this issue.