bitcoin-core / secp256k1

Optimized C library for EC operations on curve secp256k1
MIT License
2.03k stars 983 forks source link

cross-testing #691

Open real-or-random opened 4 years ago

real-or-random commented 4 years ago

In #635 I was convinced that cross-testing is a good thing. It would be nice to have cross tests for

elichai commented 4 years ago

Just linking to https://github.com/bitcoin-core/secp256k1/pull/641 (which I stopped because there was no more discussion around if it's wanted and how to vendor the code)

real-or-random commented 4 years ago

@gmaxwell suggests micro-ecc in #716. I think that's by far the simplest alternative (simple, mostly portable C), see #716 for a detailed discussion. I'd prefer this over the other mentioned projects. What do people think?

real-or-random commented 2 years ago

Now after the OpenSSL tests are gone, we should make this a priority.

Here are some possibilities:

I think micro-ecc is a nice option and a good first choice. @elichai would you still interested in getting this in?

641 got stuck because there was no clear commitment to using Rust. I wouldn't opposed to other languages than C in the test but here's a suggestion that I find very attractive: We could compile the Rust down to (ugly) C using mrustc and vendor this C code. This would mean that we get the best of both worlds: One can compare to an independent simple implementation with the same API and one does not need a Rust compiler to run the tests. The same is true for secp256kfun but I don't know how independent that implementation is. (cc @LLFourn)

But before we decide on the way forward, here's another very interesting direction: A much more sophisticated approach to this kind of testing is cryptofuzz by @guidovranken, which does differential testings of all kinds of algorithms and libraries. For example, it's used to test libsecp256k1 against Botan on oss-fuzz (https://github.com/google/oss-fuzz/pull/5717).

@guidovranken Were you aware of our OpenSSL cross tests (removed in https://github.com/bitcoin-core/secp256k1/pull/983)? I really wonder how crytofuzz compares against these. I can imagine it would be superior to simply use cryptofuzz and maybe integrate more libraries such as some of the ones mentioned above.

guidovranken commented 2 years ago

Apart from testing sign/verify/recover, it is also a good idea to test point validation, addition and multiplication.

real-or-random commented 2 years ago

Thanks, that's a helpful overview! Unfortunately the Wycheproof tests do not match our semantics entirely, see https://github.com/google/wycheproof/issues/70 .

@guidovranken Can you also comment on the high-level ideas here? For example, when it comes to functionatily, do you think cryptofuzz is superior in every aspect, or do you think it would still make sense for us to maintain our own cross tests?

Of course there are also other considerations besides functionality, e.g., build integration etc. For example, if we think we stick to maintain own tests because it's easier to ship them to the user, I imagine we could still have a short run of cryptofuzz on CI.

LLFourn commented 2 years ago

The same is true for secp256kfun but I don't know how independent that implementation is. (cc @LLFourn)

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf). I was going to move secp256kfun over to using it as a backend eventually anyway.

guidovranken commented 2 years ago

@guidovranken Can you also comment on the high-level ideas here? For example, when it comes to functionatily, do you think cryptofuzz is superior in every aspect, or do you think it would still make sense for us to maintain our own cross tests?

A notable shortcoming of CF is that it only compares the results produced by various crypto libraries, but not the absence of results. For example it doesn't confirm the absence of a result in these cases:

Technically, each operation run in a crypto library returns a std::optional<Result>. It returns std::nullopt if it cannot compute a result for any reason, and it returns a Result if it can. Once the operation has been run in every crypto library, CF filters out all results which are std::nullopt. The remaining set is then compared for differences.

So in Python-pseudocode:

results = []

# Run operation in each crypto library
for module in modules:
    results += [ module.Run(operation) ]

# Filter out nullopt results
results = filter(lambda r: r != None, results)

# Detect discrepancies
if len(set(results)) != 1:
    crash()

This logic exists for a good reason in CF but the corollary is that it can miss certain types of bugs. I do want to address this at some point in the future but for now having a separate test suite is still valuable.

elichai commented 2 years ago

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf).

I'm not sure how true this is, See for example: https://github.com/RustCrypto/elliptic-curves/pull/59, https://github.com/RustCrypto/elliptic-curves/pull/82

LLFourn commented 2 years ago

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf).

I'm not sure how true this is, See for example: RustCrypto/elliptic-curves#59, RustCrypto/elliptic-curves#82

Right. To be clear I think they use different ECC addition/doubling formula -- I guess field/scalar arithmetic could be pretty similar. I wouldn't be the best judge of that. Still I think it's the best candidate if you want to try cross testing against a rust codebase.

real-or-random commented 2 years ago

This logic exists for a good reason in CF but the corollary is that it can miss certain types of bugs. I do want to address this at some point in the future but for now having a separate test suite is still valuable.

Can you elaborate what these good reasons are? Maybe we could help.

edit: Let me clarify that even if cryptofuzz would be perfect, I still believe it's useful to have simple cross tests in our test suite. cryptofuzz is nothing we will integrate in make check (naturally a lot of dependencies, etc). I think it's good to have something which is easy enough to be run by not only by us but also by our users, packagers, etc. It should ideally just work out of the box without any dependencies.

guidovranken commented 2 years ago

The "good reason" is that I allow every library to fail executing an operation for any or no reason. Reasons include:

In some cases it is dubious whether a certain function failing is signalling some specific information. For example, if a function which computes a modular inverse fails, is it due to an internal error (e.g. malloc failure) or is it signalling that the inverse does not exist? I need to review these on a case-by-case basis, so I cannot simply interpret failure as a bug by default.

Currently, the ECC point operations all return a std::optional<component::ECC_Point>. The component::ECC_Point struct is something like this:

struct {
    std::string X;
    std::string Y;
};

In hindsight it would have been better if this struct also included a field bool is_infinity. I could change this, but I would have to do it for all modules, otherwise the whole thing won't compile anymore, so that will take some time.

As a workaround I've started to add asserts to various libraries to ensure certain operations conform to certain expectations.

For example, in secp256k1, I now have:

I want to add similar asserts for point addition and point doubling, where the logic will be:

Feel free to specify additional invariants for any secp256k1 function/operation and I will implement it in the fuzzer.

real-or-random commented 2 years ago

@guidovranken Makes sense. Here's another random thought (that may be stupid): Did you ever consider adding an old version of libsecp256k1? We have done so many changes that this may be interesting. On the other hand, I don't know if this adds much to the diversity of implementations. (Maybe it's useful to cross-test new libsecp256k1 against old libsecp256k1 but not if you're anyway testing in parallel against all the other libraries?)

guidovranken commented 2 years ago

@real-or-random

Testing two versions of the same library in the same fuzzer binary is difficult due to symbol collisions, though it is possible by launching an external process.

After your suggestion I made some changes to the harness to allow compiling it with older versions of libsecp256k1 (specifically at the last commit of 2019 and the last commit of 2018 -- these were arbitrary choices): https://github.com/guidovranken/cryptofuzz/blob/master/modules/secp256k1/README.md

I ran the fuzzer on those and did not find anything amiss.

Are there any specific older versions that you think are worthwhile to test? Maybe specific releases of the library that are still in widespread use (for example by Ethereum)? I suppose it would be valuable to discover bugs in older versions even if they are fixed in the current version. Sometimes software projects fix bugs while addressing something else, without ever knowing there was a bug, so this can be meaningful. I can even add an additional fuzzer for an older libsecp256k1 version to the OSS-Fuzz project.

apoelstra commented 2 years ago

@guidovranken one potential source of old versions might be the bundled copies of libsecp256k1 in rust-secp256k1. These are in the directory secp256k1-sys/depend/secp256k1; the revision used is recorded in the file secp256k1-sys/depend/secp256k1-HEAD-revision.txt; and all of the externally linked symbols have been conveniently renamed to include version numbers.

Not commenting on whether this is a useful direction to go in or not, but if you did want to do it, here is a corpus.

real-or-random commented 2 years ago

Are there any specific older versions that you think are worthwhile to test?

I didn't have anything specific in mind. My suggestion was simply based on the observation that a lot of code has been changed during the keys, including replacement of entire internal algorithms (safegcd for modinv, or https://github.com/bitcoin-core/secp256k1/pull/1033/commits/557b31fac36529948709d4bfcc00ad3acb7e83b9 for a very recent example, ...)

I suppose it would be valuable to discover bugs in older versions even if they are fixed in the current version. Sometimes software projects fix bugs while addressing something else, without ever knowing there was a bug, so this can be meaningful.

Indeed, this was my thinking.

Maybe specific releases of the library that are still in widespread use (for example by Ethereum)?

Yeah, those are an obvious choice. A lot of projects vendor our code including (old releases of) Bitcoin Core, and lightning implementations, many many altcoins. I think these are good choices for secp256k1, as are the versions bundled by rust-secp256k1 or other bindings as mentioned by Andrew.

In the end I think there are so many valid choices that I think I would just pick one or two more or less arbitrary. And covering "time" as much as possible is useful and your approach of taking the last commit in a year does the job. I think a mix of old and more recent versions is interesting: A very old version still in use is good as a target to find bugs and more recent versions are good as reference because they tend to have the same API (and support Schnorrsig etc).

I think if you run cryptofuzz with libsecp256k1 plus 10 other libraries, then I'd suppose adding an old version does not add much if you want to find bugs in the current version. But if you want to find bugs in older versions of libsecp256k1, then of course it makes a lot of sense to run the old version.

I can even add an additional fuzzer for an older libsecp256k1 version to the OSS-Fuzz project.

(With my limited knowledge of the available resources there etc.,) this sounds interesting at least.