facebook / relay

Relay is a JavaScript framework for building data-driven React applications.
https://relay.dev
MIT License
18.41k stars 1.83k forks source link

Integration with google/oss-fuzz for continuous fuzzing #4566

Open nathaniel-brough opened 10 months ago

nathaniel-brough commented 10 months ago

Hey relay team,

I've recently become interested in relays new rust compiler. I'd like to suggest and champion an effort to set up some basic fuzz-testing and combine it with google/oss-fuzz for continuous fuzzing. I'm fully aware that you are very busy people and I don't want to overload your review/maintenance capacity. Is this a bad time to discuss potential security/reliability improvements?

If you're not familiar with fuzzing or oss-fuzz I've included a few brief notes below.

Benefits of Fuzz-Testing

Google/oss-fuzz for Continuous Fuzzing

I’d be more than happy to lead the effort in integrating fuzz testing with the relay compiler and assist in any way required.

Prior integrations

There have been a number of previous integrations completed with facebook repositories and google/oss-fuzz including;

As a proof of concept I created a super simple fuzz harness for the graphql-syntax crate in #4565

captbaritone commented 9 months ago

Thanks for suggesting this. Fuzzing is certainly interesting, and something I've often wished we could make better use of. I'd be curious to learn a bit more:

  1. What are the advantages to doing this within the Relay repo? It looks like, at first glance that one could construct and run a fuzzer on their own (perhaps with a the addition of the derived Arbitrary trait) and then simply report any issues as regular bugs.
  2. What type of bugs do we expect this to be able to catch? Inputs which would make the parser panic only?
  3. Do you have interest in taking this further? Do you see other crates that might be able to be fuzzed? If so, which ones and what type of bugs do you think they could catch?
  4. What's involved with setting up OSS-Fuzz? Does that need to be configured/requests/approved by the official maintainers?

Thanks again for bringing up the idea!

nathaniel-brough commented 9 months ago

What are the advantages to doing this within the Relay repo? It looks like, at first glance that one could construct and run a fuzzer on their own (perhaps with a the addition of the derived Arbitrary trait) and then simply report any issues as regular bugs.

You certainly could host the fuzzer elsewhere. There are a few advantages of keeping the fuzzer's in the relay repo;

  1. It's easier to keep the fuzzers in sync with API changes. In my experience of creating fuzzer's separate to the main repository the fuzz-harnesses inevitably bitrot. It's also usually the case that it's nicer to find bugs at HEAD rather than at some locked version.
  2. It allows for things like CIFuzz which will build and run the fuzzers for a short period on every pull request. This prevents shallow bugs before they are merged. This differs somewhat from the google/oss-fuzz in that oss-fuzz will build/run the latest commit nightly on a distributed cluster for a few hours per night, targeting deeper bugs.

It's also worth noting that google/oss-fuzz can be configured to automatically open issues on github when bugs are found by the fuzzers.

What type of bugs do we expect this to be able to catch? Inputs which would make the parser panic only?

There are a few different classes of bugs that these fuzzers will be able to find;

  1. As you've mentioned anywhere there is a possibility of a panic, in fact after running the fuzzer in #4565 for a couple of hours today it's started finding a lot of (non-critical) bugs. I'll open up an issue shortly with more details.
  2. Memory corruption bugs e.g. buffer overflows. While these aren't common in "safe" code, cargo-fuzz will instrument all code (including third party c/c++ -sys crates), with the address sanitizer. I haven't had a detailed look over relay's deps but memory curruption errors are more common anywhere that FFI is involved.
  3. Arithmetic errors e.g. integer overflow etc.
  4. Property testing inconsistencies. This requires a little more setup, but generally involves writing a normal fuzzer and then asserting that some invariant remains true. e.g. when fuzzing de/serializer's it's pretty common to assert that roundtrip serialisation->deserialisation get's you back to the original. Something like;
fn fuzz_roundtrip(input: &str) -> Result<()> {
  let decoded = decode(input)?;
  let encoded = encode(decoded)?;
  assert_eq!(input, encoded);
  Ok(())
}

Do you have interest in taking this further? Do you see other crates that might be able to be fuzzed? If so, which ones and what type of bugs do you think they could catch?

Yeah for sure. I would break this down into two major stages;

  1. Fuzz the compiler, with a cursory search here are the crates I'd consider to be low hanging fruit for fuzzing;
    • docblock-syntax: Seems trivial to fuzz in a similiar manner to graphql-syntax. Most likely would just expect to find ways to crash the parser.
    • graphql-ir: Similar to above ^^
    • schema: Looks like a pretty easy and involve a minor layer on top of graphql-syntax crate.
    • ... the other crates are definitely fuzz-able, but will likely involve more intrusive changes.
  2. Fuzz the javascript side (longer term) I'm not as familiar with fuzzing in javascript, so this might take me a little longer. Javascript fuzzer's can also be integrated into oss-fuzz.

What's involved with setting up OSS-Fuzz? Does that need to be configured/requests/approved by the official maintainers?

There is an application process, which I can complete on your behalf, in order to do so I'll need a few things from you;

I'm fairly confident given the popularity of this repository that it'd be accepted, but there is a non-zero chance that it could be rejected. Typically they will accept all repositories with a criticality_score > 0.70, relay has a score of 0.74.

There's a lot of information there, so feel free to ask more questions/clarify things :)

captbaritone commented 9 months ago

Thanks for the response! Maybe we can start with the bugs you already found and see if it feel like the kind of bug reports that provide value? I'm a little concerned that this is going to add complexity/fagility and result in bugs that, while technically bugs, are more of a distraction than helpful.

That said, perhaps I'm underestimating what kinds of issues we might find via property testing? But then again, that obviously involves more complexity/fragility as well.

nathaniel-brough commented 9 months ago

Thanks for the response! Maybe we can start with the bugs you already found and see if it feel like the kind of bug reports that provide value? I'm a little concerned that this is going to add complexity/fagility and result in bugs that, while technically bugs, are more of a distraction than helpful.

Sounds good! I'm also happy to fine tune these fuzzers as we go until I get them to a point where it's mostly reporting interesting bugs.

That said, perhaps I'm underestimating what kinds of issues we might find via property testing? But then again, that obviously involves more complexity/fragility as well.

Yeah property testing is really useful for finding logical bugs. But it requires an in depth understanding of the code and intended design to ensure that you are actually writing property tests that are correct. So typically I'll start with a fuzzer that just tries to make things crash or trigger memory corruption, then I'll follow up with property testing fuzzer's once I'm more familiar with the code base. There is certainly some additional complexity here, but would you mind expanding on what you mean by fragility in this context?

My experience is that fuzz-testing harnesses are generally more robust than unit-tests simply because they are usually less tightly-coupled than a typical unit test. That's not to say fuzz harnesses are better/worse than unit-tests, more that they are complimentary and have different strengths/weaknesses. But I get the feeling that you might be referring to fragility in some other context.

captbaritone commented 9 months ago

Great! Looking forward to any bugs you're able to uncover and going from there!

would you mind expanding on what you mean by fragility in this context?

Mostly that the property test assertions/invariants would be tightly coupled to the phases they validate. If we want to evolve those structures, we would also need to make changes to the property tests or risk them breaking. It just adds more overhead to some set of changes. Probably these changes are infrequent, and perhaps the changes to the property tests would be simple? I'm trying to weigh the cost of that overhead against the unknown value of the signal it might provide.

Thanks for being open to an incremental approach! Looking forward to exploring this with you!

nathaniel-brough commented 9 months ago

So after analyzing all 333 crashes, and de-duplicating stack traces (which oss-fuzz does automatically) there where a total of 3 bugs found by the fuzzer. None of which look like critical issues, but all of which seem to be logical issues, that could potentially cause headaches at some point if not fixed:)