memorysafety / zlib-rs

A safer zlib
zlib License
74 stars 6 forks source link

Consider respecting cfg!(fuzzing) #115

Open fintelia opened 2 weeks ago

fintelia commented 2 weeks ago

The cfg!(fuzzing) flag is designed as a way to disable checksums validation and similar checks when Rust code is run under fuzzing. This would be helpful to enable differential fuzzing between the many existing Rust zlib implementations.

folkertdev commented 2 weeks ago

I don't really follow why calculating checksums would be a problem for fuzzing? the checksum is entirely deterministic, and other implementations should generate the same checksum for the same input bytes. Can you elaborate on what exactly we should do differently when cfg!(fuzzing) is enabled?

fintelia commented 2 weeks ago

This would be for the inflate codepath. There the fuzzer would be generating random sequences of bytes and feeding them to the library as zlib bitstreams. Most (at least initially) would be corrupt for one reason or another, but over time the fuzzer would start to find bitstreams that were valid other than the checksum. Rather than forcing the fuzzer to also look for valid checksums, you can instead have the decompressor return success even when the checksum doesn't match.

This is where miniz_oxide ignores the checksum and here is a fuzz test that found multiple bugs in fdeflate and miniz_oxide by comparing their behavior to each other.

folkertdev commented 2 weeks ago

that's interesting. I think a lot of code paths are already covered by the raw (so, no header, no footer) deflate stream but control flow does branch on the raw/zlib/gzip mode.

So, I just naively tried this, and it doesn't seem to do anything?

> cargo +nightly fuzz run inflate_random_bytes

warning: unexpected `cfg` condition name: `fuzzing`
   --> /home/folkertdev/rust/zlib-rs/zlib-rs/src/inflate.rs:901:22
    |
901 |             if !cfg!(fuzzing) {
    |                      ^^^^^^^

so, there must be more to it than that? But the docs suggest to me that the above should just work.

fintelia commented 2 weeks ago

That is unfortunately expected on recent nightlies. The compiler now "helpfully" prints that even though the fuzzer enables the cfg!(fuzzing) flag

folkertdev commented 2 weeks ago

It turns out this is not as easy as it seems.

We currently test against zlib-ng in compat mode (so it has the api of the original zlib). We aim to be a drop-in replacement for zlib, so matching its behavior is important.

In our uncompress_random_input fuzzer, we compare our output with that of zlib-ng. The api here does not give any error information beyond the return code integer. Specifically, we cannot know whether a DataError means that the checksum was incorrect, or that something else failed.

So if under fuzzing we allow incorrect checksums, that fuzzer will now no longer detect the case where zlib-ng returns a DataError for some other reason, and we just accept the input.

Is fuzzing against zlib-rs something you're interested in for fdeflate? if that is the actual usecase we could add some __unsafe_skip_checksum_check flag which your fuzzer could use.

fintelia commented 1 week ago

That is tricky to deal with. I don't specifically need this for testing fdeflate, so feel free to close

I would recommend designing an interface that exposes the precise error cause. Probably isn't that helpful for end users, but it can make fuzzing differences easier to root cause.