rust-fuzz / libfuzzer

Rust bindings and utilities for LLVM’s libFuzzer
Apache License 2.0
206 stars 44 forks source link

Gaps in rustc for effective fuzzing #92

Closed landaire closed 2 years ago

landaire commented 2 years ago

Hi folks,

I've been writing a Rust-based replacement for libFuzzer called Fazi, and I've encountered a couple of issues that have some crossover with libFuzzer usage and the potential for Fazi (or any Rust-based library) to replace libfuzzer for fuzzing rust code.

I apologize for this issue not strictly relating to this crate, but this is the best place I could think of for such discussion.

Issues with SanCov

You can compile a Rust crate with sancov instrumentation via:

$ cargo rustc -- \
    -C passes=sancov-module \
    -C llvm-args=-sanitizer-coverage-level=4 \
    -C llvm-args=-sanitizer-coverage-inline-8bit-counters \
    -C llvm-args=-sanitizer-coverage-trace-compares \
    -C llvm-args=-sanitizer-coverage-pc-table

When compiling a Rust binary that depends on another Rust library for fuzzing, we now have a scenario where everything in the Rust world is instrumented, including the fuzzing library itself. Even if I could mark the fuzzing crate as #[no_sanitize(coverage)], its dependencies (including stdlib) would likely still be instrumented.

This introduces at least two problems I encountered:

  1. Deadlocks from __sanitizer_cov_trace_{const_}_cmp* functions being executed while a lock is held within the fuzzing code
  2. Coverage inflation from the fuzzing library itself

Cannot use function interceptors

libFuzzer supports hooking a collection of builtin functions such as memcmp, strcmp, etc. by implicitly adding the-fno-builtin-memcmp argument when -fsanitize=fuzzer or -fsanitize=fuzzer-no-link are provided to the clang frontend.

As far as I can tell, disabling these builtins is only possible via clang and not supported with an LLVM argument . Support for this would likely fix issues such as #90 as the interceptors could be used.


So now for the questions I wanted to discuss:

  1. Have these issues been considered before?
  2. Are there any known workarounds that could work for either?
  3. If there are no known workarounds, should I just file an issue against Rust?
nagisa commented 2 years ago

I'm not sure I see a problem with the first point in this issue. Why is it a problem to compile library containing instrumentation separately, without the flags for instrumenting it and link it into the binary and/or libraries later on? That is pretty much exactly what would be happening in a C(++) project that uses a fuzzer implemented in C(++) as well, and solves precisely the same problem. The only thing you'd need to work around is cargo, possibly by building the instrumentation in a build.rs.


As for the second point, I'm not sure memcmp and strcmp has a significant semantic meaning for Rust, these are bog-standard foreign functions. memcpy sort-of does have some special handling in that LLVM tends to replace open-coded copy loops with it, but this is the only (relevant) one AFAIR.

It sounds to me that for Rust a fuzzer implementation will want a way to replace libstd/libcore in its entirety, which sounds hard, but also doesn't look avoidable due to just how many ways there are to copy and compare stuff in a different ways, due to monomorphization.

landaire commented 2 years ago

Why is it a problem to compile library containing instrumentation separately, without the flags for instrumenting it and link it into the binary and/or libraries later on?

It's not necessarily an issue, but creates a semi-awkward build step. I suppose the solution here is to use a build.rs script that builds source similar how this crate builds libfuzzer-sys.


As for the second point, I'm not sure memcmp and strcmp has a significant semantic meaning for Rust, these are bog-standard foreign functions.

It looks like my assumption here was wrong. I was concerned that there'd be no way to tell LLVM not to use an instristic memcmp() or to do some kind of special linking, but it appears as though an -fno-builtin-memcmp equivalent isn't required as my PoC works: https://github.com/landaire/fazi/commit/78efda651a46073240bdb8e28e083181ec8299f1.

Since the second point was my big concern that applied to this repo as well, I'm going to go ahead and close this issue. Thanks for letting me talk this out and rubber duck debug of sorts :)