Ability to fuzz wasm-compiled contracts

brson commented 1 year ago

What problem does your feature solve?

Fuzz testing currently requires compiling contracts to native code. Wasm code is a black box to the fuzzer and it gets no feedback from it.

For the most part, developers can adapt and use natively compiled contracts. But what if they want to fuzz contracts they don't have the source to? Maybe their contracts call third party contracts that are wasm only.

What would you like to see?

Compiling a program for fuzzing creates undocumented global tables that get frobbed by the instrumented natively-compiled code. wasmi can be modified to also frob these tables in the same way as LLVM does.
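As a rough illustration of what "frobbing the tables" could look like on the interpreter side, here is a minimal Rust sketch. `CounterTable` and `hit` are hypothetical names, not wasmi or LLVM APIs, and the wrapping increment is an assumption about how 8-bit coverage counters are typically bumped.

```rust
/// Hypothetical counter table: one 8-bit counter per instrumented wasm branch.
/// This is an illustration only, not an existing wasmi or LLVM type.
pub struct CounterTable {
    counters: Vec<u8>,
}

impl CounterTable {
    /// Allocate one zeroed counter per known branch site.
    pub fn new(num_branches: usize) -> Self {
        Self { counters: vec![0; num_branches] }
    }

    /// Called by the interpreter each time it executes branch `index`.
    /// Out-of-range indices are ignored; the counter wraps on overflow
    /// (an assumption of this sketch).
    pub fn hit(&mut self, index: usize) {
        if let Some(c) = self.counters.get_mut(index) {
            *c = c.wrapping_add(1);
        }
    }

    /// Number of branch sites that have been visited at least once.
    pub fn covered(&self) -> usize {
        self.counters.iter().filter(|&&c| c != 0).count()
    }

    /// Read-only view of the raw counters.
    pub fn counters(&self) -> &[u8] {
        &self.counters
    }
}
```

In a real integration the counter memory would have to be the same memory the fuzzer was told about, which this sketch does not attempt.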

I used to have a link to Rust definitions of these tables, but seem to have lost it. fitzgen can help with the technical details.

This would be some hacky code and maybe not worth the maintenance burden.

What alternatives are there?

Just don't do it.

brson commented 9 months ago

https://github.com/rust-fuzz/sancov interoperates with LLVM's SanitizerCoverage/libFuzzer. wasmi can use it to, e.g., insert instrumentation on branches.

brson commented 8 months ago

I pinged fitzgen about this subject and he said that although he wrote the sancov bindings, he could not actually get them to work. He was trying to dynamically create counters and libfuzzer would not see them.

So if we want to try to do this, the starting point will be those sancov bindings, but we're on our own.

brson commented 8 months ago

@graydon did successfully interop with sancov in this project: https://github.com/graydon/photesthesis/blob/main/src/test.cpp

So there's some more code to cross-reference.

Nick said if we get the sancov bindings to do anything useful he'd be happy to take contributions.

brson commented 7 months ago

I have done some initial experiments toward making wasm fuzzable with cargo-fuzz / libfuzzer, and understand better the basic problems that need to be solved.

I think it is doable, but the effort is significant.

We'll end up doing at least the following:

modifying wasmi to register branches and function entries for every wasm module, and increment counters when instances hit them
implementing a custom __sanitizer_symbolize_pc function to symbolize both native and wasm function names
implementing a soroban fuzz command to either wrap or replace cargo-fuzz (because cargo-fuzz will need to be invoked with --sanitizer=none to avoid linker errors).
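The first item could be sketched roughly as follows. Everything here (`Site`, `CoverageRegistry`, the `u64` module id) is hypothetical and not part of wasmi; the sketch keys counters by module rather than by instance, so all instances of the same module share counters.

```rust
use std::collections::HashMap;

/// Hypothetical description of an instrumentation site inside a wasm module.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub enum Site {
    FuncEntry { func: u32 },
    Branch { func: u32, instr: u32 },
}

/// Hypothetical registry mapping (module id, site) to a counter slot.
#[derive(Default)]
pub struct CoverageRegistry {
    slots: HashMap<(u64, Site), usize>,
    counters: Vec<u8>,
}

impl CoverageRegistry {
    /// Register a site; registering the same (module, site) twice
    /// returns the same slot instead of allocating a new counter.
    pub fn register(&mut self, module: u64, site: Site) -> usize {
        let next = self.counters.len();
        let slot = *self.slots.entry((module, site)).or_insert(next);
        if slot == next {
            self.counters.push(0);
        }
        slot
    }

    /// Increment the counter for a previously registered slot.
    pub fn hit(&mut self, slot: usize) {
        self.counters[slot] = self.counters[slot].wrapping_add(1);
    }

    /// Current value of one counter.
    pub fn count(&self, slot: usize) -> u8 {
        self.counters[slot]
    }

    /// Total number of registered sites.
    pub fn len(&self) -> usize {
        self.counters.len()
    }
}
```

A growable `Vec` is convenient for a sketch, but counters handed to a fuzzer would presumably need a stable address, so a real version would likely allocate fixed-size tables per module.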

There are three basic components that collaborate to fuzz LLVM-compiled code:

The instrumented code. Emitted by LLVM. This does several things. First, it emits setup calls to both __sanitizer_cov_8bit_counters_init and __sanitizer_cov_pcs_init. The 8-bit counters record visits to PCs (program counters), and the PCs describe code locations. Both functions must be called with the same number of entries or the fuzzer won't work; this may be why fitzgen never got the sancov crate to do anything useful. When collecting fuzzing information, libfuzzer cross-references the PCs for incremented counters to do things like symbolicate function addresses. The instrumented code also increments the counters on branches.
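One way to make the "same number of entries" requirement hard to violate is to build both tables from a single list of sites. The sketch below assumes the PC table layout described in LLVM's SanitizerCoverage documentation (pairs of pointer-sized PC and flags, with bit 0 of the flags marking a function entry); `CoverageTables` and its methods are hypothetical names, and the sketch only produces the pointer ranges without calling the init functions.

```rust
/// Two pointer-sized words per entry: the PC and its flags.
/// Bit 0 of `flags` marks a function-entry block.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PcTableEntry {
    pub pc: usize,
    pub flags: usize,
}

/// Hypothetical holder that always keeps counters and PCs the same length.
pub struct CoverageTables {
    counters: Box<[u8]>,
    pcs: Box<[PcTableEntry]>,
}

impl CoverageTables {
    /// Build both tables from one list of (pc, is_function_entry) sites.
    pub fn new(sites: &[(usize, bool)]) -> Self {
        let pcs: Vec<PcTableEntry> = sites
            .iter()
            .map(|&(pc, is_entry)| PcTableEntry { pc, flags: is_entry as usize })
            .collect();
        Self {
            counters: vec![0u8; pcs.len()].into_boxed_slice(),
            pcs: pcs.into_boxed_slice(),
        }
    }

    /// Start/end pointers in the shape __sanitizer_cov_8bit_counters_init expects.
    pub fn counter_range(&mut self) -> (*mut u8, *mut u8) {
        let r = self.counters.as_mut_ptr_range();
        (r.start, r.end)
    }

    /// Start/end pointers in the shape __sanitizer_cov_pcs_init expects.
    pub fn pc_range(&self) -> (*const PcTableEntry, *const PcTableEntry) {
        let r = self.pcs.as_ptr_range();
        (r.start, r.end)
    }

    /// Read-only view of the PC table.
    pub fn pcs(&self) -> &[PcTableEntry] {
        &self.pcs
    }
}
```

Boxed slices are used so the tables cannot grow or move after the ranges have been handed out, as long as the `CoverageTables` value itself stays alive.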

Libfuzzer. This does probably too many things. Primarily it implements __sanitizer_cov_8bit_counters_init and __sanitizer_cov_pcs_init and tracks the coverage. It occasionally calls __sanitizer_symbolize_pc and other sanitizer functions to symbolicate addresses, etc. It implements a GUI that prints coverage information to the terminal. I think it implements the default mutator that chooses the next input bytes.

Some sanitizer. The sanitizers all seem to implement common __sanitizer_* functions which are called by libfuzzer. This is why e.g. in #1056 we were able to work around a bug on macOS by mysteriously using thread sanitizer instead of address sanitizer - they both provide the same common functions. Of particular interest is __sanitizer_symbolize_pc, which turns a PC into a function name for display, a very gnarly system-dependent operation. On Linux at least, the fuzzer seems to be able to operate, with degraded capabilities, with no sanitizer at all (passing --sanitizer=none to cargo-fuzz) - all the sanitizer functions are "weak".
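To give a feel for what a replacement symbolizer would have to do for wasm code, here is a hedged Rust sketch of the lookup and buffer-filling part. In the sanitizer headers the C entry point is declared as `void __sanitizer_symbolize_pc(void *pc, const char *fmt, char *out_buf, size_t out_buf_size)`; the sketch ignores the format string entirely. `WasmSymbols`, its methods, and the `in wasm::` prefix are all hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical table of wasm function names keyed by their starting PC.
#[derive(Default)]
pub struct WasmSymbols {
    by_start: BTreeMap<usize, String>,
}

impl WasmSymbols {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record that a function named `name` starts at `start_pc`.
    pub fn add(&mut self, start_pc: usize, name: &str) {
        self.by_start.insert(start_pc, name.to_string());
    }

    /// Find the function with the greatest start address <= `pc`.
    pub fn lookup(&self, pc: usize) -> Option<&str> {
        self.by_start
            .range(..=pc)
            .next_back()
            .map(|(_, name)| name.as_str())
    }

    /// Write a NUL-terminated description of `pc` into `out`, truncating
    /// if the buffer is too small, the way a C-style out buffer is filled.
    pub fn symbolize_into(&self, pc: usize, out: &mut [u8]) {
        if out.is_empty() {
            return;
        }
        let text = match self.lookup(pc) {
            Some(name) => format!("in wasm::{}", name),
            None => "<unknown>".to_string(),
        };
        let n = text.len().min(out.len() - 1);
        out[..n].copy_from_slice(&text.as_bytes()[..n]);
        out[n] = 0;
    }
}
```

A real override would wrap something like this in an exported C-ABI function and fall back to native symbolization for PCs that are not wasm.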

The big problem we are going to run into is that these components are designed with the expectation that PCs live in the address space of the running program, but with the wasm interpreter we may have many running programs inside the native running program. The big implication of this is that the existing sanitizers are not sufficient to symbolicate our PCs. We'll also need to come up with a scheme to distinguish between PCs of the native program and PCs of (multiple) instances of wasm programs.
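One possible scheme, sketched here purely as an assumption and not something libfuzzer is known to accept, is to pack a tag bit, a module id, and a code offset into a 64-bit synthetic PC. It relies on the further assumption that native user-space code addresses never have the top bit set on the 64-bit targets we care about. The bit widths and all names are hypothetical.

```rust
/// Top bit marks a synthetic wasm PC (assumed never set for native code).
const WASM_TAG: u64 = 1 << 63;
/// Hypothetical split of the remaining 63 bits: 23 for the module, 40 for the offset.
const MODULE_BITS: u32 = 23;
const OFFSET_BITS: u32 = 40;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Pc {
    Native(u64),
    Wasm { module: u32, offset: u64 },
}

/// Pack a module id and code offset into a synthetic PC.
/// Returns None if either value does not fit in its bit field.
pub fn encode_wasm_pc(module: u32, offset: u64) -> Option<u64> {
    if (module as u64) >= (1u64 << MODULE_BITS) || offset >= (1u64 << OFFSET_BITS) {
        return None;
    }
    Some(WASM_TAG | ((module as u64) << OFFSET_BITS) | offset)
}

/// Classify a raw PC as native or wasm and unpack the wasm fields.
pub fn decode_pc(raw: u64) -> Pc {
    if (raw & WASM_TAG) == 0 {
        return Pc::Native(raw);
    }
    let module = ((raw & !WASM_TAG) >> OFFSET_BITS) as u32;
    let offset = raw & ((1u64 << OFFSET_BITS) - 1);
    Pc::Wasm { module, offset }
}
```

Whether libfuzzer tolerates PCs that do not point at real code is exactly the open question, so this only shows that the encoding itself is cheap and reversible.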

So we'll probably have to write a new library that implements the sanitizer functions. Linking to a different sanitizer library requires a more complex invocation of cargo-fuzz, with the --sanitizer=none flag, which is probably reason enough to bury it in a custom soroban fuzz subcommand.

Rust fuzzing is usually done with libfuzzer-sys which vendors its own copy of libfuzzer. It may not be strictly necessary to fork libfuzzer if we can come up with a way of encoding wasm PCs in a way that is compatible with libfuzzer; but we may also find that we either need to fork it to support wasm PCs, or want to fork it to e.g. improve the GUI experience.

brson commented 7 months ago

The fuzzer also needs to call into the sanitizer to display backtraces. I haven't looked at exactly how it does this, but it's another function we'll need to override to handle wasm frames. Probably quite difficult because it will need help from soroban-env-host to understand the interleaving of wasm frames across modules as well as native frames.

brson commented 7 months ago

The backtrace printing looks hard indeed, but might not be necessary to implement. The main place backtraces are needed is to show where a failure occurred, and these appear to be printed from a signal handler with no additional context, right before the process is terminated.

At the moment, wasm code can't trigger the kind of panic that fails the fuzzer and requires a backtrace to be printed - instead the test harness calls into a contract and interprets what the contract did, then the harness can decide to panic.

In the future it might be desirable for certain types of errors produced by contracts to trigger a fuzzer failure - e.g. if a contract does a raw panic without an error code. Then the fuzzer could immediately exit with a backtrace that included wasm frames, which could be more useful than the test harness failing the test after the fact.
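The difference between the two policies can be sketched as a small decision function. None of these types exist in soroban-env-host; `Outcome`, `Verdict`, and `judge` are hypothetical stand-ins for "what the contract did" and "what the harness decides".

```rust
/// Hypothetical summary of what a contract invocation did.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Ok,
    ContractError(u32),
    RawPanic,
}

/// Hypothetical harness decision.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail(String),
}

/// Decide whether an outcome should fail the fuzz run.
/// Error codes listed in `allowed_errors` are treated as expected behavior;
/// a raw panic with no error code always fails.
pub fn judge(outcome: &Outcome, allowed_errors: &[u32]) -> Verdict {
    match outcome {
        Outcome::Ok => Verdict::Pass,
        Outcome::ContractError(code) if allowed_errors.contains(code) => Verdict::Pass,
        Outcome::ContractError(code) => {
            Verdict::Fail(format!("unexpected contract error {}", code))
        }
        Outcome::RawPanic => {
            Verdict::Fail("contract panicked without an error code".to_string())
        }
    }
}
```

Today a harness would run logic like this after the call returns and then panic on `Fail`; the future option described above would instead fail at the point of the raw panic, while the wasm frames are still on the stack.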

brson commented 6 months ago

I'm doing prototyping of this project in https://github.com/brson/soroban-wasm-fuzz-test
