ethereum / hevm

symbolic EVM evaluator
https://hevm.dev
GNU Affero General Public License v3.0
235 stars 48 forks source link

High-performance Concrete Fuzzing Taking Advantage of Expr #383

Open msooseth opened 1 year ago

msooseth commented 1 year ago

Original Idea (already implemented)

There are some hard cases where running concrete execution for a little bit will get us a win. For example:

    function prove_distributivity(uint120 x, uint120 y, uint120 z) public pure {
        assert(x + (y * z) == (x + y) * (x + z));
    }

In the benchmark suite https://github.com/eth-sc-comp/benchmarks/ will take forever for an SMT solver to deal with, but a fuzzer should find a counterexample in <1s.

This sounds like a hack, and in some sense it is, but I can very easily see this being helpful to the users. The perceived utility of HEVM would be higher, and that's all that matters.

We could even run a thread that just does fuzzing, while the other threads do the symbolic interpretation, etc.

Follow-ups

The original idea as per above has been implemented already, see Fuzz.hs. However, as detailed below by d-xo, there are a number of significant improvements that can be done that could improve the performance in very significant ways.

d-xo commented 1 year ago

I think a hybrid fuzzing mode will be extremely powerful. The dream setup I've always had in my head is as follows:

  1. summarize the contract under test
  2. try to get models that can hit every branch
  3. use any models as seeds for a dictionary
  4. fuzz by evaluating the summary directly (this should be a lot faster than actually running the evm), use standard mutation heuristics on the seed in the dictionary. Fallback to direct evm execution for Partial branches (perhaps even give Partial nodes enough data to be able to directly resume execution in a full interpreter).

steps 2,3, and 4 can be run in parallel. I am fairly confident that this approach would be very competitive with the best available evm fuzzers (even with the perf hit that we pay for using haskell).

d-xo commented 1 year ago

for ultimate fuzzer perf we could consider applying a futamura projection and compiling Expr down to c or x86 assembly and executing that directly. This would also allow us to make use of e.g. afl for instrumented fuzzing, and use the symbolic exec features in hevm to to seed the afl dictionary with interesting values.

https://github.com/uni-due-syssec/efcf-framework