onflow / cadence

Cadence, the resource-oriented smart contract programming language 🏃‍♂️
https://developers.flow.com/cadence
Apache License 2.0
531 stars 139 forks source link

Fuzzing Testing #1309

Closed robert-e-davidson3 closed 2 years ago

robert-e-davidson3 commented 2 years ago

The goal of fuzzing is to quickly increase test coverage. Concretely this means fuzzing can run for a long time (say, a week) without finding any new bugs.

Our original fuzzing was quick and dirty: feed some bytes to the parser and fail on panics. The search space of this approach is so broad that it cannot run fast enough to provide much test coverage - almost every input it gives is syntactically invalid. This code lives in onflow/cadence and can be ignored.

We needed a grammar-based fuzzer to cut down the search space to just syntactically-valid inputs. The first idea was to write Cadence's grammar into the EBNF format so we could leverage existing tools. Unfortunately Cadence doesn't fit into EBNF cleanly enough to satisfy those tools.

Joe had a solution: modify the Cadence parser so that, at any point, it can give a list of valid tokens that can follow. This "parser-based fuzzer" is an implicit grammar-based fuzzer. That code lives in onflow/fuzzer as an old fork of onflow-cadence and was run on FuzzBuzz.io.

The next step is to evaluate the status of the onflow/fuzzer code and how it integrates with FuzzBuzz.io. Once that's established, rebase the latest Cadence code from onflow/cadence:master to onflow/fuzzer:master and make adjustments so the fuzzer still works.

This epic ends there. However, if that isn't sufficient then there are two additional approaches to take. They're described here for future reference.

We may want to use go-fuzz or the built-in fuzzing support in Go 1.18. The approach here is to use mainnet contracts and transactions as the seed corpus. That will require maintenance because as the Cadence language is developed and so diverges from the deployed contracts.

If that approach isn't sufficient then we need to investigate writing and integrating our own mutation algorithm. Most likely this would merge both prior methods, mutating existing contracts and transactions randomly but based on the existing contracts.

robert-e-davidson3 commented 2 years ago

This ticket supersedes several stale tickets:

turbolent commented 2 years ago

Next steps: