google / silifuzz

Apache License 2.0

Add a script to collect corpus from fuzzing results for real CPU tests #1

Closed Maknee closed 1 year ago

Maknee commented 1 year ago

Hi Silifuzz authors,

This PR adds a script that converts the corpora generated by fuzzing into Silifuzz corpora, so that the fuzzed code can be run on a real CPU core.

The feature is currently a TODO in the section on collecting the corpus from the fuzzing results. I couldn't find a script in the repo that collects the corpus, so I wrote one myself.

In detail, this pull request adds a Python script that loops over the work folder containing the fuzzing results generated by the Centipede fuzzer and does the following:

  1. Parse the file to gather all the instructions
  2. Run fuzz_filter_tool on each set of instructions and output the snapshots to a temp directory
  3. After all the snapshots are generated, run snap_tool to generate a relocatable corpus and a tar of the corpus file that can be run by silifuzz_orchestrator_main
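
For readers who want a feel for the flow, the steps above could be sketched roughly as follows. This is not the script from the PR. It is a minimal illustration that assumes the inputs have already been parsed out of the Centipede work folder (step 1), and it leaves out the final tar step. The thread does not show the command-line flags of fuzz_filter_tool or snap_tool, so the sketch takes two hypothetical builder callbacks (`filter_cmd`, `corpus_cmd`) that return the argv for each tool. Treating a non-zero exit status as "input rejected" is also an assumption.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List, Sequence

# Hypothetical callback types: each builder returns the argv for one tool run.
FilterCmd = Callable[[Path, Path], Sequence[str]]      # (raw input, snapshot out)
CorpusCmd = Callable[[List[Path], Path], Sequence[str]]  # (snapshots, corpus out)
Runner = Callable[[Sequence[str]], None]


def run_checked(argv: Sequence[str]) -> None:
    """Run one command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(argv), check=True)


def collect_corpus(
    inputs: Iterable[bytes],
    filter_cmd: FilterCmd,
    corpus_cmd: CorpusCmd,
    output: Path,
    run: Runner = run_checked,
) -> int:
    """Filter each input into a snapshot, then build one corpus from them.

    Returns the number of snapshots that passed the filter step.
    """
    snapshots: List[Path] = []
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        for index, data in enumerate(inputs):
            raw = tmp_dir / f"input_{index}.bin"
            snap = tmp_dir / f"snapshot_{index}.pb"
            raw.write_bytes(data)
            try:
                # Step 2: one filter invocation per set of instructions.
                run(filter_cmd(raw, snap))
            except subprocess.CalledProcessError:
                # Assumption: a failing exit status means the input was rejected.
                continue
            snapshots.append(snap)
        if snapshots:
            # Step 3: a single corpus-building invocation over all snapshots.
            run(corpus_cmd(snapshots, output))
    return len(snapshots)
```

Passing the runner in as a parameter keeps the control flow testable without the Silifuzz binaries installed. In real use the two builders would wrap the actual tool invocations.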

The Python script has no external module dependencies. Its only dependencies are the Silifuzz tools and tar.

google-cla[bot] commented 1 year ago

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

ksteuck commented 1 year ago

Hi Henry!

Thank you for taking the time and effort to put this together. As you noted, this is currently a TODO for us and we are actively working on addressing it. Specifically, we are building an end-to-end example that would run fuzzing+corpus creation using Unicorn. You can see some of the groundwork in https://github.com/google/silifuzz/commit/6b5906595ba9c6b4e607960b5c5a55ec597d5eb3 and there's more to come soon. Stay tuned.