ossf / fuzz-introspector

Fuzz Introspector -- introspect, extend and optimise fuzzers
https://fuzz-introspector.readthedocs.io
Apache License 2.0

feature: extract data that can be used as input to fuzz engines, e.g. dictionaries, prioritised functions, etc #1

Open DavidKorczynski opened 2 years ago

DavidKorczynski commented 2 years ago

libFuzzer has the ability to prioritise fuzzing of certain functions. We should use the data from the reachability and coverage analysis to feed information back to the fuzzer about nice-to-analyse functions.
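One possible reading of this idea, as a rough sketch only: rank functions that are statically reachable from the fuzzer but have low runtime coverage, then hand the top candidate to libFuzzer's experimental `-focus_function` option. The `FunctionInfo` record, its field names and the scoring formula are assumptions made for illustration; they are not fuzz-introspector's actual data model.

```python
from dataclasses import dataclass


@dataclass
class FunctionInfo:
    # Hypothetical record; field names are illustrative only.
    name: str
    reachable: bool          # statically reachable from the fuzzer entry point
    covered_percent: float   # runtime coverage of the function, 0-100
    complexity: int          # e.g. cyclomatic complexity


def rank_focus_candidates(functions):
    """Return reachable functions, most interesting first.

    Score = complexity weighted by the uncovered fraction, so complex
    functions that the fuzzer can reach but has barely exercised rank high.
    """
    candidates = [f for f in functions if f.reachable]
    return sorted(
        candidates,
        key=lambda f: f.complexity * (100.0 - f.covered_percent) / 100.0,
        reverse=True,
    )


functions = [
    FunctionInfo("parse_header", True, 90.0, 12),
    FunctionInfo("decode_body", True, 5.0, 30),
    FunctionInfo("unused_helper", False, 0.0, 50),
]
ranked = rank_focus_candidates(functions)
focus_flag = f"-focus_function={ranked[0].name}"
print(focus_flag)
```

Unreachable functions are filtered out first because focusing the fuzzer on code it cannot reach would not help; the weighting itself is just one plausible choice.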

This heuristic could, for example, focus on functions that, if hit, will:

Navidem commented 2 years ago

Additional heuristics to identify interesting functions to fuzz can be:

DavidKorczynski commented 2 years ago

We can generalise this issue and treat it as covering improvements around "input to fuzz engines". This could include, for example, automated dictionary generation by statically analysing code and data in the target.

DavidKorczynski commented 2 years ago

I added a proof-of-concept for this. The focus atm is on dictionary generation by looking at constants used by each fuzzer in its reachable functions. Sample code for extracting strings used in fuzzer-reachable functions is here: https://github.com/ossf/fuzz-introspector/blob/28d5ee4e9be42e962eb21f7b2dc4cb6fdd02a95b/llvm/lib/Transforms/Inspector/Inspector.cpp#L767-L824
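The linked proof-of-concept is an LLVM pass written in C++. The underlying idea can be approximated in a few lines of Python as a sketch: walk the call graph from the fuzzer entry point and union the string constants of every reachable function. The `call_graph` and `constants` dictionaries here are hypothetical inputs invented for the example, not a format that fuzz-introspector emits.

```python
from collections import deque


def reachable_functions(call_graph, entry):
    """Breadth-first walk of the call graph starting at the fuzzer entry point."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        for callee in call_graph.get(current, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def collect_dictionary_strings(call_graph, constants, entry):
    """Union the string constants of every function reachable from entry."""
    found = set()
    for func in reachable_functions(call_graph, entry):
        found.update(constants.get(func, []))
    return sorted(found)


# Hypothetical inputs: caller -> callees, and function -> string constants.
call_graph = {
    "LLVMFuzzerTestOneInput": ["parse"],
    "parse": ["check_magic"],
    "unreached": ["check_magic"],
}
constants = {
    "parse": ["FUZZKEYWORD"],
    "check_magic": ["FUZZCAFE"],
    "unreached": ["NOTUSED"],
}
words = collect_dictionary_strings(call_graph, constants, "LLVMFuzzerTestOneInput")
print(words)
```

Constants that appear only in functions the fuzzer cannot reach (here `NOTUSED`) are left out, which keeps the dictionary specific to each fuzzer.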

I have also refactored the post-processing code so it is easier to develop individual analyses. The goal is to facilitate a plugin-like interface that makes it easy to rapidly develop new analysis techniques that rely on data collected from fuzz-introspector. Sample code in the post processor is here: https://github.com/ossf/fuzz-introspector/blob/main/post-processing/fuzz_html.py#L604-L622
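To illustrate what a plugin-like interface could look like, here is a minimal sketch using a class registry. Every name in it (`Analysis`, `register`, `ANALYSES`, `ReachableCount`, `run_all`, the `project_data` keys) is hypothetical and chosen for the example; the actual post-processing code may be organised quite differently.

```python
from abc import ABC, abstractmethod

# Registry of analysis classes, keyed by name (hypothetical design).
ANALYSES = {}


def register(cls):
    """Class decorator that adds an analysis to the registry."""
    ANALYSES[cls.name] = cls
    return cls


class Analysis(ABC):
    name = "base"

    @abstractmethod
    def run(self, project_data):
        """Take the collected project data and return a report string."""


@register
class ReachableCount(Analysis):
    name = "reachable_count"

    def run(self, project_data):
        count = len(project_data.get("reachable_functions", []))
        return f"{count} functions reachable"


def run_all(project_data):
    """Instantiate and run every registered analysis."""
    return {name: cls().run(project_data) for name, cls in ANALYSES.items()}


reports = run_all({"reachable_functions": ["parse", "check_magic"]})
print(reports)
```

With this shape, adding a new analysis means writing one decorated class; the driver code in `run_all` does not change.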

I modified simple-example-0 to demonstrate dictionary generation. For example, for the following code: https://github.com/ossf/fuzz-introspector/blob/28d5ee4e9be42e962eb21f7b2dc4cb6fdd02a95b/examples/simple-example-0/fuzzer.c#L9-L19

the automatic dictionary generator gives the suggested dictionary:

k0="FUZZCAFE "
k1="FUZZKEYWORD "

This can then be used as explicit input to fuzzers, e.g. -dict=dictfile. There's more work to be done here.
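As a sketch of the last step, the snippet below formats a list of tokens in the `name="value"` dictionary syntax that libFuzzer accepts, escaping quotes and backslashes and writing non-printable bytes as `\xNN`. The helper names `escape_token` and `format_dictionary` are made up for this example and are not part of fuzz-introspector.

```python
def escape_token(token: bytes) -> str:
    """Escape a token for use inside a quoted dictionary value."""
    out = []
    for byte in token:
        ch = chr(byte)
        if ch in ('"', "\\"):
            out.append("\\" + ch)          # quote and backslash need a backslash
        elif 0x20 <= byte < 0x7F:
            out.append(ch)                 # printable ASCII is kept as-is
        else:
            out.append(f"\\x{byte:02x}")   # everything else becomes \xNN
    return "".join(out)


def format_dictionary(tokens):
    """Return dictionary text with one k<i>="..." entry per line."""
    lines = [f'k{i}="{escape_token(t)}"' for i, t in enumerate(tokens)]
    return "\n".join(lines) + "\n"


dictionary_text = format_dictionary([b"FUZZCAFE", b"FUZZKEYWORD", b'a"b\x00'])
print(dictionary_text, end="")
```

Writing `dictionary_text` to a file and passing its path via `-dict=` makes the tokens available to the fuzzer's mutators.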