Open MarekKnapek opened 1 week ago
Thanks! What would you suggest as a way to do that? Set up a manually invoked GitHub Action or similar that can be invoked from time to time, which successively invokes cppfront with fuzzed inputs and at the end opens one issue containing the list of all inputs that caused crashes?
I have multiple ideas. In no particular order:
- Add a command-line option to `cppfront` to suppress all on-screen output. The output can be quite noisy and is not very useful during fuzzing. Something like `cppfront /quiet test.cpp2`. It could be `-q`, `--quiet`, `/quiet`, or something similar. Or it could be an environment variable or a compile-time option.
- Refactor the `cppfront` source code to make fuzzing easier. Most importantly, do not write anything to disk and do not read from disk. I believe this is better for the in-process fuzzing style that `libFuzzer` provides. The situation with `AFL` could be different, though. Basically, convert `cppfront` from an application to a library, then build two applications from this library: one is `cppfront` itself, the other is a fuzzer. The library would accept inputs and outputs as run-time or compile-time types. In `cppfront` mode, the inputs and outputs would be files on disk and command-line parameters. In fuzz mode, the inputs and outputs would be memory buffers containing the input sources, the options, and a place for the output.
- `libFuzzer` has an option to accumulate something called a `corpus` over time. It would be nice not to lose this `corpus` and to maintain it over time. Maybe by force-committing it periodically to a separate branch?
- `libFuzzer` provides a random buffer of bytes, and it is up to the application what it does with it; I chose to shove it into `cppfront` as the input source file. It would be nice to identify the separate, independent components of `cppfront`, "deserialize" this random buffer into something meaningful for each component, execute that component with that input, and watch for undefined behavior, use-after-free, out-of-bounds access, assertion failures, and other bugs to trigger.
- Once `cppfront` is fuzz-tested, so that it contains no bugs when processing random or malformed input, the next stage would be verifying that `cppfront` produces valid output. Meaning that for any input, it not only triggers no UB, assert, or out-of-bounds access, but also produces either an error message or valid C++ output. It never produces invalid C++ as its output. This would be verified by running an already installed compiler on `cppfront`'s output and testing the compiler's exit code. But I believe this would be very slow without a custom mutator. Mutators are a whole separate can of worms. What a mutator does is parse the input (from the corpus; here the corpus might not be random, but a series of valid cpp2 source files) into its own internal representation, somehow modify this internal representation, write it back to bytes, and then exercise the fuzz target as usual. This method is more difficult for the fuzz test author, but yields better fuzz speed and coverage than supplying random bytes as the fuzz target input.

Thanks for the ideas.
Re `/quiet`: This was added recently, with the semantics that only error output is printed. If cppfront crashes before the final stage of emitting errors, nothing will be emitted.
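For reference, the "library plus in-process fuzzer" idea from the list above could look roughly like the sketch below. `LLVMFuzzerTestOneInput` is the real libFuzzer entry point; `translate_result` and `translate_buffer` are hypothetical names invented for this illustration and do not exist in cppfront. The placeholder body only stands in for the real lexer, parser, and emitter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical library-style interface (an assumption, not cppfront's API):
// source comes in from memory, output and diagnostics go back to memory,
// so the fuzzer never reads from or writes to disk.
struct translate_result {
    bool        ok = false;   // true if translation succeeded
    std::string cpp1_output;  // generated C++ (empty on failure)
    std::string diagnostics;  // error text (empty on success)
};

// Placeholder body so the sketch builds on its own. A real build would run
// the cppfront lexer, parser, and emitter on `cpp2_source` here.
translate_result translate_buffer(std::string_view cpp2_source)
{
    translate_result r;
    if (cpp2_source.empty()) {
        r.diagnostics = "error: empty input";
        return r;
    }
    r.ok = true;
    r.cpp1_output = "// translated " + std::to_string(cpp2_source.size()) + " bytes\n";
    return r;
}

// libFuzzer calls this function once for every generated input.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size)
{
    const std::string_view source(reinterpret_cast<const char*>(data), size);
    const translate_result r = translate_buffer(source);

    // Invariant from the list above: every input yields either output or an
    // error message. This check is only active when NDEBUG is not defined.
    assert(r.ok || !r.diagnostics.empty());

    return 0;  // libFuzzer expects 0 from the fuzz target
}
```

With Clang this kind of target is typically built with something like `clang++ -g -O1 -fsanitize=fuzzer,address,undefined harness.cpp -o fuzz_cppfront` and run as `./fuzz_cppfront corpus/`, where `corpus/` is the directory libFuzzer reads from and adds new interesting inputs to. The file and directory names here are placeholders.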
Issues found by fuzzing so far:
I'm using this code to fuzz: https://github.com/MarekKnapek/cppfront/commits/fuzz3/. It could be improved, but I don't know how.
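One possible direction, following the mutator idea from the list above, is libFuzzer's optional `LLVMFuzzerCustomMutator` hook. The sketch below is not taken from the fuzz3 branch. It assumes a deliberately crude "internal representation" (whitespace-separated tokens) just to show the parse, modify, and write-back cycle; a mutator that is actually useful for Cpp2 would need a real syntax tree. The helper names `parse_tokens` and `serialize_tokens` are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Toy internal representation: the input split on whitespace.
static std::vector<std::string> parse_tokens(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

// Write the representation back to text, one space between tokens.
static std::string serialize_tokens(const std::vector<std::string>& tokens)
{
    std::string out;
    for (const auto& t : tokens) {
        if (!out.empty()) out += ' ';
        out += t;
    }
    return out;
}

// Optional libFuzzer hook: when defined, libFuzzer calls it to mutate an
// input in place. It must return the new size, which may not exceed max_size.
extern "C" std::size_t LLVMFuzzerCustomMutator(std::uint8_t* data, std::size_t size,
                                               std::size_t max_size, unsigned int seed)
{
    std::vector<std::string> tokens =
        parse_tokens(std::string(reinterpret_cast<const char*>(data), size));
    if (tokens.empty()) tokens.push_back("main");  // give empty inputs something to mutate

    std::mt19937 rng(seed);  // seeded by libFuzzer, so mutations are reproducible
    const auto i = static_cast<std::ptrdiff_t>(rng() % tokens.size());
    const auto j = static_cast<std::ptrdiff_t>(rng() % tokens.size());
    const unsigned op = rng() % 3;

    if (op == 0) {
        std::swap(tokens[i], tokens[j]);             // reorder two tokens
    } else if (op == 1) {
        const std::string copy = tokens[j];
        tokens.insert(tokens.begin() + i, copy);     // duplicate a token
    } else if (tokens.size() > 1) {
        tokens.erase(tokens.begin() + i);            // delete a token
    }

    const std::string out = serialize_tokens(tokens);
    const std::size_t n = std::min(out.size(), max_size);  // truncate to fit
    std::memcpy(data, out.data(), n);
    return n;
}
```

Because the mutations work on whole tokens, a corpus seeded with valid cpp2 files tends to stay closer to something the parser accepts than it would under purely random byte flips, which is the speed and coverage trade-off described above.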