silentbicycle / theft

property-based testing for C: generate input to find obscure bugs, then reduce to minimal failing input
ISC License
611 stars 31 forks source link

Coverage-guided generation #43

Open silentbicycle opened 6 years ago

silentbicycle commented 6 years ago

Part of what makes afl-fuzz and libfuzzer so effective is their branch coverage feedback loop. Most of theft's heuristics are focused on shrinking failures it finds, but until it finds a failure, it's just randomly jumping around the state space. We can do better.

While theft shouldn't duplicate the work afl does to instrument programs at compile time, and I don't want theft to be coupled to / depend on any particular compiler, it should be able to make use of coverage info (from any source). The SanitizerCoverage interface in clang seems like a good example coverage source -- theft could have an optional hook, called with an opaque uint32_t identifier for a branch ID whenever execution branches, and the sequence of these (likely in a ring buffer) would be an abstract representation of a particular trial's execution path. If the path is significantly different from previous paths, then the generator could copy and mutate the random bit stream until the trail goes cold, rather than completely replacing it each trial. Unlike shrinking, these mutations would not need to be monotonic.

This interface shouldn't be clang-specific -- it could also be informed by a macro that uses __LINE__, the line hook in Lua's debug interface, etc.

silentbicycle commented 6 years ago

I have a proof of concept for using the SanCov interface, in exactly the way I'd want as a coverage source for theft. This should wait until multi-core is done, though; it's probably a good primary feature for the next release after that.