google / oss-fuzz

OSS-Fuzz - continuous fuzzing for open source software.
https://google.github.io/oss-fuzz
Apache License 2.0

[flac] Why does my coverage decrease? #9928

Closed by ktmf01 1 year ago

ktmf01 commented 1 year ago

Hi,

I've recently added a fuzzer (fuzzer_tool_flac) that seems to be blocked, but I am unable to diagnose why. One particular symptom is that coverage seems to randomly vanish for no reason.

I haven't changed anything recently; the last change (to code that isn't even compiled with the fuzzer) was March 11th. The last relevant change was March 9th: https://github.com/xiph/flac/commits/master

Relevant line coverage for March 12th was 20.18%
Relevant line coverage for March 13th was 20.67%
Relevant line coverage for March 14th was 20.67%
Relevant line coverage for March 15th was 13.62%

Could it be that somehow corpus files are removed that shouldn't be removed on clean-up?

DavidKorczynski commented 1 year ago

Hmm. Indeed, the fuzzer_tool_flac fuzzer has better coverage on 12th March than on 16th March (in the following, fewer reds means better coverage). On 12th March, 2811 callsites are "red": https://storage.googleapis.com/oss-fuzz-introspector/flac/inspector-report/20230312/fuzz_report.html#Fuzzer:-fuzzer_tool_flac

On 16th March, 2924 callsites are "red": https://storage.googleapis.com/oss-fuzz-introspector/flac/inspector-report/20230316/fuzz_report.html#Fuzzer:-fuzzer_tool_flac

Is there potentially something in the code that the fuzzer targets that is stateful or non-deterministic? When code coverage is collected, the fuzzer is run on all the corpus files at once. Is the behaviour of the target code going to be identical regardless of the order in which the seeds are run?

When I look at the calltrees of fuzzer_tool_flac, most differences in the callsites' coverage between 12th March and 16th March are around callsite number 600. A lot of the code there seems to deal with option_values. Could the state of this struct carry over?

In init_options I see that a lot of defaults are set; however, one field is not touched in this function: show_version. On line https://storage.googleapis.com/oss-fuzz-coverage/flac/reports-by-target/20230316/fuzzer_tool_flac/linux/src/flac/src/flac/main.c.html#L362 there is a check, if (option_values.show_version), that does an early return. Thus, if the field is carried over and gets set, future runs will not move past this point. Interestingly, this particular line has more hits on March 16th than on March 12th. This hints that the field was set earlier in the coverage collection process, causing corpus files to run into this early return, which would lower the coverage since the seeds are not exploring what "they're supposed to".

The missing default setting of option_values.show_version and the early return of the conditional if (option_values.show_version) is likely the cause of the code coverage difference, and will likely also cause disruption to the corpus minimisation process.
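To make the suspected mechanism concrete, here is a minimal sketch of how a field that is never reset can leak between fuzzer iterations. It is not the actual flac code: the struct layout, the `'v'` trigger byte, `run_tool`, `init_options_buggy` and `init_options_fixed` are all hypothetical stand-ins. Only the names `option_values`, `show_version` and `init_options` echo the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the tool's global option state. */
static struct {
    int show_version; /* set when the input asks for the version */
    int verify;       /* some other option that does get a default */
} option_values;

/* Buggy reset: sets some defaults but never clears show_version. */
void init_options_buggy(void) {
    option_values.verify = 0;
}

/* Fixed reset: every field returns to its default before each run. */
void init_options_fixed(void) {
    memset(&option_values, 0, sizeof option_values);
}

/* Returns 1 if the input got past the early return, 0 otherwise. */
int run_tool(const uint8_t *data, size_t size, void (*init)(void)) {
    assert(data != NULL || size == 0);
    init();
    /* Hypothetical trigger: a leading 'v' acts like a version flag. */
    if (size > 0 && data[0] == 'v')
        option_values.show_version = 1;
    if (option_values.show_version)
        return 0; /* early return: code after this point is not covered */
    return 1;
}

/* libFuzzer-style entry point, using the fixed reset. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    run_tool(data, size, init_options_fixed);
    return 0;
}
```

With the buggy reset, any seed that runs after a version-style seed takes the early return, so the coverage of the whole corpus depends on the order in which seeds are executed. With the fixed reset, each seed behaves the same no matter what ran before it.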

ktmf01 commented 1 year ago

Thanks for your thorough search! I have been searching for a while but haven't checked statefulness. I will dive into this, many thanks for the pointer.

ktmf01 commented 1 year ago

> The missing default setting of option_values.show_version and the early return of the conditional if (option_values.show_version) is likely the cause of the code coverage difference, and will likely also cause disruption to the corpus minimisation process.

This definitely does the trick. Thanks!

DavidKorczynski commented 1 year ago

> This definitely does the trick. Thanks!

Awesome @ktmf01 !

For reference, I think this is something we could flag automatically: that a certain seed has different coverage in different runs. I wonder if it would be nice to have some form of statefulness detector. Maybe I'll try to check if more projects have volatile coverage.
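One way such a detector could work, sketched here purely as an illustration and not as anything OSS-Fuzz implements: run the seeds in one order, record a per-seed fingerprint, run them again in reverse order, and flag seeds whose fingerprint changed. Everything in this sketch is hypothetical, including `seed_t`, `target_fn`, `count_order_dependent_seeds` and the two demo targets. A real tool would derive the fingerprint from coverage data, where this sketch uses a plain return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEDS 64

typedef struct {
    const uint8_t *data;
    size_t size;
} seed_t;

/* Hypothetical target: returns a fingerprint of the path it took. */
typedef unsigned (*target_fn)(const uint8_t *data, size_t size);

/* Runs the seeds forward, then in reverse, and counts the seeds whose
 * fingerprint differs between the two passes. */
size_t count_order_dependent_seeds(target_fn target, const seed_t *seeds,
                                   size_t n) {
    unsigned forward[MAX_SEEDS];
    size_t mismatches = 0;
    assert(n <= MAX_SEEDS);
    for (size_t i = 0; i < n; i++)
        forward[i] = target(seeds[i].data, seeds[i].size);
    for (size_t i = n; i-- > 0;) {
        if (target(seeds[i].data, seeds[i].size) != forward[i])
            mismatches++;
    }
    return mismatches;
}

/* Demo target with a sticky global flag, similar to the issue above. */
unsigned stateful_target(const uint8_t *data, size_t size) {
    static int sticky;
    if (size > 0 && data[0] == 'v')
        sticky = 1;
    return sticky ? 0u : 1u;
}

/* Demo target whose result depends only on its input. */
unsigned stateless_target(const uint8_t *data, size_t size) {
    return (size > 0 && data[0] == 'v') ? 0u : 1u;
}
```

This is only a heuristic. Because both passes share one process, state left over from the forward pass can hide a difference (for example, when the seed that sets the flag happens to run first), so a more robust variant would run each ordering in a fresh process.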