Fixed Fuzzer, added Seed Corpus and Dictionary

iamarshsingh commented 4 years ago

Hi,

There is a small bug in the lodepng_fuzzer.cpp code in lines 62 and 63, due to which whenever any input is passed for decoding, it is never tested positive as a png file and always gives error. Due to this, the lines between 85-91 are never executed and thus the functions in them are never tested.

lodepng norun

This can also be observed in the oss-fuzz reports which you can find here (dated: 06/11/2020):

Coverage Report lodepng

By removing those two lines and adding a dictionary and seeds of png images. The code coverage increased by more than 10%. For testing, I ran the modified fuzzer along with some seed pngs and a dictionary, locally for 10 Million runs (< 5 minutes on single core execution), then the following results were obtained:

Local Testing Final

The coverage has increased by ~8% in all categories.

I have also added some seed png images and a dictionary for png files, all of which is open source and free to use. Links are available here which you can check:

Seed PNGs Link 1: https://lcamtuf.coredump.cx/afl/demo/
Seed PNGs Link 2: https://github.com/glennrp/libpng/tree/libpng16/contrib/pngsuite
Dictionary: https://github.com/glennrp/libpng/blob/libpng16/contrib/oss-fuzz/png.dict

If the above changes looks good, we can make the update the script in oss-fuzz repository to support dictionary and seed corpus.

Summary of File Changes:

Fixed a Bug in the Fuzzer due to which not even all inputs were considered non-png.
Added Seed corpus for the fuzzer.
Added Dictionary for the fuzzer.

iamarshsingh commented 4 years ago

@lvandeve Please let me know if there any changes required. I believe more fuzz targets can also be added to the project for better coverage.

iamarshsingh commented 4 years ago

@mbarbella-chromium Thank you for your response. Sure, we can decrease the number of seed inputs. How much files would you recommended? I can make a random mix of both of them, valid as well as invalid pngs.

mbarbella-chromium commented 4 years ago

Low hundreds rather than thousands would be a good upper bound, but the smaller you can get the seed corpus without compromising coverage the better.

iamarshsingh commented 4 years ago

Sure @mbarbella-chromium I will try to choose some good subset of the files and update the PR.

oliverchang commented 4 years ago

Sure @mbarbella-chromium I will try to choose some good subset of the files and update the PR.

Alternatively, pull this in during the OSS-Fuzz build process rather than uploading a lot of binary files to this repo.

iamarshsingh commented 4 years ago

@oliverchang Yes, I have transformed the seed corpus to gs://oss-fuzz-corpus/lodepng and will directly download it from there in the build file of oss-fuzz for lodepng. I will make these changes in the oss-fuzz repository.

I also tested setting the random_color variable to a hash of the input as well as setting it to the last byte of the input. The results seems to be better in case of setting it as the last byte as libFuzzer have more control over setting the byte. We can also modify the corpus as @mbarbella-chromium mentioned but that seems like a bigger task and if we would like to add more seed corpus, we would have to do the same for it. The result of setting this seems preety good:

Coverage Result

The coverage improved further by ~6% in all categories and about ~14% from the original fuzzer's coverage. Please have a look and suggested any further changes.

iamarshsingh commented 4 years ago

Hi, I have incorporated the required changes:

The seeds have been uploaded to: http://oss-fuzz-corpus.storage.googleapis.com/lodepng/lodepng_fuzzer_seed_corpus.zip and now they will directly be downloaded by the Dockerfile in oss-fuzz.
The Dictionary have been transferred to oss-fuzz repository and will be used directly.

If there are any other changes required, please let me know.

lvandeve / lodepng

Fixed Fuzzer, added Seed Corpus and Dictionary #130