Continuous fuzzing by way of OSS-Fuzz

DavidKorczynski commented 2 years ago

Hi,

I was wondering if you would like to integrate continuous fuzzing by way of OSS-Fuzz? In this PR https://github.com/google/oss-fuzz/pull/8105 I have done exactly that, namely created the logic needed on the OSS-Fuzz side.

Essentially, OSS-Fuzz is a free service run by Google that performs continuous fuzzing of important open source projects. The only expectation of integrating into OSS-Fuzz is that bugs will be fixed. This is not a "hard" requirement, in that no one enforces it; the main point is that if bugs are not fixed, running the fuzzers is a waste of resources, which we would like to avoid.

If you would like to integrate, the only thing I need is a list of email(s) that will get access to the data produced by OSS-Fuzz, such as bug reports, coverage reports and other stats. Note that the emails affiliated with the project will be public in the OSS-Fuzz repo, as they will be part of a configuration file.

In case you're unfamiliar with fuzzing, it's a technique used to automate test case generation. It's been used frequently over the last decade to analyse projects in memory-unsafe languages to catch memory corruption issues, but is now moving into supporting memory-safe languages (hence this PR). In the Python world, the bugs expected to be found at the moment are uncaught exceptions. I'm happy to answer any questions you may have!
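
To make "automated test case generation" a bit more concrete, here is a minimal sketch of a mutation-based fuzzing loop using only the standard library. It is not how OSS-Fuzz or its engines are implemented (real engines are coverage-guided, which this sketch is not), and `mutate`, `fuzz` and `toy_parser` are hypothetical names made up for illustration. It only shows the basic idea: mutate known inputs, feed them to the target, and record any uncaught exception.

```python
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of `data` with a few random byte-level changes."""
    buf = bytearray(data)
    for _ in range(rng.randint(1, 4)):
        choice = rng.randrange(3)
        if choice == 0 and buf:
            # Overwrite one byte with a random value.
            buf[rng.randrange(len(buf))] = rng.randrange(256)
        elif choice == 1:
            # Insert a random byte at a random position.
            buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
        elif choice == 2 and buf:
            # Delete one byte.
            del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(target, seeds, iterations=1000, seed=0):
    """Feed mutated seeds to `target`; collect inputs that raise exceptions."""
    rng = random.Random(seed)
    corpus = list(seeds)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            target(candidate)
        except Exception as exc:  # an uncaught exception counts as a finding
            crashes.append((candidate, exc))
    return crashes


def toy_parser(data: bytes) -> int:
    """Hypothetical target: a length byte followed by a payload."""
    if len(data) < 2:
        return 0
    length = data[0]
    payload = data[1:]
    # Bug: trusts the length byte without checking it against the payload size.
    return payload[length]


findings = fuzz(toy_parser, [b"\x02abc"], iterations=500, seed=1)
print(f"{len(findings)} crashing inputs found")
```

Each entry in `findings` pairs the crashing input with the exception it triggered, so a crash can be reproduced by calling `toy_parser` on that input again.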

decalage2 commented 2 years ago

Hi David, this looks like a good idea. Where can I see the zip file containing the corpus of files used for fuzzing? Also I see that the script to fuzz olefile only opens each data file but does not do any further action. Maybe it would be better to call more olefile methods, for example get the list of streams, and open/read each of them? And also read all the OLE properties (as this part of the code is less tested).

DavidKorczynski commented 2 years ago

> Where can I see the zip file containing the corpus of files used for fuzzing?

You will get this on https://oss-fuzz.com once the integration has happened.

> Also I see that the script to fuzz olefile only opens each data file but does not do any further action. Maybe it would be better to call more olefile methods, for example get the list of streams, and open/read each of them? And also read all the OLE properties (as this part of the code is less tested).

Definitely. This first fuzzer already has some findings, so we could start with that and you'll get to experience OSS-Fuzz. What we can also do is move the fuzzers upstream into the olefile repository, and then you can add to or modify the fuzzers however you like.
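
As a rough sketch of what such an extended fuzzer might look like (this is not the fuzzer from the PR), the harness below lists the streams, reads each one, and then parses the standard property streams. The helper names `exercise_ole` and `fuzz_one` are hypothetical, and treating `OSError` as the "expected" rejection of malformed input is an assumption that would need checking against the exceptions olefile actually raises. The third-party imports are kept inside `main()` so the helpers can be tested without Atheris or olefile installed.

```python
import io
import sys


def exercise_ole(ole):
    """Walk an open container; `ole` should behave like olefile.OleFileIO."""
    # List every stream and read it fully.
    for path in ole.listdir():
        ole.openstream(path).read()
    # Parse the standard property streams (SummaryInformation etc.).
    ole.get_metadata()


def fuzz_one(open_ole, data):
    """Run one fuzz input; `open_ole` builds the container from a file object."""
    try:
        ole = open_ole(io.BytesIO(data))
    except OSError:
        # Assumption: malformed input is rejected with an OSError subclass.
        return
    try:
        exercise_ole(ole)
    except OSError:
        # Assumption: parse errors deeper in the file are also OSError.
        pass
    finally:
        ole.close()


def main():
    # Third-party imports live here so the helpers above stay importable
    # on machines without the fuzzing dependencies.
    import atheris

    with atheris.instrument_imports():
        import olefile

    def TestOneInput(data):
        fuzz_one(olefile.OleFileIO, data)

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

To run it as a fuzzer, call `main()` under the usual `if __name__ == "__main__":` guard. Any exception other than `OSError` escapes `fuzz_one` and would be reported as a finding.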

decalage2 / olefile

Continuous fuzzing by way of OSS-Fuzz #149