benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

Continuous fuzzing by way of OSS-Fuzz #2829

Open DavidKorczynski opened 1 year ago

DavidKorczynski commented 1 year ago

Hi,

I was wondering if you would like to integrate continuous fuzzing by way of OSS-Fuzz? Fuzzing is a way to automate test-case generation and has been heavily used for memory-unsafe languages. Recently, effort has been put into fuzzing memory-safe languages, and Python is one of the languages that has gotten better tool support. At the moment, fuzzing Python primarily aims to find uncaught-exception bugs; in the future, hopefully more security-aware bug oracles will be used as well.

In https://github.com/google/oss-fuzz/pull/7921 we did an initial integration into OSS-Fuzz. OSS-Fuzz is a free service run by Google that performs continuous fuzzing of important open source projects. If you would like to integrate, the only thing I need is a list of email addresses that will get access to the data produced by OSS-Fuzz, such as bug reports, coverage reports and other stats. Note that the emails affiliated with the project will be public in the OSS-Fuzz repo, as they will be part of a configuration file.

benoitc commented 1 year ago

Can you describe how it will work? Can it work without all the cloudy infrastructure?

benoitc commented 1 year ago

I would love to hear how this can be used. @DavidKorczynski please provide some information.

DavidKorczynski commented 1 year ago

> I would love to hear how this can be used. @DavidKorczynski please provide some information.

Yes! Will get back to you on this tomorrow!

DavidKorczynski commented 1 year ago

> Can you describe how it will work?

https://github.com/google/oss-fuzz is a service for running fuzzers continuously. A fuzzer is similar to a unit test, but rather than testing specific inputs it receives a simple byte buffer, which the developer converts into appropriate types for the target (gunicorn). The fuzzing engine used by OSS-Fuzz is https://github.com/google/atheris, and there are several examples in that repository. Fundamentally, fuzzing relies on genetic mutational algorithms to explore the target code, assuming the fuzzer is written so that it can reach a lot of the target code.
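To make the description above concrete, here is a minimal, dependency-free sketch of the idea. It is not how Atheris or OSS-Fuzz are implemented: it mutates inputs blindly instead of using coverage feedback, and `parse_host_port`, `fuzz_one_input`, `mutate` and `fuzz` are hypothetical names invented for the illustration. The target has a deliberate bug, so the loop has an uncaught exception to find.

```python
import random


def parse_host_port(text: str) -> tuple[str, int]:
    """Hypothetical target, documented to raise ValueError on bad input."""
    parts = text.rsplit(":", 1)
    host = parts[0]
    port = int(parts[1])  # deliberate bug: IndexError when there is no ":"
    return host, port


def fuzz_one_input(data: bytes) -> None:
    """Harness: convert the raw byte buffer into the type the target expects."""
    text = data.decode("latin-1")  # latin-1 maps every byte, so this cannot fail
    try:
        parse_host_port(text)
    except ValueError:
        pass  # documented failure mode, so not a finding


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level edit: insert, delete, or replace."""
    buf = bytearray(data)
    op = rng.choice(("insert", "delete", "replace"))
    if op == "insert" or not buf:
        buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
    elif op == "delete":
        del buf[rng.randrange(len(buf))]
    else:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seeds, iterations=2000, rng_seed=0):
    """Mutate corpus entries and collect inputs that raise unexpectedly."""
    rng = random.Random(rng_seed)
    corpus = list(seeds)
    findings = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            fuzz_one_input(candidate)
        except Exception as exc:  # any uncaught exception counts as a finding
            findings.append((candidate, exc))
        else:
            # Real engines keep only inputs that reach new code; this sketch
            # keeps every non-crashing input as material for later mutations.
            corpus.append(candidate)
    return findings
```

Calling `fuzz([b"localhost:8000"])` returns a list of `(input, exception)` pairs. In this toy setup the only unexpected exception the target can raise is the `IndexError` triggered once a mutation removes the colon.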

For Python, the idea at the moment is to identify any uncaught exceptions. See e.g. this GitHub issue for an example found by OSS-Fuzz: https://github.com/sdispater/tomlkit/issues/276

We added a fuzzer for gunicorn utility functions that you can see here: https://github.com/google/oss-fuzz/blob/master/projects/gunicorn/fuzz_util.py

If you're interested, I can add your email to the project.yaml file in OSS-Fuzz so that you can see any issues found in gunicorn and get a sense of what this fuzzer has found. I can add you as the primary contact in https://github.com/google/oss-fuzz/blob/master/projects/gunicorn/project.yaml and OSS-Fuzz will then email you whenever a bug is found. There are a few uncaught exceptions at the moment. Alternatively, I could open GitHub issues similar to the one linked above, and you can assess whether they are helpful.

For me, the ideal outcome would be for you to use fuzzing to test gunicorn, that is, to write harnesses similar to the one I linked. OSS-Fuzz comes with built-in code coverage, so you can see how much of the code is being explored and what is not.
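For a rough idea of what such a harness looks like, here is a sketch in the usual Atheris shape. It is not the contents of the linked fuzz_util.py: the target is the standard library's `urlsplit`, used purely as a stand-in so the example runs without gunicorn installed, and `fuzz_one_input` and `main` are names chosen for the illustration. The byte buffer is split by hand here; Atheris also provides `FuzzedDataProvider` for that conversion.

```python
import sys
from urllib.parse import urlsplit


def fuzz_one_input(data: bytes) -> None:
    """Entry point the fuzzing engine calls once per generated byte buffer."""
    if not data:
        return
    # Hand-rolled conversion: the first byte selects an option, the rest is text.
    allow_fragments = bool(data[0] & 1)
    text = data[1:].decode("utf-8", errors="replace")
    try:
        urlsplit(text, allow_fragments=allow_fragments)
    except ValueError:
        pass  # documented for malformed URLs, so it is not reported


def main() -> None:
    # Atheris is a third-party package; it is imported lazily so the harness
    # function above can be imported and called without it.
    import atheris

    atheris.instrument_all()  # add coverage instrumentation to loaded modules
    atheris.Setup(sys.argv, fuzz_one_input)
    atheris.Fuzz()  # runs until stopped or until an uncaught exception escapes
```

In a real harness file, `main()` would be called under the usual `if __name__ == "__main__":` guard, and the stand-in target would be replaced by the gunicorn functions under test. Any exception other than the ones the harness deliberately catches ends the run and is reported as a finding.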

I hope this makes it a bit clearer. I am happy to go more in depth if you have further questions.