apache / logging-log4cxx

Apache Log4cxx is a C++ port of Apache Log4j
http://logging.apache.org/log4cxx
Apache License 2.0

Google OSS-Fuzz integration #411

Status: Closed (vy closed 1 month ago)

vy commented 2 months ago

As a deliverable of apache/logging-log4j2#2891 and apache/logging-log4j2#2892, this PR implements fuzz tests along with Google OSS-Fuzz integration. See the added fuzzing.md for details.

Note: A review request has been submitted in this dev@ post.

rm5248 commented 2 months ago

This is looking good! I won't have a chance to run this myself for a few days, but it looks like I can run it manually? What about running it as part of our GitHub Actions?

vy commented 2 months ago

> This is looking good! I won't have a chance to run this myself for a few days, but it looks like I can run it manually?

No worries. There are still some rough edges on the OSS-Fuzz side, so I will be polishing it a bit more. Note that the PR is marked as Draft. Earlier, @swebb2066 had explicitly told me not to submit features via GitHub without a mailing list discussion. I will do that when the PR is ready to be reviewed.

> What about running it as part of our GitHub Actions?

That is an excellent question! :star_struck: Fuzzers are intended to be run regularly and, preferably, for long periods of time. To speed things up, you need to save your state (called the corpus) and restore it in the next run. When a fuzzer fails, you need to take a snapshot of the context to allow reproduction (preferably somewhere not disclosed to the public for security reasons, because you might have just stumbled upon a vulnerability), and continuously verify these reproductions as the code base changes. This sophisticated pipeline also needs an administrative UI and reporting features. We can implement such a GitHub Actions workflow, leverage either the ASF Subversion repository or some private GitHub repository for storage, etc. But this is a pretty big task! The good news is, this is exactly what OSS-Fuzz does! :sweat_smile: It is a beefy cluster that fuzzes continuously, stores its state and findings in a GCP bucket, and gives visibility into its pipeline state through a web page.
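To make the corpus and reproduction steps above a bit more concrete, here is a minimal sketch of what a fuzz target and a replay helper could look like. It is not the harness added by this PR. `LLVMFuzzerTestOneInput` is the entry point that libFuzzer expects, while `countConversionMarkers` and `replayFiles` are hypothetical names invented for illustration. The function under test is a stand-in, not a Log4cxx API.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical code under test: a stand-in for whatever library routine
// would actually be fuzzed. It counts '%' characters in a pattern-like string.
std::size_t countConversionMarkers(const std::string& pattern) {
    std::size_t count = 0;
    for (char c : pattern) {
        if (c == '%') {
            ++count;
        }
    }
    return count;
}

// libFuzzer entry point: the fuzzing engine calls this repeatedly with
// generated inputs. A crash or sanitizer report inside the call is a finding.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const std::string input(data, data + size);  // copy the raw bytes into a string
    countConversionMarkers(input);
    return 0;  // libFuzzer expects 0 from a normal run
}

// Hypothetical replay helper: feeds previously saved inputs (corpus entries or
// crash reproducers) back into the fuzz target, so that old findings can be
// re-verified as the code base changes. Returns how many files were replayed.
std::size_t replayFiles(const std::vector<std::string>& paths) {
    std::size_t replayed = 0;
    for (const std::string& path : paths) {
        std::ifstream in(path, std::ios::binary);
        if (!in) {
            continue;  // skip files that cannot be opened
        }
        const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                      std::istreambuf_iterator<char>());
        LLVMFuzzerTestOneInput(reinterpret_cast<const std::uint8_t*>(bytes.data()),
                               bytes.size());
        ++replayed;
    }
    return replayed;
}
```

When such a target is built with Clang's `-fsanitize=fuzzer`, libFuzzer supplies `main` and manages the corpus directory itself, so a helper like `replayFiles` would only be needed in a separate regression test that is built without the fuzzing engine.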

Over the last 2 months, I've learned a lot while trying to integrate Log4j and Log4cxx into OSS-Fuzz. If I were to do it all over again, with more time and financial budget, I'd go the GHA route. This would not only put us in complete control, but also result in a product that any GitHub project can use. But alas, there is a deadline by which I need to deliver this task.