google / oss-fuzz

OSS-Fuzz - continuous fuzzing for open source software.
https://google.github.io/oss-fuzz
Apache License 2.0

Improve documentation regarding integration with oss-fuzz #5005

Open tytso opened 3 years ago

tytso commented 3 years ago

The documentation at: https://google.github.io/oss-fuzz/advanced-topics/ideal-integration/

doesn't give enough instructions about how to do the integration. In particular, it doesn't say anything about how to link the fuzzing target with the fuzzing engine. I can see from build.sh that the build rule is:

$CXX $CXXFLAGS \
  $LIB_FUZZING_ENGINE \
  -I $SRC/e2fsprogs/lib \
  $fuzzer \
  -L'./lib/ext2fs' -lext2fs \
  -L'./lib/et' -lcom_err \
  -o $OUT/$fuzzer_basename

But there is nothing about where to find or build the LIB_FUZZING_ENGINE. It's one thing to build and run the fuzzing target in a Docker environment, but if you want to encourage upstream developers to do fuzzing, I'd suggest that you try to get the fuzzing engines into Fedora and Debian, so that upstream developers can install some package and then build the fuzzer without needing to drag down a Docker environment. The easier you can make it for upstream developers, the more likely they will engage. A small amount of work by the oss-fuzz team will have a huge multiplier effect, and if you have OKRs about increasing the number of upstream projects that engage using the "ideal integration" method, making it simpler and easier for developers to use the fuzzer in their native development environment will significantly improve your OKR metrics.
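For reference, a minimal sketch of the same link rule run outside Docker. It assumes libFuzzer is the engine, in which case (as far as I know) OSS-Fuzz sets LIB_FUZZING_ENGINE to the clang flag `-fsanitize=fuzzer`, and a stock clang already ships the engine. The fuzzer file name, `SRC_DIR`, and the default flags are placeholders, not values taken from the e2fsprogs build. The script only prints the command so it can be inspected first.

```shell
#!/bin/sh
# Sketch: reproduce the build.sh link rule with a locally installed clang.
# Assumption: libFuzzer is the engine, so LIB_FUZZING_ENGINE is just a flag.
CXX=${CXX:-clang++}
CXXFLAGS=${CXXFLAGS:-"-g -O1 -fsanitize=address,fuzzer-no-link"}
LIB_FUZZING_ENGINE=${LIB_FUZZING_ENGINE:-"-fsanitize=fuzzer"}
SRC_DIR=${SRC_DIR:-.}            # hypothetical: top of a built e2fsprogs tree
OUT=${OUT:-./out}                # where the fuzzer binary should land
fuzzer=${fuzzer:-my_fuzzer.cc}   # hypothetical fuzz target source file

# Print the link command instead of running it.
link_cmd() {
    fuzzer_basename=$(basename "$fuzzer" .cc)
    echo "$CXX $CXXFLAGS $LIB_FUZZING_ENGINE -I $SRC_DIR/lib $fuzzer" \
         "-L$SRC_DIR/lib/ext2fs -lext2fs -L$SRC_DIR/lib/et -lcom_err" \
         "-o $OUT/$fuzzer_basename"
}

link_cmd
```

Piping the output to `sh` runs the link step. The libraries themselves would need to be compiled with the same `$CXXFLAGS` (or the matching `$CFLAGS`) for the instrumentation to be useful.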

inferno-chromium commented 3 years ago

You are looking at the advanced docs; the initial project integration docs are at https://google.github.io/oss-fuzz/getting-started/new-project-guide/

tytso commented 3 years ago

For bonus points, after you get the fuzzer packaged into the community distributions, contribute an autoconf macro to the Autoconf Archive (i.e., the equivalent of CPAN for autoconf) and similar macros for automake, so that the instructions can be boiled down to "install this package", "insert this autoconf macro", and then "if you use a Makefile, insert this Makefile fragment; and if you use automake, insert this single line".

tytso commented 3 years ago

Right, but from an upstream developer's perspective, they are going to want to run the fuzzer in their native development environment. The problem is they are going to want to run the fuzzing target in a debugger, or using valgrind, etc., to get more information and insight as to what is going on. The turnkey "Docker" setup might be fine for automated fuzzing, but for development purposes, it's really not a great way to go.

If you want to close the bug, it's your prerogative. But if you want upstream maintainers to engage, you really should consider making oss-fuzz more upstream friendly. And Docker isn't really upstream friendly.

tytso commented 3 years ago

Note that the instructions for how to reproduce a fuzzer, for people who are asked to investigate a fuzzing report, require using the "advanced topics":

https://google.github.io/oss-fuzz/advanced-topics/reproducing/

It talks about running the fuzzer directly:

$ ./fuzz_target_binary

But you can't do that from Docker, and it requires building the fuzz target. The documentation for reproducing the failure asks the developer to go to the "advanced topic" of "ideal integration", which doesn't help the developer build the fuzz target. I almost gave up then, but it just got me massively annoyed. Keep in mind that the reproducing instructions are where most upstream developers may enter your documentation, and there was no information about where the sources of the fuzzing target can even be found. Which was the next source of anger and frustration about how oss-fuzz was wasting my time...

Again, if you want upstream developers to engage with oss-fuzz, it needs to have a much friendlier upstream-developer experience!

oliverchang commented 3 years ago

> Note that the instructions for how to reproduce a fuzzer, for people who are asked to investigate a fuzzing report, require using the "advanced topics":
>
> https://google.github.io/oss-fuzz/advanced-topics/reproducing/

This is indeed the first page that maintainers are linked to. It's included in every bug report on monorail, and is designed to be self-contained.

> It talks about running the fuzzer directly:
>
> $ ./fuzz_target_binary
>
> But you can't do that from Docker, and it requires building the fuzz target.

Yes, this is mentioned on our reproduction page:

"If you’re not sure how to build the fuzzer using the project’s build system, you can also use Docker commands to replicate the exact build steps used by OSS-Fuzz, then feed the reproducer input to the fuzz target (how?, why?)."

I think this is the main issue you encountered. You went down a rabbit hole by clicking our links rather than reading the rest of this page? Perhaps we need to make this more prominent as we get more cases of projects integrated without maintainer involvement.

> The documentation for reproducing the failure asks the developer to go to the "advanced topic" of "ideal integration", which doesn't help the developer build the fuzz target.

Did you try the helper scripts and the steps at https://google.github.io/oss-fuzz/advanced-topics/reproducing/#building-using-docker at all?

tytso commented 3 years ago

I eventually found that, but that doesn't help me actually run the fuzzer under gdb. I did find[1], but it's not terribly helpful. One of the problems with using Docker environments is that it's not at all clear how to get the sources in my working directory (do I really have to upload commits into my public git repo in order to test anything with the fuzzer?!?) as well as the test case file into the Docker environment.

[1] https://google.github.io/oss-fuzz/advanced-topics/debugging/

What would be really helpful is a simple, stupid, linear set of instructions that starts with "make sure these packages are installed" (give the Debian and Fedora package lists) and, at the end, lets the developer run the fuzzer under gdb, with access to the test case file and full sources so gdb is useful. Please do not assume that the developer is conversant with Docker, so if you need to use magic docker commands, list them out explicitly so the upstream developer can just cut and paste commands. (This is why I find Docker to be extremely unfriendly for developers. A Docker environment appears to be a walled garden where most of the developer's tools Just Don't Work.)
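A rough sketch of what such a cut-and-paste sequence might look like, pieced together from the `infra/helper.py` commands described on the reproducing and debugging pages linked in this thread. It assumes git, python3, and Docker are installed; the target name and both paths are placeholders; and it assumes `build_fuzzers` accepts a local source path and that the debug shell exposes `build/out/$PROJECT` as `/out/$PROJECT`, so the exact arguments should be checked against the current docs. The script only prints the steps.

```shell
#!/bin/sh
# Sketch: print a linear "reproduce under gdb" recipe for one OSS-Fuzz project.
# Every value is a placeholder to edit; nothing here touches Docker by itself.
PROJECT=${PROJECT:-e2fsprogs}
FUZZ_TARGET=${FUZZ_TARGET:-my_fuzz_target}        # hypothetical target name
SOURCE_PATH=${SOURCE_PATH:-$HOME/src/e2fsprogs}   # local tree, unpushed commits included
TESTCASE=${TESTCASE:-$HOME/Downloads/testcase}    # reproducer from the bug report

gdb_plan() {
    cat <<EOF
# 1. Fetch the OSS-Fuzz helper scripts.
git clone --depth 1 https://github.com/google/oss-fuzz && cd oss-fuzz
# 2. Build the project image, then the fuzzers from the local working tree.
python infra/helper.py build_image $PROJECT
python infra/helper.py build_fuzzers --sanitizer address $PROJECT $SOURCE_PATH
# 3. Put the test case where the container can see it.
cp $TESTCASE build/out/$PROJECT/testcase
# 4. Open a shell in the debug image (it provides gdb).
python infra/helper.py shell base-runner-debug
# 5. Type this inside that container shell:
gdb --args /out/$PROJECT/$FUZZ_TARGET /out/$PROJECT/testcase
EOF
}

gdb_plan
```

Steps 1 to 4 run on the host; step 5 is typed at the prompt that step 4 opens.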

Something no more complex than this would be great. I wrote these instructions[2] so that grad students who weren't conversant with qemu or kernel development, but who might be creating research file systems for a FAST paper, could follow them. I had more detailed documentation in other places, but a really simple, stupid quick start for the critical user (developer) journey was designed to make it easy for a project newbie to get started. It's been intern and graduate student tested. :-)

[2] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md

tytso commented 3 years ago

One more thought: maybe there's an easy way to take the Docker image and extract a tarball of the entire file system image, so the developer can unpack it, bind-mount whatever they need, and then manually chroot into the environment? The complete isolation of Docker isn't really needed when running a reproduction, and is actually more of a bug than a feature. Furthermore, shell scripts which run chroot are much easier to customize and understand, as compared to the complete non-transparency of things like proprietary Docker binaries.

Maybe a Docker wizard knows how to poke all sorts of holes into a docker container, but forcing upstream developers to learn how to work around Docker's restrictions is really unfair.
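The tarball-and-chroot idea can be sketched with standard `docker create` and `docker export` commands. This is an illustration under assumptions: the image name `gcr.io/oss-fuzz/e2fsprogs` is what I believe `build_image` produces, the container name and all paths are hypothetical, and the image is assumed to contain `/bin/bash`. The script prints the commands and does not run them.

```shell
#!/bin/sh
# Sketch: print the commands that turn a project image into a chroot tree.
IMAGE=${IMAGE:-gcr.io/oss-fuzz/e2fsprogs}   # assumption: name produced by build_image
NAME=${NAME:-ossfuzz-rootfs}                # hypothetical container name
ROOT=${ROOT:-$HOME/ossfuzz-root}            # where the file system gets unpacked
WORKTREE=${WORKTREE:-$HOME/src/e2fsprogs}   # local sources to bind-mount

chroot_plan() {
    cat <<EOF
# Create a stopped container, dump its file system, then discard it.
docker create --name $NAME $IMAGE
docker export -o $NAME.tar $NAME
docker rm $NAME
# Unpack the tree and bind-mount the local working directory into it.
mkdir -p $ROOT && sudo tar -xf $NAME.tar -C $ROOT
sudo mkdir -p $ROOT/work && sudo mount --bind $WORKTREE $ROOT/work
# Enter the environment.
sudo chroot $ROOT /bin/bash
EOF
}

chroot_plan
```

Tools such as gdb typically also want `/proc` and `/dev` bind-mounted into `$ROOT` before the chroot step.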

maflcko commented 3 years ago

#5455 might help here. It should be the responsibility of the upstream project to provide docs on how to compile their software with fuzz engines (and optionally sanitizers) enabled.