google / fuzzbench

FuzzBench - Fuzzer benchmarking as a service.
https://google.github.io/fuzzbench/
Apache License 2.0

Add benchmarks from svrwb-fuzz-benchmark-suite #50

Open roachspray opened 4 years ago

roachspray commented 4 years ago

I started a similar project last year and, since it probably will not be as successful as this one, I figure I should donate the cases I have. They are various OSS apps that have multiple CVE-assigned vulnerabilities in a single version; they can be found here: https://github.com/veracode-research/svrwb-fuzz-benchmark-suite/tree/master/cases

After a quick look, it seems that about 9 of them are not already covered by you all (and perhaps the sqlite versions have some gaps between ours and yours). Each includes multiple vulnerabilities with CVEs and sample inputs for the known ones. They may have other vulns, but those would need to be found as the targets are used. The apps/libs are:

I am willing to do the work to prep them to be added to your project, but I am curious: should I create an issue for each?

There are additional targets I have not included here either, such as ChakraCore 1.4.1 (which has multiple vulns) and a few others from recent papers. Any guidance on how to go forward is appreciated.

jonathanmetzman commented 4 years ago

Thank you, what a generous donation! These definitely look interesting.

There are going to be some challenges here, like making libFuzzer harnesses for each of these benchmarks. The fuzzbench team needs to iron out some details of adding new benchmarks; I'll get back to you on this later this week.

roachspray commented 4 years ago

No worries at all, I understand, and that makes sense. I could write the harnesses myself (under Veracode), as I should have time coming up. I mostly figured I should offer, since I would rather have the targets I pulled together get used than not, and having a variety of cases with multiple vulns would be useful. Thanks for considering.

jonathanmetzman commented 4 years ago

@roachspray We would love to have these benchmarks when you get a chance. I think adding a few at a time would be a good idea, though. Since we are only measuring coverage, we've discarded benchmarks (I think libarchive may have been one, for example) that cause issues when run for 24 hours or when measuring coverage. But I think we'd be less likely to discard benchmarks with known bugs, since these are much more expensive to produce than coverage-only benchmarks, which we can get from OSS-Fuzz for free. Could you start by integrating the top 3 or 4 benchmarks? We can see how they fare and then proceed from there.

The guide for integrating benchmarks is here. Please reach out if you have any trouble.

roachspray commented 4 years ago

Sounds like a plan. I will let you (all) know if I have any issues, and I will get (the top) 3 or 4 into shape with respect to the guide you pointed out.

roachspray commented 4 years ago

@jonathanmetzman For the case of AFL-like fuzzers in FuzzBench, how does one specify arguments to the target program being fuzzed? Or is the main() of the fuzz target assumed to take a certain form in the FuzzBench/OSS-Fuzz environment? Not sure if I just missed something obvious when going through the FuzzBench (and OSS-Fuzz ideal-integration) contribution docs, or through the code that puts together the afl-fuzz arguments (fuzzer.py). I understand the libFuzzer case, but am unsure about the other. Thanks for any help... guessing it's my failure to read something properly :)

jonathanmetzman commented 4 years ago

It's a good question, and we don't really answer it in the benchmark integration guide; thanks for pointing this out, I'll try to update the docs. We do try to explain it in our guide on integrating new fuzzers.

Basically, we require each fuzzer to define an env var, FUZZER_LIB, that gets linked against the fuzz target. For AFL-like fuzzers this is usually afl_driver.cpp. afl_driver works using AFL's persistent mode: it continuously gets inputs from AFL and calls LLVMFuzzerTestOneInput in a loop.

For fuzzers that don't support this, like Eclipser, we encourage using something like StandaloneFuzzTargetMain.c, which basically provides a main function that calls LLVMFuzzerTestOneInput with input provided through stdin and then exits.
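
For concreteness, here is a rough sketch of how a benchmark's build.sh can link a harness against whatever FUZZER_LIB the chosen fuzzer provides. The harness and library names below (tcpdump_fuzz.cc, libnetdissect.a) are made up for illustration; the $CXX/$CXXFLAGS/$SRC/$OUT variables follow the pattern used by the existing benchmarks:

    # Hypothetical build.sh fragment. The harness only defines
    # LLVMFuzzerTestOneInput; main() comes from $FUZZER_LIB (libFuzzer,
    # afl_driver.cpp, StandaloneFuzzTargetMain.c, ...) depending on the fuzzer.
    $CXX $CXXFLAGS -I"$SRC/tcpdump" \
        "$SRC/tcpdump_fuzz.cc" \
        "$SRC/tcpdump/libnetdissect.a" \
        $FUZZER_LIB -o "$OUT/tcpdump_fuzz"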

roachspray commented 4 years ago

Just pinging to note that I plan to do this work in the coming weekend/next week.

jonathanmetzman commented 4 years ago

Hey @roachspray how is the integration going? No rush, just wanted to see if you need help.

roachspray commented 4 years ago

@jonathanmetzman Initially I thought it would be wise to set things up so that I could run an experiment to ensure the code for a benchmark would build and be fuzzed. However, about halfway down that path last weekend, I realized that was maybe not the best route to take. My hope for tomorrow is at least to write the libFuzzer harness code, since that stands apart from actually building/running.

My biggest gap in understanding is the environment in which the benchmark is cloned and built; e.g., what environment variables should I be aware of, where do I inherit which C/C++ compiler to use, do I need to install the built benchmarks somewhere, etc.

There is still code I should read through, so maybe those questions are answered there. Any pointers would be great. Apologies for the delay; I've been slow.

roachspray commented 4 years ago

After looking at harfbuzz and a few other build.sh files, it is much clearer to me how to do this. My one concern is that svrwb-fuzz-benchmark-suite stores all the sources from the different projects inside that repo, rather than pulling them from their own project repos. Part of the reason was uncertainty about whether certain versions would still be available over time, and wanting to preserve them. This makes things slightly ugly for FuzzBench, given how each benchmark target (outside of oss-fuzz) uses a project repo.

Is there a preference one way or the other, or is it ok for each target I add to clone the full svrwb benchmark suite and build only the relevant project? That is, in adding tcpdump, it would check out the whole suite but only build the bits needed to run tcpdump itself. This may not be the best example, given that the tcpdump project will likely keep its sources around for a long time, but I hope the question is clear.

jonathanmetzman commented 4 years ago

Sorry for not getting back to you on your first comment. It seems you've figured it out but definitely let us know if you have more questions.

Is there a preference one way or the other, or is it ok for each target I add to clone the full svrwb benchmark suite and build only the relevant project? That is, in adding tcpdump, it would check out the whole suite but only build the bits needed to run tcpdump itself. This may not be the best example, given that the tcpdump project will likely keep its sources around for a long time, but I hope the question is clear.

I think I understand the question. Rather than clone tcpdump and check out version "foo", clone svrwb-fuzz-benchmark-suite, which has the source of tcpdump version "foo".

I don't feel extremely strongly about this, but I think cloning the sources from upstream rather than getting them from svrwb-fuzz-benchmark-suite is better for a few reasons:

@inferno-chromium WDYT?

roachspray commented 4 years ago

@jonathanmetzman Sorry to bounce around from PR to Issue... any tips on how to debug the running of afl-fuzz in make run-$FUZZER-$BENCHMARK? That is, I am receiving:

[*] Attempting dry run with 'id:000000,orig:0b2b1eee048ff89e049e16139d6c760c4637878d'...
[*] Spinning up the fork server...

[-] Hmm, looks like the target binary terminated before we could complete a
    handshake with the injected code. Perhaps there is a horrible bug in the
    fuzzer. Poke <lcamtuf@coredump.cx> for troubleshooting tips.

[-] PROGRAM ABORT : Fork server handshake failed
         Location : init_forkserver(), afl-fuzz.c:2264

INFO:root:Doing final sync.

and I would like to use gdb to see what I broke in my harness, or whether it is something else. With this setup, can I just run my usual gdb commands, or is there something I should do to make it easier to debug? I can build with afl locally and have afl-fuzz run fine on my tcpdump harness, but it crashes in Docker.

Reference branch: https://github.com/veracode-research/fuzzbench/tree/svrwb_contrib_01a and the benchmark case above is tcpdump-4.9.0

jonathanmetzman commented 4 years ago

Have you tried make debug-afl-$BENCHMARK? This gives you a shell in the container that you can use to debug. I haven't tried gdb in this setup; I'm vaguely aware from OSS-Fuzz that there are some issues with using gdb in Docker.

roachspray commented 4 years ago

Have you tried make debug-afl-$BENCHMARK?

No; thank you...will look!

jonathanmetzman commented 4 years ago

Also, even though I generally like gdb for debugging (though I'm far from a wiz at it), I find it's not very helpful for debugging issues with afl targets if those targets need to be run by afl to reproduce the issue. Print statements are more helpful in these circumstances.

roachspray commented 4 years ago

Also, even though I generally like gdb for debugging (though I'm far from a wiz at it), I find it's not very helpful for debugging issues with afl targets if those targets need to be run by afl to reproduce the issue. Print statements are more helpful in these circumstances.

Agreed... I have had mixed results with gdb vs. printf for afl. Just not sure why things are OK in my world but not in Docker :-/ ... my main difference is that I am not using the libAFL.a route locally ("local" here meaning on my machine outside of Docker), which is a big difference, as I am still not that familiar with this setup. Another difference is that outside of Docker I do not need to specify -lssl -lcrypto -ldbus-1 for the fuzzer harness build.

All that aside, the jasper benchmark works... so perhaps I just need to take a closer look. I appreciate the help.

jonathanmetzman commented 4 years ago

I think the process for building afl fuzzers in FuzzBench is very similar to the one outlined here: https://github.com/llvm-mirror/compiler-rt/blob/master/lib/fuzzer/afl/afl_driver.cpp#L24 So you might be able to build the targets on your host and repro. Or you could copy the binaries right out of the container, though this assumes the issue isn't Docker-specific.
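
In case it helps, a rough sketch of that host-side repro, loosely following the recipe in the afl_driver.cpp header comment (the $AFL_HOME path and file names here are placeholders; see the link for the exact steps):

    # Hypothetical sketch; a real tcpdump repro would also need to link the
    # project's own libraries.
    clang++ -g -fsanitize-coverage=trace-pc-guard -c fuzz_target.cc   # the harness
    clang -c -w "$AFL_HOME/llvm_mode/afl-llvm-rt.o.c"                 # AFL's runtime
    clang++ afl_driver.cpp fuzz_target.o afl-llvm-rt.o.o -o fuzz_target_afl  # driver provides main()
    mkdir -p IN OUT && echo z > IN/z
    "$AFL_HOME/afl-fuzz" -i IN -o OUT ./fuzz_target_afl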

roachspray commented 4 years ago

Welp, my issue was found when looking at base-runner/Dockerfile to add gdb... :-/ Somehow I assumed the libs I added to base-builder would also be present at runtime :-/ Ack. Well, good. :)

jonathanmetzman commented 4 years ago

Ah, please add the dependencies to base-runner. (I guess for the runner we can't avoid having all dependencies in every image with the current model, but I'd like to change that at some point.)

jonathanmetzman commented 4 years ago

When you do this, could you if possible leave a comment stating which benchmark each dependency is for? That way it will be easier when I break it up. (Though now that I think about it, it may be best not to allow runtime dependencies in the future; OSS-Fuzz seems to do well with that model.)

roachspray commented 4 years ago

When you do this, could you if possible leave a comment stating which benchmark each dependency is for? That way it will be easier when I break it up. (Though now that I think about it, it may be best not to allow runtime dependencies in the future; OSS-Fuzz seems to do well with that model.)

Will do

inferno-chromium commented 4 years ago

In a real-world scenario, across 5000+ fuzz targets (a fuzz target is roughly what FuzzBench calls a benchmark) in Chromium, OSS-Fuzz, and Google, we always prefer having all the dependencies inside the project/benchmark archive. Managing system dependencies separately is a huge pain and is also bad for developer reproduction. You should be able to build everything in /out, put the dependencies there as well, and change the rpath of the main fuzz target binary to point to those dependencies. That way we just archive everything in /out and it works easily across any bots.
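
A minimal sketch of what that can look like in a benchmark's build.sh (the library names below are just examples based on the flags mentioned earlier in this thread, and patchelf is one way to set the rpath; chrpath would also work):

    # Hypothetical example: bundle runtime deps next to the target in /out and
    # make the binary look for them there via an $ORIGIN-relative rpath.
    # ldd "$OUT/tcpdump_fuzz" shows which shared libraries still need copying.
    cp /usr/lib/x86_64-linux-gnu/libdbus-1.so.3 "$OUT/"
    cp /usr/lib/x86_64-linux-gnu/libssl.so* /usr/lib/x86_64-linux-gnu/libcrypto.so* "$OUT/"
    patchelf --set-rpath '$ORIGIN' "$OUT/tcpdump_fuzz"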

If you just need some packages for debugging that are not related to a benchmark, we can easily add that helper stuff to base-runner or base-image (if both the builder and the runner need it).

roachspray commented 4 years ago

In a real-world scenario, across 5000+ fuzz targets (a fuzz target is roughly what FuzzBench calls a benchmark) in Chromium, OSS-Fuzz, and Google, we always prefer having all the dependencies inside the project/benchmark archive. Managing system dependencies separately is a huge pain and is also bad for developer reproduction. You should be able to build everything in /out, put the dependencies there as well, and change the rpath of the main fuzz target binary to point to those dependencies. That way we just archive everything in /out and it works easily across any bots.

That makes 100% sense. Adjusting my approach now.

If you just need some packages for debugging that are not related to a benchmark, we can easily add that helper stuff to base-runner or base-image (if both the builder and the runner need it).

Thank you!