google / atheris

Apache License 2.0
Several issues/questions #3

Closed DavidKorczynski closed 3 years ago

DavidKorczynski commented 3 years ago

I have spent some time trying to fuzz a native library with Atheris; however, I seem to have run into some issues.

Consider the PR here: https://github.com/google/oss-fuzz/pull/4754

Some of the questions I have:

1) When fuzzing an extension, the Python code I target does some initial processing on the data, so the native code is not hit in the first two iterations of libFuzzer. libFuzzer then complains that there is no coverage and exits. I feel this is somewhat of a limitation: we should allow the fuzzer to run for a while, i.e. naturally explore the Python code and eventually reach the native code. I am not sure if I am completely off here, but this has caused issues for me for a while.

2) What is the expected behaviour when providing command-line arguments to Atheris, in particular corpus and seed files?

3) Finally, regarding compilation on OSS-Fuzz, what is the expected linking approach? Do we need to do the final linking of the native code with clang++?

TheShiftedBit commented 3 years ago

Hi David,

Sorry it took me so long to get back to you; I've been working on solving question #1. Basically, the problem is this: when ASan or UBSan is preloaded, it defines certain coverage symbols, and those take precedence over the symbols defined in libFuzzer (which is loaded later). When everything is statically linked, this is solved by ASan/UBSan having weak versions of those symbols, meaning the libFuzzer symbols take precedence. But when the sanitizers are preloaded, that doesn't work, and libFuzzer doesn't get coverage info. This actually affects native code too, not just Python code.

I've got a solution, but it's horrid. It involves generating patched versions of ASan/UBSan that don't export the weak symbols that should belong to libFuzzer. However, the patch is really fragile. Better solutions would be to either (1) modify LLVM to generate these modified ASan and UBSan shared objects itself, or (2) statically link ASan/UBSan and libFuzzer into the Python runtime, and not link libFuzzer into the Atheris extension.

For question #2, everything should work just like libFuzzer.
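As a rough sketch of what "just like libFuzzer" means in practice, the harness below forwards the process arguments to `atheris.Setup`, so flags and corpus directories are interpreted by the fuzzing engine and not by the script. The target function and the use of `zlib` are assumptions chosen for illustration; `atheris.Setup` and `atheris.Fuzz` are the calls documented in the Atheris README. `main` is left uncalled here so the target can be imported and exercised without Atheris installed.

```python
import zlib


def TestOneInput(data: bytes) -> None:
    """Fuzz target: feed raw bytes to zlib and ignore expected decode errors."""
    try:
        zlib.decompress(data)
    except zlib.error:
        pass  # malformed input is expected here, not a bug


def main(argv) -> None:
    """Hand the command line to Atheris and start fuzzing."""
    # Imported lazily so TestOneInput can be used without Atheris installed.
    import atheris

    # argv is forwarded unchanged: libFuzzer-style flags and corpus
    # directories are parsed by the fuzzing engine, not by this script.
    atheris.Setup(argv, TestOneInput)
    atheris.Fuzz()
```

Assuming the file is saved as `fuzz_zlib.py` (a hypothetical name) and ends with a call to `main(sys.argv)`, an invocation such as `python fuzz_zlib.py corpus_dir/ -max_total_time=60` would follow the usual libFuzzer conventions: positional directories are treated as corpus directories, and dash-prefixed options such as `-max_total_time` or `-runs` are libFuzzer flags.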

For question #3, I think g++ would work too, as long as the pieces were compiled with clang++, but I'm not sure about that.

TheShiftedBit commented 3 years ago

Hi David,

I've updated Atheris with a solution to question #1. It doesn't work (yet) with OSS-Fuzz, but it seems to solve all the problems. It's much less horrid than my other solution, thankfully!

DavidKorczynski commented 3 years ago

Thanks a lot for your explanation as well as the update. Looking forward to following this project.