google / oss-fuzz

OSS-Fuzz - continuous fuzzing for open source software.
https://google.github.io/oss-fuzz
Apache License 2.0

need better interface to gdb debugger on Linux #11537

Open jreiser opened 10 months ago

jreiser commented 10 months ago

oss-fuzz should improve the interface between the fuzzer harness and the gdb debugger on Linux. By using "gdb server", the valgrind and qemu projects both have a vastly better interface to gdb than oss-fuzz. For instance, valgrind --vgdb=yes lets you invoke gdb my_executable; target remote in another terminal, and activates that gdb at the exact point of every detected error. Similarly, qemu-${ARCH}-static -g 1234 lets you multi-arch-gdb my_executable; target remote :1234 to debug any executable anywhere, with full debugger control.

As noted in https://google.github.io/oss-fuzz/advanced-topics/debugging/#debugging-fuzzers-with-gdb , "The base-runner-debug image does not have access to your sources, so you will not be able to do source code level debugging." This limitation prevents effective debugging.

The next sentence continues: "We recommend integrating your fuzz target upstream as part of [ideal integration](https://google.github.io/oss-fuzz/advanced-topics/ideal-integration/) for debugging purposes." But the page https://google.github.io/oss-fuzz/advanced-topics/ideal-integration/ highlights serious problems:

For every fuzz target foo in the project, there is a build rule that builds foo_fuzzer, a binary that:

    Contains the fuzzing entry point.
    Contains (LLVMFuzzerTestOneInput) and all the code it depends on.
    Uses the main() function from $LIB_FUZZING_ENGINE (env var [provided](https://google.github.io/oss-fuzz/getting-started/new-project-guide/) by OSS-Fuzz environment).

Those directions are deficient because there is no completely worked literal example, for instance an actual Makefile that builds an actual C-language printf("Hello world!\n"); main program. Please also give an actual current value for $LIB_FUZZING_ENGINE, along with the location of a copy-and-paste OSS-Fuzz environment. Both the name of the shell variable and an actual literal example value must be provided. Imitating a literal example is easier, faster, and more informative.

And finally, my actual experience of debugging with oss-fuzz. My project is UPX https://github.com/upx/upx . The project uses CMake, and produces both a release and a debug variant (even before the fuzzer gets added). The debug build already integrates the Address sanitizer and the Undefined Behavior sanitizer of C/C++ compiled by gcc and/or clang. So UPX gets some of the benefit of the sanitizers even without OSS-Fuzz. But when the debug variant of UPX is fuzzed, interactive debugging is impossible because the fuzzer (when wrapped around the existing debug variant) does not co-operate with gdb. In order to debug a fuzzer-reported issue, I must guess the environment that the fuzzer set up before invoking the debuggee. In the case of fuzzing UPX, I am fortunate that the name of the testcase implies the parameters that were supplied: upx -t or upx -l. Thus I can re-run UPX directly under gdb with equivalent parameters and the name of the testcase, then debug interactively that way.

Therefore: the Detailed Report should state explicitly the environment variables that the fuzzer set, the values of the sanitizer global variables that it sets, and the equivalent command-line invocation (execve) of the debuggee. The fuzzer invocation should also optionally enable connection to a remote gdb, much like valgrind or qemu.

DavidKorczynski commented 10 months ago

> the Detailed report should state explicitly the environment variables that the fuzzer set

You can extract these when the fuzzers are built, as they are printed to stdout. Specifically for UPX you can do:

ASAN:

python3 infra/helper.py build_fuzzers --sanitizer=address upx
...
...
---------------------------------------------------------------
CC=clang
CXX=clang++
CFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize=fuzzer-no-link
CXXFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize=fuzzer-no-link -stdlib=libc++
RUSTFLAGS=--cfg fuzzing -Zsanitizer=address -Cdebuginfo=1 -Cforce-frame-pointers
---------------------------------------------------------------

and for UBSAN:

python3 infra/helper.py build_fuzzers --sanitizer=undefined upx
...
...
---------------------------------------------------------------
CC=clang
CXX=clang++
CFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unsigned-integer-overflow,unreachable,vla-bound,vptr -fno-sanitize-recover=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr -fsanitize=fuzzer-no-link
CXXFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unsigned-integer-overflow,unreachable,vla-bound,vptr -fno-sanitize-recover=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr -fsanitize=fuzzer-no-link -stdlib=libc++
RUSTFLAGS=--cfg fuzzing -Cdebuginfo=1 -Cforce-frame-pointers
---------------------------------------------------------------
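Since the full build log can be long, the flag block shown above can be pulled out mechanically. The following is a minimal sketch, assuming the block is the first run of NAME=value lines enclosed by two lines of dashes, as in the output above. The function name `extract_build_env` and the shortened sample log are illustrative and not part of infra/helper.py.

```python
import re

def extract_build_env(log_text):
    """Return the NAME=value lines found between the first two dashed lines."""
    env = {}
    inside = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"-{10,}", stripped):
            if inside:
                break          # second separator: the block is complete
            inside = True      # first separator: the block starts here
            continue
        if inside and "=" in stripped:
            name, value = stripped.split("=", 1)
            env[name] = value
    return env

# Shortened, illustrative sample; a real log carries the full flag lists.
sample_log = """\
...
---------------------------------------------------------------
CC=clang
CXX=clang++
CFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only
---------------------------------------------------------------
later build output
"""

build_env = extract_build_env(sample_log)
for name, value in build_env.items():
    print(f"{name}={value}")
```

In practice the log text would come from saving the stdout of the build command to a file and reading it back.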

> the values of the sanitizer global variables that it set

I'm not sure specifically what you're referring to here. Do you mean which of the following flags are used: https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags ? Or something else?

> In order to debug a fuzzer-reported issue, then I must guess the environment that the fuzzer set up before invoking the debuggee.

If you build using the above compilation flags you should be able to reproduce this without guessing?

jreiser commented 10 months ago

On 1/29/24 04:41, DavidKorczynski wrote:

> > the Detailed report should state explicitly the environment variables that the fuzzer set
>
> You can extract these when the fuzzers are built, as they are printed to stdout. Specifically for UPX you can do:
>
> ASAN:
>
> python3 infra/helper.py build_fuzzers --sanitizer=address upx
> ...
> ...
> ---------------------------------------------------------------
> CC=clang
> CXX=clang++
> CFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize=fuzzer-no-link
> CXXFLAGS=-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize=fuzzer-no-link -stdlib=libc++
> RUSTFLAGS=--cfg fuzzing -Zsanitizer=address -Cdebuginfo=1 -Cforce-frame-pointers
> [[snip]]

Thank you for the quick response.

However, the commentary from "infra/helper.py build_fuzzers" covers only the compile time of the fuzzer executable. Additional things matter at run time.

> > the values of the sanitizer global variables that it set
>
> I'm not sure specifically what you're referring to here. Do you mean which of the following flags are used: https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags ? Or something else?

Yes, those are the names and default values of the sanitizer run-time variables that matter. But I need to know the actual values that the fuzzer specified for any particular Detailed Report.
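To make "the actual values" concrete, here is a minimal sketch of listing the sanitizer option variables found in an environment. It assumes the options are passed through variables such as ASAN_OPTIONS as colon-separated name=value pairs. The sample values are illustrative only and are not claimed to be what OSS-Fuzz sets, and both function names are hypothetical.

```python
def parse_sanitizer_options(value):
    """Split a colon-separated 'name=value' option string into a dict.

    Assumption: no option value itself contains a colon (a path with a
    colon in it would be split incorrectly by this simple sketch).
    """
    options = {}
    for item in value.split(":"):
        if not item:
            continue
        name, _, setting = item.partition("=")
        options[name] = setting
    return options

def sanitizer_env(environ):
    """Collect the sanitizer option variables present in an environment mapping."""
    names = ("ASAN_OPTIONS", "UBSAN_OPTIONS", "MSAN_OPTIONS")
    return {n: parse_sanitizer_options(environ[n]) for n in names if n in environ}

# Illustrative environment; in a live process one would pass os.environ.
example = {"ASAN_OPTIONS": "detect_leaks=1:symbolize=1", "HOME": "/root"}
print(sanitizer_env(example))
```

A report that printed this mapping for the crashing run would answer the question without the reader having to know how the fuzzer configures the sanitizer.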

> > In order to debug a fuzzer-reported issue, then I must guess the environment that the fuzzer set up before invoking the debuggee.
>
> If you build using the above compilation flags you should be able to reproduce this without guessing?

I cannot find where the compilation flags for a fuzzer build specify the effective argv[] and envp[] (for values that are controlled by the fuzzer and/or sanitizer) that upx/main() sees at run time. The name of the environment variable whose value is the run-time argument list is not documented. And the output from infra/helper.py is thousands of lines long. I want the effective argv[] and envp[] to be listed literally in the Detailed Report.

More generally: the less that I must know about the fuzzer, the better. On Linux or any *nix system, and for any run-time analyzer such as a fuzzer, valgrind, qemu, etc., there is a point where the analyzer makes a subroutine call that is equivalent to the shell invocation of execve(path, argv, envp) for the debuggee. This is a cut-point in the graph for understanding what executes. This makes the equivalent execve() supremely important, and it should be listed explicitly in the Detailed Report.

Once I know the equivalent execve(), I can invoke any program processor: the fuzzer, any other analyzer (such as qemu, valgrind, ...), any debugger, etc. In particular, the equivalent execve() names the Reproducible Testcase file. For valgrind and qemu, the equivalent execve() is obvious: the tail of the command line which invoked valgrind or qemu. [In case of doubt, a command-line argument of "--" (two minus signs) signifies the end of parameters to the analyzer, and the start of parameters to the debuggee.] But so far, in all the Detailed Reports that I have seen, the fuzzer hides the equivalent execve(). Don't.
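As an illustration of what "listed explicitly" could look like, here is a minimal sketch that renders a path, argv and a few environment variables as one copy-and-paste shell line. The path, arguments and variable values are hypothetical, and `format_execve` is not an existing OSS-Fuzz function.

```python
import shlex

def format_execve(path, argv, env):
    """Render path, argv and selected environment as one shell command line.

    argv[0] is replaced by path, which is what a shell would execute.
    """
    assignments = [f"{name}={shlex.quote(value)}" for name, value in sorted(env.items())]
    words = ["env"] + assignments + [shlex.quote(path)]
    words += [shlex.quote(arg) for arg in argv[1:]]
    return " ".join(words)

# Hypothetical values, chosen only to show the quoting.
command = format_execve(
    "./upx",
    ["upx", "-t", "testcase file"],
    {"ASAN_OPTIONS": "detect_leaks=1"},
)
print(command)  # env ASAN_OPTIONS=detect_leaks=1 ./upx -t 'testcase file'
```

The resulting line can be prefixed with any analyzer or debugger, which is the point of treating execve() as the cut-point.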

How I debug so far: guess the equivalent command line from the name of the test, invoke the debugger gdb on a upx which is the debug variant produced by our usual cmake build (contains Address sanitizer and Undefined Behavior sanitizer), specify the Reproducible Testcase file as input, plant a breakpoint at a likely spot determined from the traceback in the Detailed Report, and debug once the breakpoint is hit. What I want is to construct the invocation of gdb directly using copy+paste from the Detailed Report. This would shorten the learning curve (no need to know about, or wade through the output from, infra/helper.py), and reduce the time from Detailed Report to point-of-error in gdb.
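The manual steps above could be sketched roughly as follows: take the first stack frame with a file and line from a sanitizer-style traceback, then build a gdb command that plants a breakpoint there. This assumes frames of the form `#N 0xADDR in function file:line:column`. The sample report, the program path and its arguments are made up for illustration.

```python
import re
import shlex

# Matches e.g. "#0 0x4f5b8e in parse_header /src/project/parser.c:42:7"
FRAME = re.compile(r"#\d+\s+0x[0-9a-fA-F]+\s+in\s+.+\s(\S+?):(\d+)(?::\d+)?\s*$")

def first_source_frame(report_text):
    """Return (file, line) of the first frame carrying source info, or None."""
    for line in report_text.splitlines():
        match = FRAME.search(line)
        if match:
            return match.group(1), int(match.group(2))
    return None

def gdb_command(program, args, location):
    """Build a gdb command line that breaks at location, then runs the program."""
    file_name, line_no = location
    words = ["gdb", "-ex", f"break {file_name}:{line_no}", "-ex", "run",
             "--args", program] + list(args)
    return " ".join(shlex.quote(word) for word in words)

# Made-up report text in the usual sanitizer traceback layout.
sample_report = """\
==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000014
    #0 0x4f5b8e in parse_header /src/project/parser.c:42:7
    #1 0x4f6a10 in LLVMFuzzerTestOneInput /src/project/fuzz_target.c:12:3
"""

location = first_source_frame(sample_report)
print(gdb_command("./upx", ["-t", "testcase"], location))
```

Frames that show only a module offset instead of file:line are skipped by the pattern, so a build with line tables is assumed.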

[Both valgrind and qemu make it easy to use gdbserver, so that going from the Detailed Report to the debugger at that point of error is instantaneous. So nice and fast!]

-- John

DavidKorczynski commented 10 months ago

@jreiser I think I understand the situation now -- I think it's likely because you didn't write the harnesses that things got confusing.

There are currently three fuzzers for UPX

The source code of each of these is just a few lines; however, the arguments to upx_main are hardcoded:

Does this provide some clarification?

DavidKorczynski commented 10 months ago

For reference, the fuzzers I linked to are all "in-process" fuzzing, meaning `LLVMFuzzerTestOneInput` in each of the fuzzers will be called N (many) times in each process. Thus, `upx_main` is called over and over again inside the same process by the fuzzer.