core-explorer closed 2 weeks ago
Valgrind finds no problems with dwarfdump --check-loc elf.dbg
dwarfdump says: /tmp/dwarfdump ERROR: ERROR: dwarf_get_loclist_c fails: DW_DLE_LOCLISTS_ERROR: An lle entry begins past the end of its allowed space. Corrupt DWARF.. Attempting to continue.
CU Name = (indexed string: 0x00000001)main.cpp CU Producer = (indexed string: 0x00000000)Ubuntu clang version 18.1.3 (1ubuntu1) DIE OFF = 0x00000072 GOFF = 0x000000ec, Low PC = 0x00001140, High PC = 0x00001157
/tmp/dwarfdump ERROR: ERROR: Cannot get location list data: DW_DLE_LOCLISTS_ERROR: An lle entry begins past the end of its allowed space. Corrupt DWARF.. Attempting to continue.
CU Name = (indexed string: 0x00000001)main.cpp CU Producer = (indexed string: 0x00000000)Ubuntu clang version 18.1.3 (1ubuntu1) DIE OFF = 0x00000072 GOFF = 0x000000ec, Low PC = 0x00001140, High PC = 0x00001157
/tmp/dwarfdump ERROR: Cannot get location data, attr (with -M also form) follow: DW_DLE_LOCLISTS_ERROR: An lle entry begins past the end of its allowed space. Corrupt DWARF.. Attempting to continue.
CU Name = (indexed string: 0x00000001)main.cpp CU Producer = (indexed string: 0x00000000)Ubuntu clang version 18.1.3 (1ubuntu1) DIE OFF = 0x00000072 GOFF = 0x000000ec, Low PC = 0x00001140, High PC = 0x00001157
There were 3 DWARF errors reported: see ERROR above.
In other words, it's a fuzzed binary but dwarfdump behaves normally.
I do not have lldb available at this time.
The code in the current version reads:
179 if (data >= enddata) {
180 _dwarf_error_string(dbg,error,DW_DLE_LOCLISTS_ERROR,
181 "DW_DLE_LOCLISTS_ERROR: "
182 "An lle entry begins past the end of "
183 "its allowed space. Corrupt DWARF.");
184 return DW_DLV_ERROR;
185 }
186 startdata = data;
187 code = *data;
188 ++data;
189 ++count;
190 switch(code) {
So I presume you are not testing a current dwarfdump?
Thank you for your swift analysis. You are correct: I was not using the latest version. I have now updated and the issue persists. This is what I see:
code = *data; in read_single_lle_entry(), line 187, causes a segmentation fault because the value of data is not within any mapped memory region.
I used reverse debugging to track the origin of this value:
There is a validity check ensuring data < end immediately preceding line 187.
data is passed in from build_array_of_lle() in line 1142.
It is computed as data = rctx->ll_llepointer; in line 1121.
ll_llepointer is computed as llhead->ll_llepointer = lle_global_offset + dbg->de_debug_loclists.dss_data; in _dwarf_loclists_fill_in_lle_head() in line 1378.
And lle_global_offset is an attacker-controlled value read a couple of lines earlier.
A suitably chosen value for lle_global_offset will cause integer overflow in the address calculation, so that data < end holds even though data points outside the section.
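To illustrate the wraparound described above, here is a minimal sketch that models the address computation on plain 64-bit unsigned integers. It is not libdwarf code: model_add() and passes_end_check() are hypothetical names, and the addresses used with them are made up. Unsigned arithmetic is used because it wraps in a well-defined way, whereas out-of-bounds pointer arithmetic is undefined behavior in ISO C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of "data = section_base + offset", using
   integers in place of pointers. Unsigned addition wraps
   modulo 2^64, which is what a huge offset exploits. */
static uint64_t model_add(uint64_t base, uint64_t offset)
{
    return base + offset;
}

/* Hypothetical model of the check quoted in the thread: an entry
   is rejected only when data >= enddata, so a wrapped value that
   lands below enddata is accepted. */
static bool passes_end_check(uint64_t data, uint64_t enddata)
{
    return data < enddata;
}
```

With a made-up base of 0x7f0000001000 and a section end of 0x7f0000001100, an offset of 0xFFFFFFFFFFFFF000 wraps to 0x7f0000000000, which is below the end and so passes the check, while an ordinary out-of-range offset such as 0x200 is rejected.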
I cannot reproduce the crash when running under valgrind. I presume that is because valgrind maps memory at different addresses, so the address calculation does not overflow.
After cleaning up some stuff for a full retest I now find that building with -fsanitize=address finds a bug just as you reported:
> AddressSanitizer:DEADLYSIGNAL
> =================================================================
> ==2917684==ERROR: AddressSanitizer: SEGV on unknown address 0x505f00000524 (pc 0x56e55efed17b bp 0x7ffc3c2ef420 sp 0x7ffc3c2ef0b0 T0)
> ==2917684==The signal is caused by a READ memory access.
> #0 0x56e55efed17b in read_single_lle_entry ../../../../home/davea/dwarf/code/src/lib/libdwarf/dwarf_loclists.c:187
> #1 0x56e55eff74b8 in build_array_of_lle ../../../../home/davea/dwarf/code/src/lib/libdwarf/dwarf_loclists.c:1142
> #2 0x56e55eff74b8 in _dwarf_loclists_fill_in_lle_head ../../../../home/davea/dwarf/code/src/lib/libdwarf/dwarf_loclists.c:1381
> #3 0x56e55efdc767 in dwarf_get_loclist_c ../../../../home/davea/dwarf/code/src/lib/libdwarf/dwarf_loc.c:1719
> #4 0x56e55ee96e57 in print_location_list ../../../../home/davea/dwarf/code/src/bin/dwarfdump/print_die.c:6659
> #5 0x56e55eea59d3 in print_location_description ../../../../home/davea/dwarf/code/src/bin/dwarfdump/print_die.c:4482
> #6 0x56e55eea59d3 in print_attribute ../../../../home/davea/dwarf/code/src/bin/dwarfdump/print_die.c:5031
> #7 0x56e55eeaef24 in print_one_die ../../../../home/davea/dwarf/code/src/bin/dwarfdump/pr
Why this failed to reproduce earlier is a mystery. I don't like mysteries like this.
Yes, in _dwarf_loclists_fill_in_lle_head() we read in a loclistx value and fail to check it for sanity. Thank you for finding this.
I found this by accident. The fuzzer I used is also used by the ossfuzz project for libdwarf-code. I had a look at libdwarf-binary-samples, and I don't think that is a good set of binaries to fuzz libdwarf.
Very interesting observations. I will be thinking about this and about what might be done. As you observe, all the test cases and the fuzzing harness were created by Google folks. I did not think of them as 'for beginners' (though I do think that's apt) but as 'how to violate the rules of using the library without literally violating them', and it was quite effective. I did fix some of the example source, as passing random pointers (uninitialized fields) to the library would make the test results not reliably reproducible.
It will be a few days before I can do anything about the clear bug you found: there is a confusing issue related to the non-standard DWARF5 GNU split-dwarf extension that I need to work on. It may be a few (more) days before I get back to this issue. But I will. Let's leave this open.
Thank you for your work on this.
FYI
I have found four places, between dwarf_loclists.c and dwarf_rnglists.c, where a value later used is not checked for sanity. The READ_UNALIGNED_CK() macro is fine, but not all uses have the proper test for sanity following (some cases, like version and address_size, are checked later).
I meant that the C source files in binary-samples-v2/src are taken from https://github.com/DarrenRainey/C-Examples which literally claims "Example code written in C for beginners". These examples do not include a linked list.
I've been parsing the debug information for glibc internals, and that is very different C source code (plus the occasional assembly). But maybe that doesn't actually matter for libdwarf; it's just more tags and attributes. You probably want to cover recursive and nested data structures as well as function inlining, and lots of references between things.
I would want to see a lambda expression in the C++ source. That language is full of opportunities where the compiler needs to do a lot of magic that needs to be captured in the debug information.
I just pushed updates with four new checks for bad input read from disk (by libdwarf), in rangelists and loclists.
Now your testcase generates an error, as we would hope.
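The pushed checks themselves are not shown in this thread. As a rough sketch of the general idea only, an offset read from disk can be compared against the section size before any pointer is formed from it, so the comparison cannot be defeated by address wraparound. The helper section_ptr_checked() and its parameters are hypothetical and not part of libdwarf's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libdwarf's API. Returns a pointer to
   the byte at `offset` inside the section, or NULL when the
   offset is at or past the end of the section. The test is made
   on the offset itself, before any pointer arithmetic. */
static const uint8_t *
section_ptr_checked(const uint8_t *section,
    uint64_t section_size,
    uint64_t offset)
{
    if (offset >= section_size) {
        return NULL; /* caller should report corrupt DWARF */
    }
    return section + offset;
}
```

A caller would treat a NULL result as corrupt input and return an error instead of dereferencing the pointer.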
I believe this is now fixed.
Pushed the fix, so closing this.
I built libdwarf-code with the afl++ fuzzer as compiler and used a set of minimal C files compiled with gcc and clang as seed inputs. My version is origin/main (459c9153, 10 commits after v0.9.2)
This is the backtrace:
I have not investigated further.
The input file is attached. I can share my fuzzing setup if there is interest. elf.dbg.gz