ianlancetaylor / libbacktrace

A C library that may be linked into a C/C++ program to produce symbolic backtraces
Other
944 stars 220 forks source link

Stack overflow in `elf_add` if debug info found by build id contains MiniDebugInfo #129

Closed godlygeek closed 2 months ago

godlygeek commented 2 months ago

If a shared library contains an NT_GNU_BUILD_ID note, elf_add will attempt to load its debug info from /usr/lib/debug/.build-id, recursing to process the located debug info if it's found. If that debug info contains a .gnu_debugdata section, elf_add decompresses that MiniDebugInfo and recurses again to process it. If the MiniDebugInfo contains the same NT_GNU_BUILD_ID note as the original shared library, this pairwise recursion will repeat forever, until either the ulimit -n limit on the number of open file descriptors is reached or the stack overflows.

For a minimal-ish reproducer:

#!/bin/bash

git clone https://github.com/ianlancetaylor/libbacktrace.git
(cd libbacktrace && ./configure --prefix=$PWD/../libbacktrace-install && make -j install)

cat >main.c <<"EOF"
#include <stdint.h>
#include <stdio.h>

#include <backtrace.h>

void err_callback(void *data, const char *msg, int errnum)
{
    printf("Error: %s\n", msg);
}

int frame_callback(void *data, uintptr_t pc, const char *filename, int lineno, const char *function)
{
    printf("%p\n", (void*)pc);
    return 0;
}

int main()
{
    struct backtrace_state *state = backtrace_create_state("", 1, err_callback, 0);
    if (!state) {
        return 1;
    }
    backtrace_full(state, 0, frame_callback, err_callback, 0);
    return 0;
}
EOF

gcc -g -c main.c -I./libbacktrace-install/include/ -o main.o
gcc main.o -L./libbacktrace-install/lib/ -lbacktrace -o main

# Produce MiniDebugInfo a la https://sourceware.org/gdb/current/onlinedocs/gdb.html/MiniDebugInfo.html

# Extract the list of dynamic symbols from the binary
nm -D main --format=posix --defined-only | awk '{ print $1 }' | sort > dynsyms.txt

# Extract all the text (i.e. function) symbols from the debuginfo.
nm main --format=posix --defined-only | awk '{ if ($2 == "T" || $2 == "t" || $2 == "D") print $1 }' | sort > funcsyms.txt

# Keep all the function symbols not already in the dynamic symbol table.
comm -13 dynsyms.txt funcsyms.txt > symbols_to_keep.txt

# Separate full debug info into debug binary.
objcopy --only-keep-debug main main.debug

# Copy the full debuginfo, keeping only a minimal set of symbols.
objcopy -S --remove-section .gdb_index --remove-section .comment --keep-symbols=symbols_to_keep.txt main.debug mini_debuginfo

# Drop the full debug info from the original binary.
strip --strip-all -R .comment main

# Compress the debug data and inject it into the .gnu_debugdata section of
# both the original binary and the debug binary.
xz -f mini_debuginfo
objcopy --add-section .gnu_debugdata=mini_debuginfo.xz main.debug
objcopy --add-section .gnu_debugdata=mini_debuginfo.xz main

# Install the debug binary into /usr/lib/debug/.build-id
build_id=$(readelf -n main | sed -n '/^.*Build ID:\s*/s///p')
first_two=${build_id:0:2}
rest=${build_id:2}
echo "Installing /usr/lib/debug/.build-id/$first_two/$rest.debug"
sudo sh -c "mkdir -p /usr/lib/debug/.build-id/$first_two && mv main.debug /usr/lib/debug/.build-id/$first_two/$rest.debug"

# Raise the limit on the number of open files
ulimit -n 10000

# Cause a stack overflow
./main

Tracked down while investigating https://github.com/bloomberg/memray/issues/636

ianlancetaylor commented 2 months ago

Does it matter that this is happening in a shared library?

godlygeek commented 2 months ago

No, apparently not. I've updated my reproducer to drop the shared library and move everything directly into the main executable.

ianlancetaylor commented 2 months ago

Thanks.

Is there a reason for this line?

objcopy --add-section .gnu_debugdata=mini_debuginfo.xz main.debug

Here main.debug already contains debug info, so why bother to add a .gnu_debugdata section that points to another copy of the debug info? Am I missing something there? Thanks.

godlygeek commented 2 months ago

You're not missing something, it is indeed a weird thing to do, but strange debug info shouldn't crash libbacktrace 😄

The real world case where we encountered this isn't quite as strange. The Memray memory profiler vendors a copy of libbacktrace, which we patch to support lookups against debuginfod. We hit this stack overflow because a file had been uploaded to our debuginfod server that didn't actually contain full debug info, only the .gnu_debugdata section containing the MiniDebugInfo.

These steps are considerably closer to the "real" case where we hit this issue:

...
objcopy --only-keep-debug main main.debug
objcopy -S --remove-section .gdb_index --remove-section .comment --keep-symbols=symbols_to_keep.txt main.debug mini_debuginfo
strip --strip-all -R .comment main

xz -f mini_debuginfo
objcopy --add-section .gnu_debugdata=mini_debuginfo.xz main

# Install the main binary itself into /usr/lib/debug/.build-id
build_id=$(readelf -n main | sed -n '/^.*Build ID:\s*/s///p')
first_two=${build_id:0:2}
rest=${build_id:2}
echo "Installing /usr/lib/debug/.build-id/$first_two/$rest.debug"
sudo sh -c "mkdir -p /usr/lib/debug/.build-id/$first_two && cp main /usr/lib/debug/.build-id/$first_two/$rest.debug"
...

Same idea as above, except the actual split debug info is discarded, and the /usr/lib/debug/.build-id directory is populated with a copy of the executable itself, with only the MiniDebugInfo and not the full debug info.

godlygeek commented 2 months ago

I'm updating Memray's vendored libbacktrace to do:

diff --git a/elf.c b/elf.c
index 107e26c..e62668b 100644
--- a/elf.c
+++ b/elf.c
@@ -6841,7 +6876,8 @@ elf_add (struct backtrace_state *state, const char *filename, int descriptor,
        }
    }

-      if (!gnu_debugdata_view_valid
+      if (!debuginfo
+     && !gnu_debugdata_view_valid
      && strcmp (name, ".gnu_debugdata") == 0)
    {
      if (!elf_get_view (state, descriptor, memory, memory_size,

I'm pretty sure that's correct: if we've already found real debug info, we can just ignore any MiniDebugInfo. Let me know if I have that wrong 😅

ianlancetaylor commented 2 months ago

Thanks, this should be fixed in the repo now.