ianlancetaylor / libbacktrace

A C library that may be linked into a C/C++ program to produce symbolic backtraces

Libbacktrace fails to resolve symbols if debug/symbol information resides (only) in a separate file #113

Closed pickard1 closed 6 months ago

pickard1 commented 11 months ago

I'm not sure if I'm doing anything wrong yet, but symbols are not resolved if they are found only in a separate file (.gnu_debuglink method). Debugging shows that the separate file is found, but when elf_add is called recursively for it, debuginfo is set to 1. As a result, the symbol table of that file is not parsed/loaded. If I change the code in elf_add() as shown here, everything works.

if (symtab_shndx == 0)
  symtab_shndx = dynsym_shndx;
//if (symtab_shndx != 0 && !debuginfo)
if (symtab_shndx != 0)

Where is the mistake?
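The effect of the proposed change can be illustrated with a simplified model of the check quoted above. This is only a sketch, not libbacktrace's actual elf_add(): the struct `elf_sections`, the function `symtab_to_load`, and the `patched` flag are hypothetical names introduced for illustration. Only `symtab_shndx`, `dynsym_shndx`, and `debuginfo` come from the snippet in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the section indexes found in one ELF file.
   An index of 0 means "section not present". */
struct elf_sections
{
  unsigned int symtab_shndx;
  unsigned int dynsym_shndx;
};

/* Simplified model of the check discussed above (not the real code).
   Returns the index of the symbol table that would be loaded, or 0 if
   no symbol table is loaded for this file.  */
static unsigned int
symtab_to_load (const struct elf_sections *s, bool debuginfo, bool patched)
{
  unsigned int symtab_shndx;

  assert (s != NULL);

  /* Fall back to the dynamic symbol table if there is no .symtab.  */
  symtab_shndx = s->symtab_shndx;
  if (symtab_shndx == 0)
    symtab_shndx = s->dynsym_shndx;

  /* Original condition: skip symbols when reading a separate debug file.
     Proposed condition: load them whenever a table exists.  */
  if (!patched && debuginfo)
    return 0;
  return symtab_shndx;
}
```

In this model, a fully stripped main binary yields 0 in both variants, while the separate debug file (read with debuginfo set) yields its symbol table index only in the patched variant. That matches the behavior described in the report.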

ianlancetaylor commented 11 months ago

Can you provide the exact steps you are using? There is already a test for using .gnu_debuglink (btest_gnudebuglink); is that test passing for you? If so, what are you doing differently? Thanks.

pickard1 commented 11 months ago

I'm not sure how to change Makefile.am so that the test is built. Even if I add btest_gnudebuglink to any of the variables check_PROGRAMS, TESTS, MAKETESTS or BUILDTESTS, nothing is built. What am I doing wrong?

What I did in my test is relatively simple. I built our application without dynamic symbols and artificially added a segfault exception. Then I called the following:

objcopy --only-keep-debug IPEmotionRT IPEmotionRT.debug
objcopy --add-gnu-debuglink=IPEmotionRT.debug IPEmotionRT
strip --strip-all IPEmotionRT

When I then start the application, our backtrace handler is called, but the symbols of our application are not resolved.
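For context, the .gnu_debuglink section that the second objcopy command adds has a simple layout: a NUL-terminated file name, zero padding up to the next 4-byte boundary, and a 4-byte CRC32 of the debug file, stored in the endianness of the ELF file. The following sketch parses such a section payload. The names `struct debuglink` and `parse_debuglink` are hypothetical, and the code assumes a little-endian file (as on the aarch64 target mentioned in this thread). It is not how libbacktrace itself is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parsed contents of a .gnu_debuglink section (hypothetical helper type). */
struct debuglink
{
  const char *filename; /* points into the section data */
  uint32_t crc32;       /* CRC32 of the separate debug file */
};

/* Parse the raw bytes of a .gnu_debuglink section taken from a
   little-endian ELF file.  Returns 1 on success, 0 if the data is
   malformed.  */
static int
parse_debuglink (const unsigned char *data, size_t size,
                 struct debuglink *out)
{
  const unsigned char *nul;
  size_t name_len;
  size_t crc_off;

  assert (data != NULL && out != NULL);

  /* The file name must be NUL-terminated inside the section.  */
  nul = memchr (data, '\0', size);
  if (nul == NULL)
    return 0;

  name_len = (size_t) (nul - data) + 1;       /* include the NUL */
  crc_off = (name_len + 3) & ~(size_t) 3;     /* round up to 4 bytes */
  if (crc_off > size || size - crc_off < 4)
    return 0;

  out->filename = (const char *) data;
  out->crc32 = (uint32_t) data[crc_off]
               | (uint32_t) data[crc_off + 1] << 8
               | (uint32_t) data[crc_off + 2] << 16
               | (uint32_t) data[crc_off + 3] << 24;
  return 1;
}
```

For the file name IPEmotionRT.debug (17 characters plus the NUL), the CRC would start at offset 20, giving a 24-byte section.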

ianlancetaylor commented 11 months ago

The test should be built automatically if the configure script detects that your system supports the required features.

I don't understand the commands that you list, because the second objcopy command has only a single filename. objcopy requires two file names.

Either way, does it help if, instead of using separate objcopy and strip commands, you run

objcopy --strip-debug --add-gnu-debuglink=IPEmotionRT.debug IPEmotionRT IPEmotionRT.stripped

Then you don't need to run strip.

If that doesn't help please show me the exact commands that you are running.

pickard1 commented 11 months ago

So something must be missing on our system if the test is not built automatically. Could it have something to do with the fact that we're cross-compiling? The host system is Linux/x86. The target system is Linux/aarch64. The build environment is Buildroot.

I can't rule out that we made a mistake when generating the binary, but I don't understand why our commands should be wrong. The “out-file” parameter of objcopy is optional. The result file in your example is much larger, which suggests that symbols are still there. objdump also confirms this.

This is the result of your call:

-rwxr-xr-x 1 fabrice fabrice 7088624 Oct 6 06:58 IPEmotionRT
-rwxr-xr-x 1 fabrice fabrice 2795952 Sep 28 13:47 IPEmotionRT.debug
-rwxr-xr-x 1 fabrice fabrice 7071928 Oct 6 07:00 IPEmotionRT.stripped

and with objdump -t IPEmotionRT.stripped you can still see symbols:

IPEmotionRT.stripped: file format elf64-littleaarch64

SYMBOL TABLE:
000000000040028c l O .note.ABI-tag 0000000000000020 abi_tag
000000000047b914 l F .text 0000000000000014 call_weak_fn
0000000000481e1c l F .text 00000000000000b4 _ZSt13adjust_heapIN9__gnu_cxx17normal_iteratorIPcSt6vectorIcSaIcEEEElcNS0_5ops15_Iter_less_iterEEvT_T0_SA_T1T2.isra.0
000000000047bc18 l F .text 000000000000002c _ZNSt6bitsetILm256EE9referenceaSEb.isra.0
00000000004821d0 l F .text 000000000000003c _ZNSt8_Rb_treeIlSt4pairIKllESt10_Select1stIS2_ESt4lessIlESaIS2_EE8_M_eraseEPSt13_Rb_tree_nodeIS2_E.isra.0
000000000047bc44 l F .text 0000000000000038 _ZNSt7__cxx1112regex_traitsIcE10RegexMaskoRES2.isra.0
...........

Consequently, the symbols are also resolved because they are still present in the binary.

With my commands, the result looks like this:

-rwxr-xr-x 1 fabrice fabrice 4295824 Oct 6 07:14 IPEmotionRT
-rwxr-xr-x 1 fabrice fabrice 2795952 Oct 6 07:14 IPEmotionRT.debug

You can see that the binary is a lot smaller, which makes sense because the static symbols are missing. objdump confirms this.

objdump -t IPEmotionRT returns:

IPEmotionRT: file format elf64-littleaarch64

SYMBOL TABLE:
no symbols
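The difference between the two binaries comes down to whether the file still contains a static symbol table section. A rough sketch of that check, scanning the section headers of an in-memory ELF64 image for a section of type SHT_SYMTAB, could look as follows. The function name `elf64_has_symtab` is hypothetical. The code assumes a Linux `<elf.h>`, assumes that the file has the same byte order as the machine running the check, and ignores the extended section numbering used by files with very many sections.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the ELF64 image has a SHT_SYMTAB section (.symtab),
   0 if it has none, and -1 if the image is not a usable ELF64 file.  */
static int
elf64_has_symtab (const unsigned char *image, size_t size)
{
  Elf64_Ehdr ehdr;
  size_t i;

  assert (image != NULL);

  if (size < sizeof ehdr)
    return -1;
  memcpy (&ehdr, image, sizeof ehdr);
  if (memcmp (ehdr.e_ident, ELFMAG, SELFMAG) != 0
      || ehdr.e_ident[EI_CLASS] != ELFCLASS64)
    return -1;

  /* No section headers at all means no symbol table.  */
  if (ehdr.e_shnum == 0)
    return 0;

  /* The section header table must lie completely inside the image.  */
  if (ehdr.e_shentsize != sizeof (Elf64_Shdr)
      || ehdr.e_shoff > size
      || (size - ehdr.e_shoff) / sizeof (Elf64_Shdr) < ehdr.e_shnum)
    return -1;

  for (i = 0; i < ehdr.e_shnum; i++)
    {
      Elf64_Shdr shdr;

      memcpy (&shdr, image + ehdr.e_shoff + i * sizeof shdr, sizeof shdr);
      if (shdr.sh_type == SHT_SYMTAB)
        return 1;
    }
  return 0;
}
```

Under these assumptions, the binary processed with strip --strip-all would be expected to give 0, while IPEmotionRT.debug and IPEmotionRT.stripped would be expected to give 1.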

ianlancetaylor commented 11 months ago

Thanks, I think I see what you are getting at.

pickard1 commented 10 months ago

Is there anything else I can do? Can I help in any way?

pickard1 commented 9 months ago

Sorry that I have to follow up again here. Is there a schedule for when you can take a look at my problem?

pickard1 commented 6 months ago

Sorry, I have to ask again: is there now a plan for when you will work on this?

ianlancetaylor commented 6 months ago

Thanks. It took me a while, but your suggested patch is correct. I've committed it.