Closed davidlattimore closed 2 days ago
I would like to try implementing this feature ;)
Debug info is made of a collection of sections that are interlinked and concatenated:
The format heavily depends on relocations:
$ gcc ~/Programming/main.c -g -c
$ readelf -r main.o
Relocation section '.rela.debug_info' at offset 0x3c0 contains 8 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000008 00040000000a R_X86_64_32 0000000000000000 .debug_abbrev + 0
00000000000d 00060000000a R_X86_64_32 0000000000000000 .debug_str + 5
000000000012 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 5
000000000016 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 0
00000000001a 000200000001 R_X86_64_64 0000000000000000 .text + 0
00000000002a 00050000000a R_X86_64_32 0000000000000000 .debug_line + 0
00000000002f 00060000000a R_X86_64_32 0000000000000000 .debug_str + 0
00000000003a 000200000001 R_X86_64_64 0000000000000000 .text + 0
Relocation section '.rela.debug_aranges' at offset 0x480 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 00030000000a R_X86_64_32 0000000000000000 .debug_info + 0
000000000010 000200000001 R_X86_64_64 0000000000000000 .text + 0
Relocation section '.rela.debug_line' at offset 0x4b0 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000022 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 25
000000000026 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 2a
000000000030 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 43
000000000035 00070000000a R_X86_64_32 0000000000000000 .debug_line_str + 4a
00000000003f 000200000001 R_X86_64_64 0000000000000000 .text + 0
Relocation section '.rela.eh_frame' at offset 0x528 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000020 000200000002 R_X86_64_PC32 0000000000000000 .text + 0
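To make the listing above concrete, here is a minimal sketch of what resolving these entries amounts to: for both `R_X86_64_32` and `R_X86_64_64` the linker writes the symbol value plus the addend at the given offset, in 4 or 8 little-endian bytes respectively. The names `RelKind`, `Rela` and `apply_rela` are made up for illustration and are not taken from the linker's code.

```rust
use std::convert::TryFrom;

/// The two relocation kinds that appear in the `.rela.debug_*` listings.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RelKind {
    /// R_X86_64_32: 32-bit absolute value.
    Abs32,
    /// R_X86_64_64: 64-bit absolute value.
    Abs64,
}

/// One RELA entry, reduced to the fields this sketch needs (hypothetical type).
pub struct Rela {
    pub offset: usize,
    pub kind: RelKind,
    pub addend: i64,
}

/// Writes `symbol_value + addend` into `section` at `rela.offset`, little-endian.
pub fn apply_rela(section: &mut [u8], rela: &Rela, symbol_value: u64) -> Result<(), String> {
    let value = symbol_value.wrapping_add(rela.addend as u64);
    match rela.kind {
        RelKind::Abs32 => {
            // The value must fit in 32 bits, otherwise the relocation overflows.
            let v = u32::try_from(value)
                .map_err(|_| format!("value {:#x} does not fit in 32 bits", value))?;
            let dst = section
                .get_mut(rela.offset..rela.offset + 4)
                .ok_or("relocation out of bounds")?;
            dst.copy_from_slice(&v.to_le_bytes());
        }
        RelKind::Abs64 => {
            let dst = section
                .get_mut(rela.offset..rela.offset + 8)
                .ok_or("relocation out of bounds")?;
            dst.copy_from_slice(&value.to_le_bytes());
        }
    }
    Ok(())
}
```

For the `.debug_str + 5` entry, `symbol_value` would be wherever this object's `.debug_str` contribution ends up in the output section, so the stored value is an offset into the merged output rather than a memory address.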
The implementation should include the following steps:
objcopy --decompress-debug-sections
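3. GC of the sections should not be inhibited by a debug-info section and the corresponding relocation of a debug-info section should use a tombstone value (0 or 1, depending on the section name). Can you help me integrate that into the GC algorithm?
4. Should we list all the supported debug-info section names (~20 sections for DWARF 4 and 5) (OutputSectionId::regular)? Or do we want to use a different mechanism?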
Great! Thanks!
3. GC of the sections should not be inhibited by a debug-info section and the corresponding relocation of a debug-info section should use a tombstone value (0 or 1, depending on the section name). Can you help me integrate that into the GC algorithm?
Do you know if we need to split the sections then GC the parts of the sections that are for functions that get GCed? I had to do that with the .eh_frame support and it certainly added some complexity.
If you don't need to split the sections up and you also don't want relocations in those sections to prevent other things from being GCed, then you can possibly skip reading the relocations for those sections during the layout phase, e.g. Section::create could skip the bit where it iterates over relocations. That also assumes that the relocations don't need any allocations, i.e. that none of the relocations need to be turned into runtime relocations.
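A rough sketch of the tombstone idea from item 3, under some assumptions: a relocation in a debug section whose target was GCed gets a fixed value instead of an address. Which sections get 1 rather than 0 is an assumption here; `.debug_loc` and `.debug_ranges` are picked because a pair of zeros terminates a list in those sections. `tombstone_for` and `resolve_debug_reloc` are hypothetical names, not existing functions.

```rust
/// Returns the value to store for a relocation in `section_name` whose target
/// section was discarded, or `None` if the section is not a debug section.
pub fn tombstone_for(section_name: &str) -> Option<u64> {
    if !section_name.starts_with(".debug_") {
        return None;
    }
    match section_name {
        // Assumption: 0 would read as an end-of-list marker in these two
        // sections, so a different value is used.
        ".debug_loc" | ".debug_ranges" => Some(1),
        _ => Some(0),
    }
}

/// Resolves one relocation. `target_address` is `None` when the section the
/// relocation refers to was garbage collected.
pub fn resolve_debug_reloc(
    section_name: &str,
    target_address: Option<u64>,
    addend: i64,
) -> Option<u64> {
    match target_address {
        // Live target: the usual symbol value plus addend.
        Some(address) => Some(address.wrapping_add(addend as u64)),
        // Dead target: fall back to the tombstone, ignoring the addend.
        None => tombstone_for(section_name),
    }
}
```

With this shape the GC pass never has to look at debug relocations; the tombstone is only chosen later, when the relocations are actually written.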
4. Should we list all the supported debug-info section names (~20 sections for DWARF 4 and 5) (OutputSectionId::regular)? Or do we want to use a different mechanism?
If you need to refer to the sections from code - e.g. if you need special logic for a particular section, then yes. Making the sections "regular" sections means that they'll be split by alignment. I guess if a particular section needs to only ever have one particular alignment, then we could use a "generated" section - although it'd make the name "generated" slightly less appropriate.
If you don't need special logic for debug sections, or if the only special logic is that you want to not process the relocations during the layout (GC) phase, then you could add a field on SectionDetails - like is_debug_info.
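Something along these lines is what the `is_debug_info` suggestion could look like. This is only a sketch: the real `SectionDetails` has a different shape, and the `name` field, the two methods and `relocations_visited` are invented for illustration.

```rust
/// Hypothetical stand-in for the linker's per-section metadata.
pub struct SectionDetails {
    pub name: String,
    /// Set for `.debug_*` sections; their relocations are ignored during layout.
    pub is_debug_info: bool,
}

impl SectionDetails {
    /// Builds the details from a section name, flagging debug sections by prefix.
    pub fn from_name(name: &str) -> Self {
        SectionDetails {
            name: name.to_string(),
            is_debug_info: name.starts_with(".debug_"),
        }
    }

    /// Whether the layout (GC) phase should iterate over this section's relocations.
    pub fn scan_relocations_during_layout(&self) -> bool {
        !self.is_debug_info
    }
}

/// Counts how many relocations the layout phase would visit, given each
/// section together with its number of relocation entries.
pub fn relocations_visited(sections: &[(SectionDetails, usize)]) -> usize {
    sections
        .iter()
        .filter(|(details, _)| details.scan_relocations_during_layout())
        .map(|(_, count)| *count)
        .sum()
}
```

Applied to the `main.o` example at the top, only the single `.eh_frame` relocation would be visited during layout; the 15 relocations in `.debug_info`, `.debug_aranges` and `.debug_line` would be skipped until the write phase.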
Do you know if we need to split the sections then GC the parts of the sections that are for functions that get GCed? I had to do that with the .eh_frame support and it certainly added some complexity.
No, for debug info sections we'll only need to resolve the relocations (with the exception of .debug_str and .debug_line_str, which can be candidates for string merging).
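For reference, string merging for something like `.debug_str` could be sketched as below. It assumes the section is a plain sequence of NUL-terminated strings and that relocation addends (e.g. `.debug_str + 5`) always point at the start of a string. `StringMerger` is a hypothetical type, not an existing one.

```rust
use std::collections::HashMap;

/// Deduplicates NUL-terminated strings across input sections.
pub struct StringMerger {
    out: Vec<u8>,
    offsets: HashMap<Vec<u8>, u32>,
}

impl StringMerger {
    pub fn new() -> Self {
        StringMerger { out: Vec::new(), offsets: HashMap::new() }
    }

    /// Adds every string from `input` and returns a map from each string's
    /// offset in the input section to its offset in the merged output.
    pub fn add_section(&mut self, input: &[u8]) -> HashMap<u32, u32> {
        let mut remap = HashMap::new();
        let mut in_off = 0u32;
        // Each piece includes its terminating NUL byte.
        for piece in input.split_inclusive(|&b| b == 0) {
            let out_off = match self.offsets.get(piece).copied() {
                Some(existing) => existing,
                None => {
                    let new_off = self.out.len() as u32;
                    self.out.extend_from_slice(piece);
                    self.offsets.insert(piece.to_vec(), new_off);
                    new_off
                }
            };
            remap.insert(in_off, out_off);
            in_off += piece.len() as u32;
        }
        remap
    }

    /// The merged section contents.
    pub fn output(&self) -> &[u8] {
        &self.out
    }
}
```

A relocation against `.debug_str + addend` would then be resolved by looking the addend up in the returned map and adding the output section's start, rather than by adding a single per-file base offset.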
That also assumes that the relocations don't need any allocations - i.e. that none of the relocations need to be turned into runtime relocations.
Yep, that's how I understand it, as debug info is never read by a dynamic linker. It's a consumer such as gdb or valgrind that loads the debug info from a (potentially different) file.
If you don't need special logic for debug sections, or if the only special logic is that you want to not process the relocations during the layout (GC) phase, then you could add a field on SectionDetails - like is_debug_info.
Ok, I'm going to start with the suggested approach. Thanks.
Please close this as implemented.
I'm not 100% sure what's involved here. I gather that eh_frame info, which we already support, is somewhat related to, or a more limited form of dwarf debug info.