Closed marxin closed 2 months ago
Oh, the strings can have variable length, but a subsequent string always starts at the declared alignment offset:
CMakeFiles/clang.dir/driver.cpp.o: file format elf64-x86-64
Contents of section .rodata._ZL14ExecuteCC1ToolRN4llvm15SmallVectorImplIPKcEERKNS_11ToolContextE.str1.8:
0000 6572726f 723a2075 6e6b6e6f 776e2069 error: unknown i
0010 6e746567 72617465 6420746f 6f6c2027 ntegrated tool '
0020 00000000 00000000 56616c69 6420746f ........Valid to
0030 6f6c7320 696e636c 75646520 272d6363 ols include '-cc
0040 31272c20 272d6363 31617327 20616e64 1', '-cc1as' and
0050 20272d63 63316765 6e2d7265 70726f64 '-cc1gen-reprod
0060 75636572 272e0a00 ucer'...
which means we would have to introduce MergeStringsSection
for each alignment for each output section. That sounds similar to part_id_with_alignment
and UnresolvedMergeStringsFileSection::new
should take into account the alignment and append multiple '\0' characters (if necessary). Then the merging should work as now and it might work :)
That sounds similar to
part_id_with_alignment
andUnresolvedMergeStringsFileSection::new
should take into account the alignment and append multiple '\0' characters (if necessary). Then the merging should work as now and it might work :)
Sounds like you've got things mostly figured out. I suspect you might also need to change MergeStringsSection::add_string
. It's what allocates addresses to each string that gets added to the section. So it would need to take alignment into account.
Yep, that's yet another place that needs to be changed.
To be honest, we're speaking about a small saving we can gain, so I tend to close this PR for now.
The most important section from the string merging perspective is definitely .debug_str
which has a proper alignment we deal with (10.
This is probably a naive implementation, but still something we can discuss. I noticed the
clang
binary contains a string sectionWild
can't deal with right now:0.7% 1.51Mi 0.9% 1.51Mi .rodata.str1.8
. As the name suggests, the section's alignment is equal to 8. My patch causes crashes for various binaries.Why is the non-one alignment used for strings as the strings have variadic lengths and so you don't have any guarantees about an alignment of a reference string in such a section? Or am I wrong?