llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

s390/clang: relocation error : when there is a weak undefined symbol + -fno-PIE and -munaligned-symbols #83265

Open sumanthkorikkar opened 8 months ago

sumanthkorikkar commented 8 months ago

Problem:

When a symbol is weak and undefined and the section layout places the code above 4 GB, linking can fail with a relocation error:

sample.c:6:(.text+0xc): relocation truncated to fit: R_390_PC32DBL against undefined symbol `kallsyms_markers'

Reproducer:

  1. cat sample.c
     extern const char kallsyms_names[] __attribute__((weak));
     extern const unsigned int kallsyms_markers[] __attribute__((weak));
     const char *data;
     int main(void)
     {
           data = &kallsyms_names[kallsyms_markers[0]];
     }

  2. clang -g -c sample.c -munaligned-symbols -fno-PIE

  3. cat layout.ld
     SECTIONS
     {
           . = 0x200000000;
     }

  4. ld --script=layout.ld -o sample sample.o
     sample.o: in function `main':
     sample.c:6:(.text+0xc): relocation truncated to fit: R_390_PC32DBL against undefined symbol `kallsyms_markers'

Disassembly:

Disassembly of section .text:

0000000000000000 <main>:
extern const char kallsyms_names[] __attribute__((weak));
extern const unsigned int kallsyms_markers[] __attribute__((weak));
const char *data;
int main(void)
{
   0:   eb bf f0 58 00 24       stmg    %r11,%r15,88(%r15)
   6:   b9 04 00 bf             lgr     %r11,%r15
      data = &kallsyms_names[kallsyms_markers[0]];
   a:   c4 0e 00 00 00 00       llgfrl  %r0,a <main+0xa>
                        c: R_390_PC32DBL        kallsyms_markers+0x2   <<<<
  10:   c4 18 00 00 00 00       lgrl    %r1,10 <main+0x10>
                        12: R_390_GOTENT        kallsyms_names+0x2
  16:   b9 08 00 01             agr     %r0,%r1
  1a:   c4 0b 00 00 00 00       stgrl   %r0,1a <main+0x1a>
                        1c: R_390_PC32DBL       data+0x2
  20:   a7 29 00 00             lghi    %r2,0
}
  24:   eb bf b0 58 00 04       lmg     %r11,%r15,88(%r11)
  2a:   07 fe                   br      %r14

When the .text section is placed above 4 GB and kallsyms_markers is weak and undefined, linking can fail with: relocation truncated to fit: R_390_PC32DBL against undefined symbol `kallsyms_markers'

Misc:

GCC doesn't have this problem with -munaligned-symbols; the same example works fine there.

Thank you, Sumanth

sumanthkorikkar commented 8 months ago

cc: @uweigand @JonPsson @JonPsson1

uweigand commented 8 months ago

kallsyms_markers is always 4-byte aligned due to the int data type. Why should -munaligned-symbols make any difference in how this symbol is accessed?

The general assumption for -fno-PIE code is that 0 is always in reach, specifically so that a weak undefined symbol can be resolved to zero. This means that -fno-PIE code always must be mapped below 4 GB. That's always been a requirement ...

sumanthkorikkar commented 8 months ago

Hi Ulrich,

eg: This approach allows using gdb with just file vmlinux.

uweigand commented 8 months ago

Again, it always has been a requirement to use a base address of < 4GB for position-dependent code. If that happens to work with some GCC versions / flags, it's just an accident. If a base address of > 4GB is required, we need a proper solution for this.

IMO the simplest approach would be to just compile as position-independent code then, which will force the compiler to insert the necessary GOT indirections. (You can still link as position-dependent, which should eliminate the now-unnecessary GOT slots and relocations.) If this doesn't work or is suboptimal, maybe we can investigate this in more detail.

Another option might be to use a different memory model (either -mcmodel=large, or maybe -mcmodel=kernel) - but this might have to be implemented and/or fixed in the compilers first, since it hasn't really been used so far ...