Closed dfef8e66-2662-460a-96b9-d5b30d268030 closed 5 years ago
Committed revision r347854 to give an error message if _GLOBAL_OFFSET_TABLE_ is redefined by user code.
Submitted https://reviews.llvm.org/D54624 for review to make redefining _GLOBAL_OFFSET_TABLE_ an error.
I feel reporting a user-defined _GLOBAL_OFFSET_TABLE_ as a duplicate symbol is better because if a program defines the symbol, it is probably not intended and is likely just a programmer's mistake. But if just redefining _GLOBAL_OFFSET_TABLE_ is easier, that's fine too.
The following example shows the difference between GNU ld and LLD:
.syntax unified
.text
.Lsym: ldr r3, =_start(got)
bx lr
.word _GLOBAL_OFFSET_TABLE_ - (.Lsym+8)
.globl _start
.type _start, %function
_start: bx lr
.bss
.global _GLOBAL_OFFSET_TABLE_
.type _GLOBAL_OFFSET_TABLE_, %object
_GLOBAL_OFFSET_TABLE_: .space 4
Assembling this with llvm-mc got.s -triple=armv7a-linux-gnueabihf -filetype=obj -o got.o happens to generate:
Relocation section '.rel.text' at offset 0xb8 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000008  00000503 R_ARM_REL32       00000000   _GLOBAL_OFFSET_TABLE_
00000010  0000061a R_ARM_GOT_BREL    0000000c   _start
When linked with lld got.o -o got.axf -static --print-map we get:
     VMA      LMA     Size Align Out     In      Symbol
   11000    11000       14     4 .text
   11000    11000       14     4         got.o:(.text)
   11000    11000        0     1                 $a.0
   11008    11008        0     1                 $d.1
   1100c    1100c        0     1                 $a.2
   1100c    1100c        0     1                 _start
   11010    11010        0     1                 $d.3
   12000    12000        4     4 .got
   12000    12000        4     4
Note that _GLOBAL_OFFSET_TABLE_ is defined in the .bss section.
With ld.bfd:
...
.got            0x0000000000020088       0x10
 *(.got.plt)
 .got.plt       0x0000000000020088        0xc got.o
                0x0000000000020088                _GLOBAL_OFFSET_TABLE_
 *(.igot.plt)
 .igot.plt      0x0000000000020094        0x0 got.o
 *(.got)
 .got           0x0000000000020094        0x4 got.o
 *(.igot)
.data           0x0000000000020098        0x0
                0x0000000000020098                PROVIDE (__data_start, .)
 *(.data .data.* .gnu.linkonce.d.*)
.data1
 *(.data1)
                0x0000000000020098                _edata = .
                0x0000000000020098                PROVIDE (edata, .)
                0x0000000000020098                . = .
                0x0000000000020098                __bss_start__ = .
                0x0000000000020098                __bss_start = .
.bss            0x0000000000020098        0x4
...
Note that _GLOBAL_OFFSET_TABLE_ has been redefined to be the start of the .got section.
On x86 at least, ld.gold reports a duplicate symbol error.
It looks like we should either follow ld.gold and give an error message if the user program defines _GLOBAL_OFFSET_TABLE_, which is better than a broken program, or follow ld.bfd and just redefine it to the base of the .got.
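The two options can be sketched as a toy model. This is only an illustration, not code from LLD, ld.gold or ld.bfd: the function `define_got_symbol`, the exception `DuplicateSymbolError`, the policy names and the dictionary-based symbol table are all hypothetical.

```python
RESERVED = "_GLOBAL_OFFSET_TABLE_"


class DuplicateSymbolError(Exception):
    """Raised when user code defines the linker-reserved symbol (hypothetical)."""


def define_got_symbol(symtab, got_base, policy):
    """Give RESERVED its linker-defined value in a toy symbol table.

    symtab   -- dict mapping symbol name to address (user definitions)
    got_base -- address of the start of the .got
    policy   -- "error"    : reject a user definition (ld.gold-like, as described)
                "redefine" : silently override it (ld.bfd-like, as described)
    """
    if policy not in ("error", "redefine"):
        raise ValueError(f"unknown policy: {policy}")
    if RESERVED in symtab and policy == "error":
        raise DuplicateSymbolError(f"duplicate symbol: {RESERVED}")
    # Either there was no user definition, or the policy says to override it.
    result = dict(symtab)
    result[RESERVED] = got_base
    return result


# Hypothetical user definition living in .bss at 0x13000, .got at 0x12000.
user_symbols = {RESERVED: 0x13000, "_start": 0x1100C}
redefined = define_got_symbol(user_symbols, 0x12000, "redefine")
```

With the "redefine" policy the user's .bss definition is replaced by the .got base; with the "error" policy the same input is rejected instead of producing a broken program.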
Can I ask why there is a definition of _GLOBAL_OFFSET_TABLE_ in .bss._GLOBAL_OFFSET_TABLE_?
I did find a difference between LLVM MC and GNU as some time ago: with LLVM MC, .word _GLOBAL_OFFSET_TABLE_ - (.Lsym+8) emits R_ARM_REL32 against _GLOBAL_OFFSET_TABLE_, whereas GNU as emits R_ARM_BASE_PREL, which ignores the actual value of _GLOBAL_OFFSET_TABLE_ and just uses the base of the .got section.
I put up a fix at https://reviews.llvm.org/D46319 but couldn't find a reviewer. Perhaps it is worth another go.
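To make the difference between the two relocations concrete, here is a small sketch of the values each would produce. The formulas are the ones I understand the ARM ELF ABI to give (R_ARM_REL32 as ((S + A) | T) - P, R_ARM_BASE_PREL as B(S) + A - P). The place 0x11008 and the .got base 0x12000 come from the LLD map in this thread; the .bss address 0x13000 and the zero addend are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def r_arm_rel32(s, a, p, t=0):
    """((S + A) | T) - P, truncated to 32 bits. T is 0 for a data symbol."""
    return (((s + a) | t) - p) & MASK32


def r_arm_base_prel(base, a, p):
    """B(S) + A - P, where the base is taken to be the start of the .got."""
    return (base + a - p) & MASK32


PLACE = 0x11008        # address of the .word in .text (from the LLD map)
GOT_BASE = 0x12000     # start of .got (from the LLD map)
USER_SYMBOL = 0x13000  # assumed address of the user's .bss definition
ADDEND = 0             # assumed in-place addend

# LLVM MC: the result follows whatever _GLOBAL_OFFSET_TABLE_ resolves to.
rel32_value = r_arm_rel32(USER_SYMBOL, ADDEND, PLACE)

# GNU as: the result is relative to the .got base, whatever the symbol's value.
base_prel_value = r_arm_base_prel(GOT_BASE, ADDEND, PLACE)

print(hex(rel32_value), hex(base_prel_value))
```

Adding the place back to `base_prel_value` recovers the .got base, while `rel32_value` recovers the .bss address, which is why a stray user definition only matters for the R_ARM_REL32 form.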
Best guess so far is that this is related to the use of _GLOBAL_OFFSET_TABLE_ in .text._start, and a surprising definition of it in lto.tmp:(.bss._GLOBAL_OFFSET_TABLE_)
Disassembly of section .text._start:
00000000 <_start>:
0: e3a0b000 mov fp, #0
4: e3a0e000 mov lr, #0
8: e49d1004 pop {r1} ; (ldr r1, [sp], #4)
c: e1a0200d mov r2, sp
10: e1a0b000 mov fp, r0
14: e59f4038 ldr r4, [pc, #56] ; 54 <.l4>
00000018 <.l4a>:
18: e08f4004 add r4, pc, r4
1c: e59f0024 ldr r0, [pc, #36] ; 48 <.l1>
20: e7900004 ldr r0, [r0, r4]
24: e59f3020 ldr r3, [pc, #32] ; 4c <.l2>
28: e7933004 ldr r3, [r3, r4]
2c: e59fc01c ldr ip, [pc, #28] ; 50 <.l3>
30: e79cc004 ldr ip, [ip, r4]
34: e52d2004 push {r2} ; (str r2, [sp, #-4]!)
38: e52db004 push {fp} ; (str fp, [sp, #-4]!)
3c: e52dc004 push {ip} ; (str ip, [sp, #-4]!)
40: ebfffffe bl 0 <__libc_start_main>
40: R_ARM_CALL __libc_start_main
44: ebfffffe bl 0
00000048 <.l1>:
48: 00000000 .word 0x00000000
48: R_ARM_GOT32 __elements_entry_point_helper
0000004c <.l2>:
4c: 00000000 .word 0x00000000
4c: R_ARM_GOT32 __elements_init
00000050 <.l3>:
50: 00000000 .word 0x00000000
50: R_ARM_GOT32 __elements_fini
00000054 <.l4>:
54: 00000034 .word 0x00000034
54: R_ARM_REL32 _GLOBAL_OFFSET_TABLE_
58: e12fff1e bx lr
Where R_ARM_GOT32 is the same code as R_ARM_GOT_BREL in the ABI http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf
The resolution of the relocation is defined to be GOT(S) + A - GOT_ORG, where GOT(S) is the address of the GOT entry for the symbol S, A is the addend, and GOT_ORG is the addressing origin of the Global Offset Table.
So far so good. The linker is expected to define _GLOBAL_OFFSET_TABLE_ to be the base of the .got section. Unfortunately, there already seems to be a _GLOBAL_OFFSET_TABLE_ defined in .bss._GLOBAL_OFFSET_TABLE_. We still get a .got section created to resolve the R_ARM_GOT_BREL/R_ARM_GOT32 relocations, which at a cursory glance look like they have been resolved sensibly. However, because _GLOBAL_OFFSET_TABLE_ resolves to the definition in .bss._GLOBAL_OFFSET_TABLE_, adding the offsets from the base of the .got to _GLOBAL_OFFSET_TABLE_ gives addresses that point somewhere random in the .bss section.
I haven't yet worked out why this works in GNU ld, as the example's use of LTO makes this a bit more complicated. One possibility is that GOT_ORG is resolved to be _GLOBAL_OFFSET_TABLE_ regardless of where the .got is created. This could work, although the offsets from the base of the .got might be at risk of going out of range.
Note that the use of _GLOBAL_OFFSET_TABLE_ is a poorly documented linker/OS convention and is not defined in ELF, so it is difficult to say whether this is an unsupported use of _GLOBAL_OFFSET_TABLE_ or a bug in our emulation of GNU ld's behaviour.
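The arithmetic behind this failure can be sketched as follows. It models only the address computation done by the `ldr r0, [r0, r4]` style loads in `_start`, where r4 holds the run-time value of _GLOBAL_OFFSET_TABLE_ and the other register holds the resolved R_ARM_GOT_BREL word. The helper names and all three addresses are assumptions chosen for illustration, not values taken from the failing binary.

```python
MASK32 = 0xFFFFFFFF


def got_brel(got_entry, addend, got_org):
    """R_ARM_GOT_BREL / R_ARM_GOT32: GOT(S) + A - GOT_ORG, truncated to 32 bits."""
    return (got_entry + addend - got_org) & MASK32


def address_loaded_by_start(got_symbol, got_org, got_entry):
    """Address read by `ldr rX, [rX, r4]` in _start.

    got_symbol -- what _GLOBAL_OFFSET_TABLE_ resolves to (ends up in r4)
    got_org    -- the origin the linker used when resolving R_ARM_GOT_BREL
    got_entry  -- address of the GOT slot for the wanted symbol
    """
    offset = got_brel(got_entry, 0, got_org)
    return (got_symbol + offset) & MASK32


GOT_ORG = 0x12000     # assumed base of .got
GOT_ENTRY = 0x12004   # assumed GOT slot for one of the three symbols
BSS_SYMBOL = 0x13000  # assumed address of .bss._GLOBAL_OFFSET_TABLE_

# Expected case: the symbol is the .got base, so the load hits the GOT slot.
good = address_loaded_by_start(GOT_ORG, GOT_ORG, GOT_ENTRY)

# Failing case: the symbol is the user's .bss definition, so the load lands
# the same distance past that .bss address instead.
bad = address_loaded_by_start(BSS_SYMBOL, GOT_ORG, GOT_ENTRY)

print(hex(good), hex(bad))
```

In the failing case the loaded word is whatever happens to sit in .bss at that offset, which would be consistent with the bad pointers later seen in __libc_start_main.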
It is worth mentioning that the example uses LTO so I've used -save-temps to obtain the ELF object file. I will try and come up with a simpler example that can be uploaded here.
I can confirm that this fails in qemu during startup, in __libc_start_main:
IN: __libc_start_main
0x0006a510: ee1d 3f70 mrc 15, 0, r3, cr13, cr0, {3}
0x0006a514: f5a3 6398 sub.w r3, r3, #1216 ; 0x4c0
0x0006a518: f649 028c movw r2, #39052 ; 0x988c
0x0006a51c: f2c0 0212 movt r2, #18 ; 0x12
0x0006a520: f8d3 6080 ldr.w r6, [r3, #128]
0x0006a524: ac06 add r4, sp, #24
0x0006a526: 6fdd ldr r5, [r3, #124]
0x0006a528: f8c3 4080 str.w r4, [r3, #128]
0x0006a52c: 9902 ldr r1, [sp, #8]
0x0006a52e: 9801 ldr r0, [sp, #4]
0x0006a530: 6812 ldr r2, [r2, #0]
0x0006a532: 9b03 ldr r3, [sp, #12]
0x0006a534: 9648 str r6, [sp, #288]
0x0006a536: 9549 str r5, [sp, #292]
0x0006a538: 4798 blx r3
Linking TBs 0x55d7db548f80 [0006a50e] index 1 -> 0x55d7db548ff0 [0006a510]
Trace 0x55d7db548ff0 [0: 0006a510] __libc_start_main
R00=00000000 R01=00000000 R02=001fb8d7 R03=0012a030
R04=00013de4 R05=00000000 R06=0010e8bc R07=00013de4
R08=0012a030 R09=00000000 R10=00109000 R11=00000000
R12=f6ffeb80 R13=f6ffeb00 R14=0006a50f R15=0006a510
PSR=40000030 -Z-- T usr32
Will need to investigate further to see why this is occurring.
Extended Description
Trying to use lld to create a -Bstatic executable for Linux on a Raspberry Pi (armv7) creates an invalid executable that crashes at run time and fails to open in gdb.
The commandline was lld -flavor gnu "ConsoleApplication907.a" "ConsoleApplication907.o" "Island.a" "libgc.a" libgcc.a libgcc_eh.a libpthread.a librt.a libc.a --eh-frame-hdr -Bstatic -o ConsoleApplication907
If I compile Island.a from bitcode to obj and use gnu ld it does work and run:
ld -( Island.a "libgc.a" libgcc.a libgcc_eh.a libpthread.a librt.a libc.a -) ConsoleApplication907.o ConsoleApplication907.a Island.a --eh-frame-hdr -Bstatic -o ConsoleApplication907
Neither linux nor anything else gives an indication of what is wrong, but ld has no problem statically linking libc. The qemu linux emulator seems to jump (blx) to _dl_hwcap which is a BSS symbol.
LLD repro file: https://1drv.ms/u/s!Au2nm7P_hgmawnGfF6vCLE7JGT21 (too big to attach)