llvm / llvm-project

[ARM] `BL $ADDR` becomes `BLX $ADDR` after linking #37783

Closed llvmbot closed 4 years ago

llvmbot commented 6 years ago
Bugzilla Link 38435
Resolution FIXED
Resolved on Mar 22, 2020 22:20
Version unspecified
OS Linux
Attachments Output of --reproduce
Reporter LLVM Bugzilla Contributor
CC @efriedma-quic,@MaskRay,@smithp35

Extended Description

These are the input object files. Observe the instruction at <main+0x2>: bl <__nop>.

$ arm-none-eabi-objdump -Cd app.o | head -n11

app.o:     file format elf32-littlearm

Disassembly of section .text.main:

00000000 <main>:
  0:   be00            bkpt    0x0000
  2:   f7ff fffe       bl      0 <__nop>
  6:   be00            bkpt    0x0000
  8:   e7fd            b.n     6 <main+0x6>

$ arm-none-eabi-objdump -Cd libasm.a
In archive libasm.a:

asm.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <__nop>:
  0:   4770            bx      lr

When linking with LLD, <main+0x2> becomes blx 8000050. This instruction causes a HardFault (SIGILL-like) exception when executed. (AIUI the argument of BLX should be a register, not an address.)

$ lld -flavor gnu app.o -o app --gc-sections -L . -Bstatic --whole-archive -lasm --no-whole-archive -Tlink.x -Bdynamic --reproduce repro.tar

$ arm-none-eabi-objdump -Cd app | head -n11

app:     file format elf32-littlearm

Disassembly of section .text:

08000040 <main>:
8000040:       be00            bkpt    0x0000
8000042:       f000 e806       blx     8000050 <DefaultExceptionHandler>
8000046:       be00            bkpt    0x0000
8000048:       e7fd            b.n     8000046 <main+0x6>

When linking with GNU LD, <main+0x2> becomes bl 8000052. This program executes without raising an exception.

$ arm-none-eabi-ld app.o -o app --gc-sections -L . -Bstatic --whole-archive -lasm --no-whole-archive -Tlink.x -Bdynamic

$ arm-none-eabi-objdump -Cd app | head -n11

app:     file format elf32-littlearm

Disassembly of section .text:

08000040 <main>:
8000040:       be00            bkpt    0x0000
8000042:       f000 f806       bl      8000052 <__nop>
8000046:       be00            bkpt    0x0000
8000048:       e7fd            b.n     8000046 <main+0x6>
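
For reference, the two linked encodings (f000 e806 from LLD, f000 f806 from GNU LD) differ only in bit 12 of the second halfword, which selects BL versus the immediate form of BLX. The sketch below decodes both using the Thumb-2 BL/BLX immediate layout from the Arm architecture manual. The function name decode_thumb_bl is hypothetical and the code is only a minimal illustration, not part of either linker.

```python
def decode_thumb_bl(hw1, hw2, addr):
    """Decode a 32-bit Thumb BL/BLX (immediate) located at `addr`.

    Returns (mnemonic, target_address). Illustrative helper only.
    """
    if (hw1 >> 11) != 0b11110 or (hw2 >> 14) != 0b11:
        raise ValueError("not a Thumb-2 BL/BLX immediate encoding")

    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    is_bl = (hw2 >> 12) & 1      # 1 = BL (stay in Thumb), 0 = BLX (switch to ARM)
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF

    # I1 = NOT(J1 XOR S), I2 = NOT(J2 XOR S)
    i1 = 1 - (j1 ^ s)
    i2 = 1 - (j2 ^ s)

    # 25-bit signed offset: S:I1:I2:imm10:imm11:'0'
    imm = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        imm -= 1 << 25

    pc = addr + 4                # PC reads as instruction address + 4 in Thumb state
    if is_bl:
        return "bl", (pc + imm) & 0xFFFFFFFF
    # BLX targets ARM code, so the base is the PC rounded down to 4 bytes.
    return "blx", ((pc & ~3) + imm) & 0xFFFFFFFF


print(decode_thumb_bl(0xF000, 0xE806, 0x08000042))  # LLD output
print(decode_thumb_bl(0xF000, 0xF806, 0x08000042))  # GNU LD output
```

The word-aligned base in the BLX case is why the same offset bits land on 8000050 rather than 8000052. On cores that only implement Thumb state, such as Cortex-M, the immediate form of BLX is not available, which is consistent with the fault seen here.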

Version information:

LLVM: https://github.com/llvm-mirror/llvm/commit/0b5d0cfa8e55ac076285efb25e102597751db49c
LLD: https://github.com/llvm-mirror/lld/commit/bcfc39dfc8e40fd7744828abfb4ae4f9e69dc32b
GNU LD: 2.31

MaskRay commented 4 years ago

link.x requires a small change to work with lld:

/DISCARD/ : {

lld can produce more empty sections than GNU ld. The synthetic section .ARM.exidx needs to be discarded as well, even if it has no composing .ARM.exidx.* input sections.

The original problem has been fixed by Peter's https://reviews.llvm.org/D73542. Unlike GNU ld, lld will issue a warning:

% ld.lld @response.txt
ld.lld: warning: app0-bb017cc3d2efe9d5e6ed62661cfeec84.rs:(function main: .text.main+0x2): branch and link relocation: R_ARM_THM_CALL to non STT_FUNC symbol: __nop interworking not performed; consider using directive '.type __nop, %function' to give symbol type STT_FUNC if interworking between ARM and Thumb is required

llvmbot commented 6 years ago

Thanks for looking into this issue, Peter.

Your .thumb_func suggestion does fix this issue for me.

smithp35 commented 6 years ago

It looks like LLD is being a little too loose in how it interprets bit 0 of a symbol. Strictly speaking, interworking should only be performed on symbols with type STT_FUNC.
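
To make the distinction concrete, here is a rough sketch of the two policies for a BL relocation (R_ARM_THM_CALL) issued from Thumb code. Both function names are hypothetical and this is not LLD's actual code; it only models the "bit 0 alone" behaviour against the "STT_FUNC only" rule, using the standard ELF symbol type constants.

```python
STT_NOTYPE = 0  # ELF symbol type: unspecified
STT_FUNC = 2    # ELF symbol type: function


def loose_thm_call(sym_value):
    """Decide from bit 0 alone: a clear bit is taken to mean an ARM target."""
    return "bl" if sym_value & 1 else "blx"


def strict_thm_call(sym_type, sym_value):
    """Only interwork when the symbol is typed as a function."""
    if sym_type != STT_FUNC:
        # Untyped symbol: bit 0 says nothing about the instruction set,
        # so leave the branch as the compiler or assembler wrote it.
        return "bl"
    return "bl" if sym_value & 1 else "blx"


# __nop as linked in this report: NOTYPE, address 0x08000052, bit 0 clear.
print(loose_thm_call(0x08000052))               # blx
print(strict_thm_call(STT_NOTYPE, 0x08000052))  # bl
# The same symbol once typed as a Thumb function (bit 0 set).
print(strict_thm_call(STT_FUNC, 0x08000053))    # bl
```

Under the strict rule, an untyped symbol never triggers a state change, which matches what GNU LD produced for this input.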

Taking a look at the disassembly and symbols of libasm.a I see:

asm.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <__nop>:
  0:   4770            bx      lr

Symbol table '.symtab' contains 7 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
     2: 00000000     0 SECTION LOCAL  DEFAULT    2
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t
     5: 00000000     0 SECTION LOCAL  DEFAULT    4
     6: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 __nop

An untyped symbol is usually a symptom of a programming error: had the program made a BL to it from ARM state, you would have wanted the linker to use a BLX.

I suggest using .type __nop, %function or .thumb_func __nop:

For the symbol __nop, ideally you would want to see this from readelf --symbols:

     6: 00000001     0 FUNC    GLOBAL DEFAULT    1 __nop

Note that bit 0 has been set to 1 to indicate Thumb and the type of the symbol is FUNC.
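
If it helps to catch this class of problem early, the readelf --symbols output can be scanned for global symbols that are defined in a section but have no type. The helper below is a hypothetical sketch that assumes the usual eight-column readelf row layout; it is not an existing tool.

```python
def untyped_globals(readelf_output):
    """Return names of GLOBAL symbols of type NOTYPE that are defined in a section."""
    found = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: "6: 00000000 0 NOTYPE GLOBAL DEFAULT 1 __nop"
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        _num, _value, _size, sym_type, bind, _vis, ndx, name = fields[:8]
        # A numeric Ndx means the symbol is defined in a section (not UND or ABS).
        if sym_type == "NOTYPE" and bind == "GLOBAL" and ndx.isdigit():
            found.append(name)
    return found


sample = """\
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     4: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t
     6: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 __nop
"""
print(untyped_globals(sample))
```

Local mapping symbols such as $t are deliberately ignored, since they are expected to be untyped.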

I'll see what I can do with LLD's interworking. From memory, the linker only had the symbol value, not the symbol type, available to it when making the decision.