rust-embedded / riscv

Low level access to RISC-V processors

`riscv-rt`: Broken eh_frame relocations on QEMU #196

Open dreiss opened 5 months ago

dreiss commented 5 months ago

Demo repo: https://github.com/dreiss/panic_repro . It's a fairly simple riscv-rt example targeting riscv64 on QEMU.

On "rustc 1.78.0-nightly (a84bb95a1 2024-02-13)" or later, the link fails with out-of-range relocations in eh_frame records:

  = note: rust-lld: error: <internal>:(.eh_frame+0x1c): relocation R_RISCV_32_PCREL out of range: 2147489994 is not in [-2147483648, 2147483647]; references ''
          >>> defined in /home/dreiss/.rustup/toolchains/nightly-2024-02-14-x86_64-unknown-linux-gnu/lib/rustlib/riscv64gc-unknown-none-elf/lib/libcore-2678f83f395a7a3f.rlib(core-2678f83f395a7a3f.core.c4f53ede227ad4d2-cgu.0.rcgu.o)

          rust-lld: error: <internal>:(.eh_frame+0x30): relocation R_RISCV_32_PCREL out of range: 2147489980 is not in [-2147483648, 2147483647]; references ''
          >>> defined in /home/dreiss/.rustup/toolchains/nightly-2024-02-14-x86_64-unknown-linux-gnu/lib/rustlib/riscv64gc-unknown-none-elf/lib/libcore-2678f83f395a7a3f.rlib(core-2678f83f395a7a3f.core.c4f53ede227ad4d2-cgu.0.rcgu.o)

...
          rust-lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)

On "rustc 1.78.0-nightly (b381d3ab2 2024-02-12)" or older, the link succeeds, and debug_frame records for functions in my code are fine, but eh_frame records for functions in libcore point to addresses around 0xffffffff8000ABCD (confirmed with objdump -WFL and llvm-readelf --unwind). This prevents gdb from backtracing up through libcore functions in some cases.

I think the problem is that riscv-rt's linker script marks the eh_frame section as "(INFO)", which makes it non-allocatable and puts it at a logical address of 0. That means any references from it to the code (of which there are, of course, many) are (potentially) longer than 2GB, which is not allowed. The new rustc correctly fails. The old one (which also uses an older llvm) messes up the relocation due to overflow.
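
To make the arithmetic concrete, here is a small sketch that plugs the two values from the linker errors above into a signed 32-bit range check. The helper names are hypothetical, and the truncate-then-sign-extend step is only an assumption about how the older toolchain's overflow could produce the addresses seen in the unwind tables; it is not taken from the lld source.

```rust
/// Returns true if `value` fits in the signed 32-bit field that
/// R_RISCV_32_PCREL has to encode. (Hypothetical helper name.)
fn fits_in_i32(value: i64) -> bool {
    i32::try_from(value).is_ok()
}

/// Assumed overflow behavior: keep only the low 32 bits, then
/// sign-extend them back to 64 bits. (Hypothetical helper name.)
fn truncate_and_sign_extend(value: i64) -> u64 {
    (value as i32) as i64 as u64
}

fn main() {
    // The two relocation values reported by rust-lld in this issue.
    for offset in [2147489994_i64, 2147489980_i64] {
        println!(
            "{:#x}: fits in i32 = {}, wrapped = {:#x}",
            offset,
            fits_in_i32(offset),
            truncate_and_sign_extend(offset)
        );
    }
}
```

Both values are just above 2^31 (0x800018ca and 0x800018bc), so neither fits, and under the assumed wrapping they become 0xffffffff800018ca and 0xffffffff800018bc, which has the same shape as the bogus libcore addresses reported above.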

Not marking eh_frame as "(INFO)" should fix this, but then it will be included in the output binary, which is not ideal for small targets. If unwinding data is not going to be used at runtime, it's probably best to use debug_frame instead of eh_frame, but that cannot be controlled for libcore unless it's being built from source.

romancardenas commented 5 months ago

Thanks! I'll take a look ASAP. In the meantime, could you confirm that this issue persists with the master branch of this repo?

jasonwhite commented 5 months ago

I can confirm that this is still a problem on the master branch (currently 27c4faf40da4b1c47244b9504442a5fe12827f39).

jasonwhite commented 5 months ago

This patch fixes the link error (and gdb backtraces), but seems like the wrong approach:

diff --git a/riscv-rt/link.x.in b/riscv-rt/link.x.in
index 720d571..1116eab 100644
--- a/riscv-rt/link.x.in
+++ b/riscv-rt/link.x.in
@@ -91,6 +91,9 @@ SECTIONS
     *(.text .text.*);
   } > REGION_TEXT

+  .eh_frame : { KEEP(*(.eh_frame)) } > REGION_TEXT
+  .eh_frame_hdr : { *(.eh_frame_hdr) } > REGION_TEXT
+
   .rodata : ALIGN(4)
   {
     *(.srodata .srodata.*);
@@ -147,9 +150,6 @@ SECTIONS
   {
     KEEP(*(.got .got.*));
   }
-
-  .eh_frame (INFO) : { KEEP(*(.eh_frame)) }
-  .eh_frame_hdr (INFO) : { *(.eh_frame_hdr) }
 }

 /* Do not exceed this mark in the error messages above

rmsyn commented 3 weeks ago

> This patch fixes the link error (and gdb backtraces)

Just confirmed that this fix works, and the problem still exists on hardware + QEMU.

> seems like the wrong approach

Why does it seem wrong to you?