Open patrick-rivos opened 4 weeks ago
CC @MaskRay
A fixup describes a modified section location.
If the assembler decides to generate a relocation for a fixup, the content bits can be left as zero, which is what LLVM does here and what happens for non-internal branches such as call foo on x86.
It seems that the GNU assembler's RISC-V port instead writes the resolved value into the location.
LLVM's choice is simpler and enables better compression. We can use llvm-objdump -dr to dump inline relocations. Therefore, I am not sure we want to change LLVM integrated assembler.
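To make the two conventions concrete, here is a minimal Python sketch, assuming the standard RISC-V B-type immediate layout. The instruction words and the address 0xc are illustrative values picked for this example, not taken from the object files discussed in this issue. It decodes the branch offset the way a disassembler would and shows that a zeroed immediate reads as a branch to the instruction's own address.

```python
def branch_offset(word: int) -> int:
    """Decode the signed byte offset from a 32-bit RISC-V B-type instruction."""
    imm = ((word >> 31) & 0x1) << 12    # inst[31]    -> imm[12]
    imm |= ((word >> 7) & 0x1) << 11    # inst[7]     -> imm[11]
    imm |= ((word >> 25) & 0x3F) << 5   # inst[30:25] -> imm[10:5]
    imm |= ((word >> 8) & 0xF) << 1     # inst[11:8]  -> imm[4:1]
    if imm & 0x1000:                    # sign-extend the 13-bit value
        imm -= 0x2000
    return imm


def branch_target(pc: int, word: int) -> int:
    """Address a disassembler would display as the branch destination."""
    return pc + branch_offset(word)


# Illustrative encodings of "beq a0, a1, <offset>" (hypothetical example values).
ZEROED = 0x00B50063    # immediate left as zero; destination lives in a relocation
RESOLVED = 0x00B50463  # offset +8 written directly into the instruction

pc = 0xC
print(hex(branch_target(pc, ZEROED)))    # same address as the branch itself
print(hex(branch_target(pc, RESOLVED)))  # pc + 8
```

With the zeroed form, the real destination is only visible through the accompanying relocation record, which is why the relocation has to be dumped alongside the disassembly.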
@llvm/issue-subscribers-backend-risc-v
Author: Patrick O'Neill (patrick-rivos)
We can use llvm-objdump -dr to dump inline relocations. Therefore, I am not sure we want to change LLVM integrated assembler.
Thanks for the reply. Just a note that -dr also requires -x if you want to be able to see the branch destination. The label is not displayed inline with the disassembled code.
llvm-objdump -drx map.o | grep "L0"
0000000000000014 l .text 0000000000000000 .L0
000000000000002a l .text 0000000000000000 .L0
000000000000000c: R_RISCV_BRANCH .L0
0000000000000022: R_RISCV_BRANCH .L0
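The manual cross-referencing described above can be sketched in Python. This is a rough helper, assuming the simplified line shapes shown in the output above (symbol-table lines that start with a 16-digit address and end with the symbol name, and relocation lines of the form "offset: R_TYPE symbol"). The function name cross_reference is hypothetical, and real llvm-objdump output has more line variants (for example addends such as symbol+0x4) that this does not handle.

```python
from collections import defaultdict


def is_hex(text: str) -> bool:
    """Return True if text parses as a hexadecimal number."""
    try:
        int(text, 16)
        return True
    except ValueError:
        return False


def cross_reference(dump_text: str):
    """Pair each relocation with the addresses of every symbol of that name."""
    symbols = defaultdict(list)  # symbol name -> list of addresses
    relocs = []                  # (offset, relocation type, symbol name)
    for line in dump_text.splitlines():
        tok = line.split()
        if (len(tok) >= 3 and tok[0].endswith(":")
                and is_hex(tok[0][:-1]) and tok[1].startswith("R_")):
            relocs.append((int(tok[0][:-1], 16), tok[1], tok[2]))
        elif (len(tok) >= 5 and len(tok[0]) in (8, 16)
                and is_hex(tok[0]) and is_hex(tok[-2])):
            symbols[tok[-1]].append(int(tok[0], 16))
    return [(off, typ, name, symbols.get(name, []))
            for off, typ, name in relocs]


SAMPLE = """\
0000000000000014 l .text 0000000000000000 .L0
000000000000002a l .text 0000000000000000 .L0
000000000000000c: R_RISCV_BRANCH .L0
0000000000000022: R_RISCV_BRANCH .L0
"""

for off, typ, name, candidates in cross_reference(SAMPLE):
    print(f"{off:#x}: {typ} {name} -> {[hex(a) for a in candidates]}")
```

Because both local labels are named .L0, the text output alone does not say which one each relocation refers to, so the sketch lists every candidate address rather than guessing.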
non-internal branches like call foo on x86.
In my opinion those are easier to reason about, since the relocation shown by objdump includes the function name and I don't need to cross-reference the headers: https://godbolt.org/z/84jKoqW5o
LLVM's choice is simpler and enables better compression.
From LLVM's side it seems simpler to implement but the user experience of chasing down branch destinations is not simple (at least the way I'm doing it :-) ).
I'm speaking out of ignorance here - maybe I don't understand the magnitude of the savings or how often unlinked object files are compressed - but compression seems like a strange justification for this behavior.
Using the same input assembly file from:
gnu-as assembled program looks normal:
But LLVM's object file looks like this (each branch appears to be an infinite loop):
gnu-objdump and llvm-objdump agree on this output so it's probably not on the objdump side of things.
When actually linked the offsets are updated to the correct addresses (with both lld/ld):
I think the actual destinations in the LLVM object file are recorded in R_RISCV_BRANCH relocations.
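As a rough sketch of what the link step does with such a relocation: the RISC-V psABI defines R_RISCV_BRANCH as S + A - P (symbol address plus addend minus the address of the instruction), written into the B-type immediate. The function names and sample values below are hypothetical, and real linkers such as lld or ld do considerably more than this.

```python
B_IMM_MASK = 0xFE000F80  # inst[31:25] and inst[11:7] hold the B-type immediate


def encode_b_imm(offset: int) -> int:
    """Scatter a signed, even byte offset into the B-type immediate bits."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError(f"offset {offset} not encodable in a B-type branch")
    imm = offset & 0x1FFF                 # 13-bit two's complement
    return ((((imm >> 12) & 0x1) << 31)   # imm[12]   -> inst[31]
            | (((imm >> 5) & 0x3F) << 25) # imm[10:5] -> inst[30:25]
            | (((imm >> 1) & 0xF) << 8)   # imm[4:1]  -> inst[11:8]
            | (((imm >> 11) & 0x1) << 7)) # imm[11]   -> inst[7]


def apply_branch_reloc(word: int, sym_addr: int, addend: int, place: int) -> int:
    """Patch a branch instruction word using the S + A - P calculation."""
    offset = sym_addr + addend - place
    return (word & ~B_IMM_MASK & 0xFFFFFFFF) | encode_b_imm(offset)


# Hypothetical example: a zeroed "beq a0, a1" at 0xc whose target sits at 0x14.
patched = apply_branch_reloc(0x00B50063, sym_addr=0x14, addend=0, place=0xC)
print(f"{patched:#010x}")
```

Until this step runs, the instruction in the object file keeps its zero immediate, which matches the branch-to-self appearance in the unlinked disassembly.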
This behavior makes inspecting unlinked object files unintuitive at first glance.