Closed MaelCravero closed 1 month ago
Thanks for raising this! I think this solution is kinda discussed for a while, and also agreed from many different parties, I guess the only only remaining controversy is the vendor ID: build our own list or use JEDEC manufacturer ID like mvendorid
.
Added this into today's psABI meeting agenda, thanks again :)
Using the addend precludes the use of Elf_Rel, which some platforms (Fuschia) use for dynamic relocations. It needs to be encoded in the symbol instead.
Is this suggesting encoding in the symbol in all cases, or being generic enough that Sym (if Elf_Rel) or Sym+Addend (if Elf_Rela; probably the absolute section symbol) results in the "vendor identifier"? I understand the need to cover the Elf_Rel case if we have users that need that (and forbidding such functionality because you don't have addends in your ELF file makes no sense), but I'm not sure of an immediate strong argument of always adding symbols to hold this information if addends are available
One idea for resolve both issue: 1) Organization don't have JEDEC manufacturer ID, 2) REL type relocation.
RELA: Encoding JEDEC manufacturer ID in addend, and 0 for those organization don't have JEDEC manufacturer ID, relocation symbol will point to a special symbol $v<vendor-name>
if addend is 0.
REL: Relocation symbol will point to a special symbol: $jedec0x<jedecid-in-hex>
or $v<vendor-name>
.
I think it would be more straightforward to only use symbols with the vendor name: this way we have a single way to use this relocation for both REL and RELA without needing to differentiate for vendors with or without a JEDEC ID. Also, if a vendor really wants to use their JEDEC ID, they are free to use it in the symbol name.
I am happy with special/dummy symbol only scheme.
@jrtc27 does it address your concern?
Just cc more linker experts to make sure it's not insane to linker implementation.
Some background:
We have only limited relocation can be used within the RISC-V psABI due to RV32, we already divide few portion of relocation to vendor extension, however we didn't have a mechanism to well maintain the vendor relocation, that cause those relocation never go back to upstream, therefore, some vendor extension start upstream, like Core-V, they have few vendor relocation, and I believe they are not along...SiFive has also defined some vendor relocation within downstream, also IIRC CHERRI also defined few relocation...so having a mechanism defined in psABI would be great.
@rui314 @MaskRay
CHERI is less relevant here; as a custom research extension we have no need to coexist with other vendor extensions, and the standard extension we're working towards now in the CHERI SIG+TG won't be a vendor extension so will coexist just fine.
It seems the fundamental issue here is that the r_type
member is just too small in 32 bits. So, as a workaround, maybe we can require a vendor-specific relocation to be represented as a pair of relocations instead of just a single relocation? This effectively extends the vendor-specific range from 2^6 (64) to 2^14 (16384), allowing vendors to use non-overlapping relocation types.
(This idea was inspired by the concept of UTF-16 surrogates.)
We cannot change the meaning of any of the existing relocations reserved for custom use. The whole point of reserving them for custom use like that is so that others can rely on being able to use them in perpetuity.
Then how about defining [176, 191] as "vendor-specific relocation prefixes"? That allows 16*256=4096 new vendor-specific relocations.
How are you encoding all vendor relocations in 12 bits?
Actually we can just reserve a single relocation number, say, 191, as a vendor-specific prefix and use UTF-8-like encoding scheme by using following relocations to represent any number of relocation types, eliminating the need to use different relocation numbers for RV32 and RV64 for those who need more than 64 relocation types.
Alternative approach would be to define 191 as the first relocation of a vendor-specific relocation pair, whose actual r_type
is encoded in r_addend
. The second relocation would contain actual data for the vendor-specific relocation (except r_type
).
So, my proposal is to resolve the issue that RV32's r_type
being too small, instead of resolving conflicts in the small space by adding vendor tags. It could be achieved by adding a clause like this to the psABI:
"A relocation whose r_type
is greater than 255 is represented using two consecutive relocation records in RV32. The fist relocation has r_type
of 191, and the actual relocation type is encoded in its r_offset
. Other members, namely r_offset
, r_sym
and r_addend
are taken from the second relocation."
And then we can reserve a plenty space (say, 0x8000'0000 to 0xffff'ffff) for vendor specific relocations, allowing vendors to use non-overlapping regions.
How does this sound to you guys?
You need r_offset to point at the right location otherwise every single tool will need to have special casing in what is normally highly generic code.
We can use r_addend
instead.
We can use
r_addend
instead.
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/423#pullrequestreview-1862994510
I think REL disallowing r_addend as an identifier is less relevant since there is no requirement to make REL work.
People may not need it. I added -z rel
to ld.lld and Fuchsia seems to use it, but it is pretty minor size optimization when RELR is utilized.
Do we have ideas how many platform-specific relocations end up as dynamic relocations?
How urgent do people need reserved relocation types in the specification? If my RELLEB proposal https://groups.google.com/g/generic-abi/c/yb0rjw56ORw is adopted, RV32 can use effectively unlimited number of relocation types.
@jrtc27 I agree with MaskRay on Fuchsia. Fuchsia (or anyone) reserves the right to deviate from this psABI, and that's what they are doing for dynamic relocations, but the consequence would also be on them if they choose to do so. We are not obligated to make nonstandard deviations work smoothly with new psABI features. If I were a Fuchsia developer, I'd use r_offset
in the dynamic relocation table instead of r_addend
(which shouldn't be hard), so I don't think that's going to be a problem, but how to implement it is up to them.
@MaskRay :
Sorry for late reply, I was stuck at releasing stuffs.
Thanks you bring RELR and RELLEB(CREL), it's my first time know that, I think both are useful for code size reduction around the relocation info, I got few code size request around the kernel module (relocatable files), so I was considering to introducing mixing REL
and RELA
(*1
), but I realized RELLEB
(CREL
) could do better here, anyway I am happy to see this happen, let me know if I can help anything here.
*1
: AArch64 ABI allow mixing REL and RELA[1]
[1] https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#addends-and-pc-bias
Back to the the vendor relocation, I am inclined to using dummy symbol scheme at this moment, since I think new relocation format may still require more than one release cycle to implement that into all tools, however vendor instruction has merged into upstream recently, I don't want to block this behind that.
At the end, let me emphasize again, I am really looking forward to RELR and RELLEB(CREL) can improve the code size issue, I saw RELR already merged into glibc since 2.36, so I gonna to prepare some PR to mention that in the RISC-V psABI :)
Updated to use symbols instead of the addend as vendor ID.
No objection to vendor relocations taking one type code.
However, the inclusion of a relocation symbol for vendor identification raises some questions.
Why does a vendor relocation need to identify the vendor name? A vendor identifier table could introduce maintenance overhead, as vendors might register even without immediate need, especially considering the large number of RISC-V vendors.
We reserve one code so that vendor relocations will not conflict with standard relocations. Ensuring two vendors can exchange relocatable files might not be a goal of this specification. I am also concerned with toolchain implementation overhead (I especially care about LLVM binary utilities) to recognize these vendor names.
That said, vendor identification might be useful for other purposes and we probably need a per-file mechanism. With that, do we really need a per-relocation identification?
Ensuring two vendors can exchange relocatable files might not be a goal of this specification.
Unfortunately that has been expressed very explicitly as a goal, hence how we've ended up here.
We reserve one code so that vendor relocations will not conflict with standard relocations. Ensuring two vendors can exchange relocatable files might not be a goal of this specification. I am also concerned with toolchain implementation overhead (I especially care about LLVM binary utilities) to recognize these vendor names.
That said, vendor identification might be useful for other purposes and we probably need a per-file mechanism. With that, do we really need a per-relocation identification?
We may have multi-version function for different vendor, so that may contain relocation from more than one vendor in theory.
Can you share an example of how vendor relocations from various vendors are combined? Are we comfortable committing to long-term maintenance now, or should we gain more experience first?
What actions should upstream linkers take? I think it's crucial that they avoid recognizing vendor-specific semantics.
(A dummy symbol, if ever adopted, can be implemented with STB_WEAK STV_HIDDEN so that after linking there is only one copy.)
@MaskRay
Are we comfortable committing to long-term maintenance now, or should we gain more experience first?
What actions should upstream linkers take? I think it's crucial that they avoid recognizing vendor-specific semantics.
RISC-V community agree taking vendor-specific extensions, so that may introduce new relocation you could imagine :P, but require vendor to maintain that, one agreement here is that require an open spec on toolchian-convention, give some piratical example is we have vendor extension from T-head, Core-V (OpenHW), Ventana and SiFive on upstream now.
Can you share an example of how vendor relocations from various vendors are combined?
So back to this question, because we allow vendor upstream their extension, so yes they may contain within single file.
@MaskRay any further concern?
This expands the current support for custom relocations to have a vendor identifier which would allow to avoid conflicts between vendor relocations in tools. This would also allow a greater number of relocations, such that these could eventually be supported in upstream tools.
A proof of concept implementation can be found in https://github.com/embecosm/corev-binutils-gdb-vendor-relocs.