Closed by xen0n 1 year ago
The only other information at the original site is:
My reply (at 2023-01-22 10:57, UTC+8):
For the record: this is causing significant pain for the LLD port, where relocation semantics (the RelExpr enum) have to be determined early and ideally without depending on other relocs/input content. It is somewhat easy to treat R_LARCH_PCALA_LO12 on jirl differently, but it's much more difficult to differentiate between R_LARCH_PCALA_HI20's that produce an intermediate result for a jirl and those that do not, because we're unlike RISC-V, where the R_RISCV_PCREL_LO12's actually point to the corresponding HI20 reloc so the correspondence is preserved.
For medium code model function calls, we plan to change to pcaddu18i + jirl, with two new relocations: R_LARCH_CALL_HI20 and R_LARCH_CALL_LO16.
+1 on pcaddu18i + jirl (apparently pcaddu18i was created just for this purpose).
However I do recommend making the reloc names clearer with respect to the PC-relative semantics: R_LARCH_CALL_PCREL_HI20 and R_LARCH_CALL_PCREL_LO16 could be better, as the existing psABI v2 relocations are all following PC-aligned semantics (which means the calculation is entirely different).
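To make the "entirely different calculation" concrete, here is a minimal sketch. It assumes that "PC-aligned" refers to pcalau12i-style arithmetic, where the PC is aligned down to a 4 KiB page before the high part is added, and it derives the immediates from the instruction semantics rather than quoting the psABI text. The function names are hypothetical, addends are ignored, and a sign-extending consumer such as addi.d is assumed for the low part.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def page_aligned_hi_lo(target: int, pc: int) -> tuple[int, int]:
    """Immediates for a pcalau12i + addi.d style pair (page-aligned)."""
    lo12 = target & 0xFFF  # depends on the target only, not on the PC
    # The consumer sign-extends lo12, so round the target page up when bit 11 is set.
    page_delta = ((target + 0x800) & ~0xFFF) - (pc & ~0xFFF)
    return (page_delta >> 12) & 0xFFFFF, lo12


def page_aligned_result(hi20: int, lo12: int, pc: int) -> int:
    """Address the instruction pair computes from the two immediates."""
    return (pc & ~0xFFF) + (sign_extend(hi20, 20) << 12) + sign_extend(lo12, 12)


def pc_relative_lo12(target: int, pc: int) -> int:
    """Low part under plain PC-relative semantics: it changes with the PC."""
    return (target - pc) & 0xFFF
```

With the same target, the page-aligned low part stays constant when the PC moves, while the PC-relative low part does not. That difference is why a name hinting at PC-relative semantics helps readers pick the right formula.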
If we can ensure pcaddu18i is close to jirl, could we add only one reloc type, R_LARCH_B36 or something else, e.g. R_LARCH_CALL36 or R_LARCH_JUMP36, and have the linker apply the relocation to the 2 adjacent insns at once?
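As a rough illustration of what a single relocation covering both instructions could involve, the sketch below splits a PC-relative offset into the 20-bit field of pcaddu18i and the 16-bit field of jirl and patches two adjacent instruction words. The helper names are hypothetical. The field positions (bits 24..5 for pcaddu18i, bits 25..10 for jirl) and the rounding step follow from the ISA encodings as I understand them, so treat this as a sketch under those assumptions, not as the psABI definition of any reloc.

```python
def split_call_offset(offset: int) -> tuple[int, int]:
    """Split a PC-relative byte offset into (hi20, lo16) instruction fields."""
    if offset % 4 != 0:
        raise ValueError("call target must be 4-byte aligned relative to PC")
    # jirl sign-extends (lo16 << 2), so add half of its range before
    # taking the high part to compensate for a negative low part.
    hi20 = (offset + (1 << 17)) >> 18
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("offset out of range for pcaddu18i + jirl")
    lo16 = (offset >> 2) & 0xFFFF
    return hi20 & 0xFFFFF, lo16


def patch_pair(pcaddu18i: int, jirl: int, offset: int) -> tuple[int, int]:
    """Patch two adjacent 32-bit instruction words in one step."""
    hi20, lo16 = split_call_offset(offset)
    pcaddu18i = (pcaddu18i & 0xFE00001F) | (hi20 << 5)  # si20 in bits 24..5
    jirl = (jirl & 0xFC0003FF) | (lo16 << 10)           # offs16 in bits 25..10
    return pcaddu18i, jirl


# pcaddu18i $ra, 0 ; jirl $ra, $ra, 0 with a target 0x20000 bytes ahead
first, second = patch_pair(0x1E000001, 0x4C000021, 0x20000)
```

Because one relocation writes both words, the linker never has to guess which high-part reloc belongs to which jirl; the cost is that the two instructions must stay adjacent.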
I think the two parts may be scheduled so that they're no longer adjacent. Unless we have some kind of "macro-op fusion" guarantees/recommendations put into the specs, in which case compilers could be willing to avoid scheduling these.
Alternatively, we could investigate the RISC-V approach of allowing one reloc to point to another reloc, which makes it possible to properly track associations and differentiate behavior based on the associated reloc. That way I imagine several relocs could be merged or "fixed" (see the LLD port code for some of the problems caused by sharing a reloc type but not having proper association) compared to what we have right now.
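The association idea can be sketched roughly as follows. In the RISC-V scheme the low-part reloc's symbol resolves to the address of the instruction carrying the high-part reloc, so a linker can look up the partner and reuse its target. The `Reloc` class and `resolve_lo12` function are hypothetical simplifications for illustration (addends and error reporting are left out) and do not mirror any real linker's data structures.

```python
from dataclasses import dataclass


@dataclass
class Reloc:
    kind: str          # "PCREL_HI20" or "PCREL_LO12" (simplified names)
    offset: int        # address of the instruction being relocated
    symbol_value: int  # address the reloc's symbol resolves to


def resolve_lo12(lo: Reloc, relocs: list[Reloc]) -> int:
    """Find the HI20 reloc that `lo` points at and return the low 12 bits."""
    hi_by_addr = {r.offset: r for r in relocs if r.kind == "PCREL_HI20"}
    # The LO12 symbol is the address of the paired high-part instruction.
    hi = hi_by_addr.get(lo.symbol_value)
    if hi is None:
        raise ValueError("PCREL_LO12 does not point at a PCREL_HI20 reloc")
    # The low part is taken from the HI20 target relative to the HI20 PC.
    return (hi.symbol_value - hi.offset) & 0xFFF


relocs = [
    Reloc("PCREL_HI20", offset=0x1000, symbol_value=0x3ABC),
    Reloc("PCREL_LO12", offset=0x1004, symbol_value=0x1000),
]
low = resolve_lo12(relocs[1], relocs)
```

With such a link in place, a linker could also inspect the partner reloc to decide how the high part should behave, which is the kind of differentiation that is hard to do today.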
I think this can be closed after #4 since we have R_LARCH_CALL36 now.
I have thought about this in the meantime, and it's probably fair to require the medium code model indirect call sequence to be adjacent insns, because then any micro-architecture optimization would become easier.
Given we cannot ditch the R_LARCH_PCALA_LO12 hack without breaking compatibility, the #4 addition of R_LARCH_CALL36 seems the next best / only possible thing to have. I agree this issue could be closed, thanks.
Unfortunately we may still want R_LARCH_JIRL_LO12 for extreme code model. See https://github.com/rui314/mold/issues/1131#issuecomment-1770558087.
Continuing the (unstarted) discussion at loongson/LoongArch-Documentation#69 where the repo itself is archived but ongoing discussion around it is still happening: e.g. at https://reviews.llvm.org/D138135 (have to justify the workarounds for R_LARCH_PCALA_LO12).