loongson / la-abi-specs


Add R_LARCH_JIRL_LO12 #3

Closed xen0n closed 1 year ago

xen0n commented 1 year ago

Continuing the (unstarted) discussion at loongson/LoongArch-Documentation#69: that repo is archived, but discussion around the issue is still ongoing, e.g. at https://reviews.llvm.org/D138135 (where the workarounds for R_LARCH_PCALA_LO12 have to be justified).

xen0n commented 1 year ago

The only other information at the original site is:

My reply (at 2023-01-22 10:57, UTC+8):

For the record: this is causing significant pain for the LLD port, where relocation semantics (the RelExpr enum) have to be determined early, ideally without depending on other relocs or input content. It is fairly easy to treat R_LARCH_PCALA_LO12 on jirl differently, but much more difficult to differentiate between R_LARCH_PCALA_HI20 relocs that produce an intermediate result for a jirl and those that do not. Unlike RISC-V, where each R_RISCV_PCREL_LO12 actually points to its corresponding HI20 reloc, we do not preserve that correspondence.
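To illustrate why R_LARCH_PCALA_LO12 is awkward, here is a minimal sketch of a linker having to inspect the instruction word before it knows how to apply the reloc. The function names are hypothetical. The sketch assumes the jirl major opcode is 0b010011 in bits [31:26], that jirl holds a 16-bit immediate scaled by 4 in bits [25:10], and that the low 12 bits are sign-extended before scaling. The sign-extension detail is an assumption of this sketch, not something stated in the thread.

```python
JIRL_OPCODE = 0x4C000000  # assumed: bits [31:26] == 0b010011
OPCODE_MASK = 0xFC000000


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def is_jirl(insn):
    return (insn & OPCODE_MASK) == JIRL_OPCODE


def apply_pcala_lo12(insn, target):
    """Patch `insn` for R_LARCH_PCALA_LO12 against the absolute address `target`."""
    lo12 = target & 0xFFF
    if is_jirl(insn):
        # Same reloc type, different field: 16-bit immediate in bits [25:10],
        # holding a byte offset divided by 4.
        offset = sign_extend(lo12, 12)
        if offset & 3:
            raise ValueError("jirl target must be 4-byte aligned")
        imm16 = (offset >> 2) & 0xFFFF
        return (insn & ~(0xFFFF << 10)) | (imm16 << 10)
    # addi.d, ld.d and friends: 12-bit immediate in bits [21:10].
    return (insn & ~(0xFFF << 10)) | (lo12 << 10)
```

The branch on `is_jirl` is exactly the kind of content-dependent decision that a dedicated R_LARCH_JIRL_LO12 type would make unnecessary.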

cloudspurs commented 1 year ago

For medium code model function calls, we plan to switch to pcaddu18i + jirl, with two new relocations: R_LARCH_CALL_HI20 and R_LARCH_CALL_LO16.

xen0n commented 1 year ago

+1 on pcaddu18i + jirl (apparently pcaddu18i was created just for this purpose).

However I do recommend making the reloc names clearer with respect to the PC-relative semantics: R_LARCH_CALL_PCREL_HI20 and R_LARCH_CALL_PCREL_LO16 could be better, as the existing psABI v2 relocations are all following PC-aligned semantics (which means the calculation is entirely different).
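A small sketch of the difference between the two calculations may help here. It assumes pcalau12i-style semantics for the "PC-aligned" case (a page delta plus the absolute low 12 bits of the target) and pcaddu18i + jirl semantics for the "PC-relative" case (a plain `target - pc` offset split into an 18-bit-shifted high part and a 4-byte-scaled low part). All function names are illustrative.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def pc_aligned_parts(pc, target):
    """pcalau12i-style split: hi20 is a page delta, lo12 is absolute."""
    hi20 = ((target + 0x800) >> 12) - (pc >> 12)
    lo12 = target & 0xFFF
    return hi20, lo12


def pc_aligned_result(pc, hi20, lo12):
    """Address produced by pcalau12i followed by a sign-extended 12-bit add."""
    return (((pc >> 12) + hi20) << 12) + sign_extend(lo12, 12)


def pc_relative_parts(pc, target):
    """pcaddu18i + jirl style split of the plain offset target - pc."""
    delta = target - pc  # assumed to be a multiple of 4
    hi20 = (delta + (1 << 17)) >> 18
    lo16 = (delta >> 2) & 0xFFFF
    return hi20, lo16


def pc_relative_result(pc, hi20, lo16):
    """Address produced by pcaddu18i followed by jirl."""
    return pc + (hi20 << 18) + (sign_extend(lo16, 16) << 2)
```

Note that in the PC-aligned split the low part does not depend on `pc` at all, whereas in the PC-relative split it does, which is why naming the relocs after their PC-relative nature is informative.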

SixWeining commented 1 year ago

If we can ensure pcaddu18i is adjacent to jirl, could we add only one reloc type, R_LARCH_B36 or something else (e.g. R_LARCH_CALL36 or R_LARCH_JUMP36), and have the linker apply the relocation to the two adjacent insns at once?
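A rough sketch of what "one reloc, two adjacent insns" could look like on the linker side follows. It assumes little-endian instruction words, the 20-bit immediate of pcaddu18i in bits [24:5], and the 16-bit immediate of jirl in bits [25:10]. The helper names `apply_call36` and `decode_call36` are made up for illustration.

```python
import struct


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def apply_call36(code, offset, pc, target):
    """Patch the pcaddu18i at `offset` and the jirl at `offset + 4` in place."""
    delta = target - pc
    if delta & 3:
        raise ValueError("call target must be 4-byte aligned")
    hi20 = (delta + (1 << 17)) >> 18  # round so the low part can be signed
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("call target out of range")
    lo16 = (delta >> 2) & 0xFFFF
    pcaddu18i, jirl = struct.unpack_from("<II", code, offset)
    pcaddu18i = (pcaddu18i & ~(0xFFFFF << 5)) | ((hi20 & 0xFFFFF) << 5)
    jirl = (jirl & ~(0xFFFF << 10)) | (lo16 << 10)
    struct.pack_into("<II", code, offset, pcaddu18i, jirl)


def decode_call36(code, offset, pc):
    """Recompute the branch target from a patched pair (for checking only)."""
    pcaddu18i, jirl = struct.unpack_from("<II", code, offset)
    hi20 = sign_extend(pcaddu18i >> 5, 20)
    lo16 = sign_extend(jirl >> 10, 16)
    return pc + (hi20 << 18) + (lo16 << 2)
```

Because both words are rewritten from a single reloc record, this only works if the two instructions really are adjacent, which is the condition raised in the question.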

xen0n commented 1 year ago

> If we can ensure pcaddu18i is adjacent to jirl, could we add only one reloc type, R_LARCH_B36 or something else (e.g. R_LARCH_CALL36 or R_LARCH_JUMP36), and have the linker apply the relocation to the two adjacent insns at once?

I think the two parts may be scheduled so that they're no longer adjacent. Unless we have some kind of "macro-op fusion" guarantees/recommendations put into the specs, in which case compilers could be willing to avoid scheduling these.

Alternatively, we could investigate the RISC-V approach of allowing one reloc to point to another, which makes it possible to properly track associations and to differentiate behavior based on the associated reloc. That way I imagine several relocs could be merged or "fixed" compared to what we have right now (see the LLD port code for some of the problems caused by sharing a reloc type without having proper association).
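For readers unfamiliar with the RISC-V scheme, here is a simplified sketch of how the association works. In RISC-V, the symbol of an R_RISCV_PCREL_LO12_I reloc is a label placed at the auipc instruction, and the linker looks up the R_RISCV_PCREL_HI20 reloc at that address to find the real target. The `Reloc` class, the dictionary-based symbol table, and the function name are simplifications invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Reloc:
    offset: int   # address of the instruction being patched
    rtype: str    # relocation type name
    symbol: str   # symbol the relocation refers to


def resolve_pcrel_lo12(lo, relocs, symtab):
    """Return the signed 12-bit immediate for a LO12 reloc via its paired HI20."""
    hi_addr = symtab[lo.symbol]  # the label sits on the auipc
    for hi in relocs:
        if hi.offset == hi_addr and hi.rtype == "R_RISCV_PCREL_HI20":
            delta = symtab[hi.symbol] - hi_addr
            # Low 12 bits, sign-extended; the HI20 side rounds to compensate.
            return ((delta & 0xFFF) ^ 0x800) - 0x800
    raise LookupError("no HI20 reloc found at the address the LO12 points to")


symtab = {".Lpcrel_hi0": 0x1000, "foo": 0x3ABC}
relocs = [
    Reloc(0x1000, "R_RISCV_PCREL_HI20", "foo"),
    Reloc(0x1004, "R_RISCV_PCREL_LO12_I", ".Lpcrel_hi0"),
]
lo12 = resolve_pcrel_lo12(relocs[1], relocs, symtab)
```

Since the low-part reloc names its high-part partner explicitly, the linker never has to guess which pair belongs together or inspect the instruction bytes to decide.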

SixWeining commented 1 year ago

I think this can be closed after #4 since we have R_LARCH_CALL36 now.

xen0n commented 1 year ago

> I think this can be closed after #4 since we have R_LARCH_CALL36 now.

I have thought about this in the meantime, and it's probably fair to require the medium code model indirect call sequence to be adjacent insns, because then any micro-architecture optimization would become easier.

Given we cannot ditch the R_LARCH_PCALA_LO12 hack without breaking compatibility, the #4 addition of R_LARCH_CALL36 seems the next best / only possible thing to have. I agree this issue could be closed, thanks.

xry111 commented 11 months ago

Unfortunately, we may still want R_LARCH_JIRL_LO12 for the extreme code model. See https://github.com/rui314/mold/issues/1131#issuecomment-1770558087.