ARM-software / abi-aa

Application Binary Interface for the Arm® Architecture

[AAELF64] Clarify how addends work in MOVZ, MOVK and ADRP. #271

Closed statham-arm closed 2 months ago

statham-arm commented 3 months ago

This brings AAELF64 into line with AAELF32, which already has a similar clarification for the MOVW+MOVT pair. For the instructions which shift their operand left (ADRP, and the shifted MOVZ and MOVK), if the relocation addend is taken from the input value of the immediate field, it is not treated as shifted.

The rationale is that this allows a sequence of related instructions to consistently compute the same value (symbol + small offset), and cooperate to load that value into the target register, one small chunk at a time. For example, this would load mySymbol + 0x123:

```
mov  x0, #0x123            ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol)
movk x0, #0x123, lsl #16   ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol)
movk x0, #0x123, lsl #32   ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol)
movk x0, #0x123, lsl #48   ; R_AARCH64_MOVW_UABS_G3(mySymbol)
```

The existing text made it unclear whether the addends were shifted or not. If they are interpreted as shifted, then nothing useful happens, because the first instruction would load the low 16 bits of mySymbol+0x123, and the second would load the next 16 bits of mySymbol+0x1230000, and so on. This doesn't reliably get you any useful offset from the symbol, because the relocations are processed independently, so that a carry out of the low 16 bits won't be taken into account in the next 16.

If you do need to compute a large offset from the symbol, you have no option but to use SHT_RELA and specify a full 64-bit addend: there's no way to represent that in an SHT_REL setup. But interpreting the SHT_REL addends in the way specified here, you can at least specify small addends successfully.
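The arithmetic above can be checked with a minimal Python sketch (not taken from the ABI text; the symbol address is made up). It models a static linker resolving the four MOVW_UABS group relocations independently: each relocation computes S + A and writes bits [16n+15 : 16n] of the result into instruction n's 16-bit immediate field.

```python
def resolve_group(symbol, addend, n):
    """Value a REL linker places in the immediate field of the G<n> instruction."""
    return ((symbol + addend) >> (16 * n)) & 0xFFFF

def register_value(immediates):
    """Value the mov/movk sequence leaves in the register at run time."""
    value = 0
    for n, imm in enumerate(immediates):
        value |= imm << (16 * n)
    return value

symbol = 0x0000_1234_5678_FFF0  # hypothetical symbol address
addend = 0x123                  # small offset stored in each immediate field

# Unscaled interpretation (this PR): every relocation sees the same addend,
# so every instruction extracts a chunk of the same 64-bit value.
unscaled = register_value(resolve_group(symbol, addend, n) for n in range(4))
assert unscaled == symbol + addend

# Scaled interpretation: instruction n's addend is read as 0x123 << 16n, so
# each relocation targets a *different* value, and a carry out of the low 16
# bits of symbol+0x123 never reaches the G1 immediate.
scaled = register_value(
    resolve_group(symbol, addend << (16 * n), n) for n in range(4)
)
assert scaled != symbol + addend
```

With the chosen symbol address the unscaled result is exactly `symbol + 0x123`, while the scaled result differs in every 16-bit chunk above the bottom one.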

statham-arm commented 3 months ago

Nice thought! I think you're absolutely right: you can make ADRP+ADD consistently generate up to a 19-bit signed offset even though the ADD immediate can only hold 12 bits of it, by just setting the ADD immediate to the low 12 bits. The error in computing the full value during processing of the ADD relocation doesn't matter, because it's only in bits 12 and above, which don't affect the output.

statham-arm commented 3 months ago

Oops! The ADRP immediate is 21 bits, not 19. Almost didn't notice that the encoding has two more immediate bits near the top of the word.
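The ADRP+ADD case can be sketched the same way (hypothetical addresses, not ABI pseudocode): the ADRP relocation computes a page delta from the full offset, the ADD relocation computes S + A from only the low 12 bits of the offset, and the error in bits 12 and above of the ADD computation is discarded when the result is truncated to the 12-bit immediate.

```python
PAGE = 0x1000  # ADRP works in 4 KiB pages

def page(addr):
    return addr & ~(PAGE - 1)

def adrp_add(symbol, offset, pc):
    # R_AARCH64_ADR_PREL_PG_HI21: addend = full offset (fits in the
    # 21-bit signed ADRP immediate); the instruction adds a page delta to PC.
    x0 = page(pc) + (page(symbol + offset) - page(pc))  # after ADRP

    # R_AARCH64_ADD_ABS_LO12_NC: addend = low 12 bits of the offset.
    # (symbol + (offset & 0xFFF)) and (symbol + offset) agree mod 0x1000,
    # so the 12-bit immediate comes out right despite the truncated addend.
    imm12 = (symbol + (offset & 0xFFF)) & 0xFFF
    return x0 + imm12                                   # after ADD

symbol = 0x41_2345
pc = 0x40_0010
for offset in (0, 0x123, 0xFFF, 0x1234, 0xF_FFFF):  # positive offsets only
    assert adrp_add(symbol, offset, pc) == symbol + offset
```

The sketch only exercises positive offsets; the signed-offset case works the same way modulo 2^12.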

rearnsha commented 3 months ago

Is it really safe to make this change? The current text says:

5.7.2 Addends and PC-bias: https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#572addends-and-pc-bias

A binary file may use REL or RELA relocations or a mixture of the two (but multiple relocations of the same place must use only one type).

The initial addend for a REL-type relocation is formed according to the following rules.

A RELA format relocation must be used if the initial addend cannot be encoded in the place.

So the question is, could existing code be using these fields in a different way (interpreting the bits as scaled by the field encoding information)? If they could, then this change would break existing usage and thus cannot be safely made, even if it would have been more useful in retrospect.


Wilco1 commented 3 months ago

So the question is, could existing code be using these fields in a different way (interpreting the bits as scaled by the field encoding information)? If they could, then this change would break existing usage and thus cannot be safely made, even if it would have been more useful in retrospect.

As explained, it is not possible to use a non-zero scaled offset with REL MOVZ/MOVK/ADRP/ADD/LDR relocations, since the individual relocations wouldn't all compute the same value.

smithp35 commented 3 months ago

So the question is, could existing code be using these fields in a different way (interpreting the bits as scaled by the field encoding information)? If they could, then this change would break existing usage and thus cannot be safely made, even if it would have been more useful in retrospect.

While we can never be 100% certain, we've only got one known user of SHT_REL for AArch64 which is the Arm legacy assembler armasm, and that switches to RELA for the instructions with scaled addends like ADRP, MOVT/MOVW. Part of the motivation for this change is to correct an incorrect implementation in LLD.

On the use-case side, we can't be 100% certain, but I certainly can't think of a way a compiler could make use of the scaled immediates as relocation addends. I remember from the 32-bit case that we first implemented MOVT/MOVW in SHT_REL using scaled addends and soon ran into runtime failures, which provoked the redefinition to not scale the addends. It may be possible to come up with something in assembly that isn't contrived.

I think this is an area where I'd be comfortable taking the risk.

rearnsha commented 3 months ago

I'm sure there are use cases where the bottom 16 bits might be known to be zero; in that case you're more likely to want an offset that's a multiple of 2^16.


smithp35 commented 3 months ago

It would have to be an expression of the form `symbol + <immediate> * 2^16`, where the symbol is aligned on a 2^16 boundary. The only thing I can think of in that case is where the symbol is the section symbol of a section aligned to at least 2^16, and the section is larger than 2^16 bytes.

Given how rare that use case is compared to sections that are 4- or 8-byte aligned, I doubt that a potential SHT_REL-only toolchain would implement section symbol + offset like that. I'd expect it to use local anonymous symbols as anchors.

smithp35 commented 2 months ago

Will be merging this PR by end of UK working day 08/07/2024

smithp35 commented 2 months ago

End of UK working day. Merging this PR.