Open a74nh opened 2 weeks ago
.NET issue that describes SVE support: https://github.com/dotnet/runtime/issues/93095
I agree that the sentence can be clarified.
Assuming the confusion hasn't been resolved already, there's a couple of other parts that may help parsing the text: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#22terms-and-abbreviations
Routine, subroutine
A fragment of program to which control can be transferred that, on completing its task, returns control to its caller at an instruction following the call. Routine is used for clarity where there are nested calls: a routine is the caller and a subroutine is the callee.
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#3scope
Obligations on the called routine to preserve the program state of the caller across the call.
Combining this with the original text, the `it` is referring to the callee.
Thanks! That was the conclusion we came to after reading around the issue elsewhere. But it would be nice for it to be clearer.
https://github.com/ARM-software/abi-aa/pull/267 to update wording.
Just to be clear, here is my understanding. @rsandifo-arm @smithp35 - please correct if I missed anything.
A()
{
    prolog:
        save callee-save registers
    ...
    ...
    save caller-save registers
    B();
    restore caller-save registers
    ...
    ...
    epilog:
        restore callee-save registers
}
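Read "A to B" as: method A calls method B. The "prolog/epilog of A" column lists the callee-save registers of method A; the "before/after call to B" column lists the registers that are caller-save by method A.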
Scenario# | A to B | prolog/epilog of A | before/after call to B |
---|---|---|---|
1 | regular to regular | bottom 64-bits v8-v15 (1) | v0-v7, v16-v31, top 64-bits v8-v15 (1) |
2 | regular to sve | bottom 64-bits v8-v15 (1) | z0-z7, z24-z31 |
3 | sve to regular | z8-z23 | v0-v7, v16-v31, top 64-bits v8-v15 (1) |
4 | sve to sve | z8-z23 | z0-z7, z24-z31 |
(1) This is the same specification we have for NEON, and it is only applicable when the registers are in use or live.
Scenario# | A to B | prolog/epilog of A | before/after call to B |
---|---|---|---|
1 | regular to regular | NA | p0-p15 |
2 | regular to sve | NA | p0-p3 |
3 | sve to regular | p4-p15 | p0-p15 |
4 | sve to sve | p4-p15 | p0-p3 |
I'm going to use the official terminology of caller-save instead of callee_trash.
Just to be sure, and apologies if this was already clear: caller-save and callee-save are more like responsibilities to save than they are requirements to save. For example, a callee only needs to save a callee-save register if it uses the register. A caller only needs to save a caller-save register before a call if there is a live value in the register that the caller needs to access after the call.
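A minimal Python sketch of that distinction, assuming registers are modelled as plain strings held in sets. The helper names (`callee_prologue_saves`, `caller_spills`) and the example register choices are hypothetical; they only illustrate that a save is needed when a register is both in the relevant set and actually used or live.

```python
# Hypothetical model: registers are strings, register classes are sets.
SVE_CALLEE_SAVE_PRED = {f"p{n}" for n in range(4, 16)}  # p4-p15
SVE_CALLER_SAVE_PRED = {f"p{n}" for n in range(0, 4)}   # p0-p3


def callee_prologue_saves(callee_save, used_by_callee):
    """Registers the callee saves: callee-save AND actually used by it."""
    return set(callee_save) & set(used_by_callee)


def caller_spills(caller_save, live_across_call):
    """Registers the caller saves: caller-save AND live across the call."""
    return set(caller_save) & set(live_across_call)


# An sve function that only touches p0 and p4 saves just p4 in its prologue.
prologue = callee_prologue_saves(SVE_CALLEE_SAVE_PRED, {"p0", "p4"})

# A caller of an sve function with p1 and p5 live only needs to spill p1,
# because p5 is preserved by the sve callee.
spills = caller_spills(SVE_CALLER_SAVE_PRED, {"p1", "p5"})
```

With nothing live across the call, `caller_spills` returns an empty set, which matches the point that caller-save is a responsibility rather than an unconditional requirement.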
This is my reading of the document. I'm not a SVE expert like @rsandifo-arm so if I've got this wrong please go with his answer/correction rather than mine. I'm more of a linker than a compiler person.
I found it easier to describe when not considering the different call scenarios, as there is only a caller and a callee, and the responsibilities of the caller don't change if the callee is sve or regular.
Function type | callee-save | caller-save |
---|---|---|
regular | bottom 64-bits of v8-v15 | v0-v7, v16-v31, top 64-bits of v8-v15 |
sve | z8-z23 | v0-v7, v16-v31 (*) |
(*) z16-z23 are extensions of v16-v23 so these are both callee and caller saved.
Function type | callee-save | caller-save |
---|---|---|
regular | - | p0-p3 |
sve | p4-p15 | p0-p3 |
I do hope I've got this right, if I haven't and it isn't a silly mistake then we may need more clarifications.
> I found it easier to describe when not considering the different call scenarios
That's how I wanted it to be, but I wanted to be explicit about the situation. For example, in your table, for the "regular" function type, the way I interpret "caller-save" is: if a "regular" function is the caller, which registers does it need to save/restore across a function call? But that depends on what type of function it is calling. If it is calling a "regular" function, it needs to save/restore v0-v7, v16-v31, and the top 64-bits of v8-v15, but if it is calling an sve function, it needs to save/restore just z0-z7, z24-z31, because the sve function (which will be the callee in this case) will be responsible for preserving z8~z23. The same goes for the other combinations.
Also, for the "regular" function type: if it is calling a "regular" function, then it should save/restore the entire p0~p15, while if it is calling an "sve" function, it should preserve just p0~p3, because p4~p15 will be preserved by the "sve" function (which is the callee in this case).
Note: When I say caller should preserve across function call, I mean only the registers that are live across the call. So, in my table, out of the registers mentioned in "callee-trash" column, only the registers that are live across the call will be preserved by the caller.
> I do hope I've got this right
I feel the same :)
> then we may need more clarifications
Regardless of whether we get this right or not, I think the document needs a clear way of stating these requirements, something equivalent to how we present this information in the tables. A lot of time is being spent by multiple people trying to interpret a couple of lines of the document.
OK, I see where you are coming from. The safest assumption is that what is not callee-saved by the callee must be caller-saved. That would indeed imply that p0~p15 would need saving when calling a regular function.
I'll reopen this as I think more work is needed here.
Functions without SVE types in the signature don't have to save any SVE state. If they had to, then existing functions would not be legal anymore. The only things a function without SVE types in the signature must worry about are:
> the responsibilities of the caller don't change if the callee is sve or regular.
There is a lot of nuance here and it is easy for developers to miss considerations.
A callee `x` is responsible for saving (typically in the prologue) and restoring (typically in the epilogue) the callee-save set of its own calling convention `a`.
A caller `x` is also responsible for saving (typically before the call) and restoring (typically after the call) the caller-save set of the calling convention `b` for callee `y`.
Thus, if conventions `a` and `b` match (`sve x->sve y` -or- `regular x->regular y`), then this is relatively simple, as you only have to consider the context of the individual methods `x` and `y`, because the callee-save set for `a` is the inverse mask of the caller-save set for `a`.
However, if conventions `a` and `b` do not match (`sve x->regular y` -or- `regular x->sve y`), then the caller-save set becomes more interesting, as the callee-save set for `a` will typically not be the inverse of the caller-save set for `b`. Instead, the two sets will overlap on some registers. This means that the caller `x` must also consider any registers that are disjoint.
The simplest example of this is that for a regular call, none of P0-P15 are considered callee-save. Thus a regular method is free to trash any and all predicate registers without consideration. However, P4-P15 are considered callee-save for an sve call, and thus an sve method must save P4-P15 if they are used.
What this means is that for `regular x->regular y`, `x` is free to trash any predicate registers. If it has a predicate register that needs to remain "live" across the call to `y`, it must save/restore it.
For `sve x->sve y`, `x` is free to trash P0-P3, but must save and restore P4-P15 if they are used. It must only save P0-P3 across the call to `y` if they need to remain live.
However, for `regular x->sve y` the sets differ, and `x` now only has to save P0-P3 because `y` must be saving/restoring P4-P15.
It gets very interesting for `sve x->regular y`, however, because the regular call (`y`) is free to trash any of P0-P15. This means that not only does `x` need to save the normal set of P0-P3 if it's using them and needs them to remain live across the call, it must also assume that `y` will trash P4-P15 and is now responsible for saving them across the call boundary (because any prior sve caller could itself be using them and expected `x` to have saved them).
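The four predicate-register cases above can be checked mechanically. The following is a rough Python sketch, not part of the AAPCS64 text: the `CALLEE_SAVE` dictionary, the function names, and the assumption that the caller-save set is simply every predicate register the callee's convention does not preserve are all illustrative.

```python
# Hypothetical model of the predicate registers only.
ALL_PRED = frozenset(f"p{n}" for n in range(16))

# Callee-save predicate sets per calling convention, as described above.
CALLEE_SAVE = {
    "regular": frozenset(),                               # none of P0-P15
    "sve": frozenset(f"p{n}" for n in range(4, 16)),      # P4-P15
}


def caller_save(convention):
    """Assumption: whatever the callee does not preserve may be trashed."""
    return ALL_PRED - CALLEE_SAVE[convention]


def analyse_call(x_conv, y_conv):
    """Compare x's own callee-save set with the caller-save set for calling y."""
    own = CALLEE_SAVE[x_conv]          # x must preserve these for its caller
    clobbered = caller_save(y_conv)    # y is free to trash these
    return {
        "around_call": clobbered,                   # candidates to save if live
        "overlap": own & clobbered,                 # in both sets at once
        "is_inverse": own == ALL_PRED - clobbered,  # True when conventions match
    }


results = {
    (x, y): analyse_call(x, y)
    for x in ("regular", "sve")
    for y in ("regular", "sve")
}
```

In this model, `results[("sve", "regular")]["overlap"]` is P4-P15, which is the "very interesting" case: `x` has to treat those registers as callee-save for its own caller and as caller-save around the call to `y`.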
Thanks for the additional points. This has somewhat spiralled from the meaning of `it` :-) in a couple of sentences. I'll discuss with my colleagues to see if there is a better way of describing this.
I have updated https://github.com/ARM-software/abi-aa/issues/266#issuecomment-2177309054 to use the terminology of "caller-save" instead of "callee-trash".
Looking at the table that you have updated, I think it is best not to try and enumerate the callee-save registers and caller-save registers in the same table.
The callee-save registers are a requirement for a function to preserve the values of registers across the call, so that the values of these registers on entry to the function are the same as the values on return. This requirement is invariant of the caller, or whether there are any calls at all. This looks right in your table.
The set of caller-save registers is determined per call (a function could call both regular and sve functions). They are the registers that are not guaranteed to be preserved by the function being called (registers not in the callee-saves of the function being called).
Function Type | Callee-saves |
---|---|
regular | bottom 64-bits v8-v15 |
SVE | z8-z23, p4-p15 |
Called function type | Caller Save registers for call |
---|---|
regular | All registers not in {bottom 64-bits of v8-v15} * |
sve | All registers not in {z8-z23, p4-p15} |
I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.
Hope I haven't made any mistakes, I'm hoping that we can find the right wording to improve the AAPCS over the next few weeks.
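As a sketch of how the two tables relate, the caller-save set for a call can be derived as the complement of the called function's callee-saves. This Python model is an assumption made for illustration: splitting each vector register into three slices (`lo64`, `hi64`, `sve_ext`) is a hypothetical way to express "bottom 64-bits of v8-v15" as set elements, and none of the names come from the AAPCS64.

```python
# Hypothetical slices of one vector register:
#   lo64    = bits 0-63, hi64 = bits 64-127, sve_ext = bits 128 and up (SVE only)
SLICES = ("lo64", "hi64", "sve_ext")

ALL_VECTOR = {(n, s) for n in range(32) for s in SLICES}
ALL_PRED = {f"p{n}" for n in range(16)}

# Callee-saves per function type, taken from the first table above.
CALLEE_SAVES = {
    "regular": {
        "vector": {(n, "lo64") for n in range(8, 16)},            # bottom 64 bits of v8-v15
        "pred": set(),
    },
    "sve": {
        "vector": {(n, s) for n in range(8, 24) for s in SLICES},  # z8-z23
        "pred": {f"p{n}" for n in range(4, 16)},                   # p4-p15
    },
}


def caller_saves_for_call(called_type):
    """Everything not in the called function's callee-saves."""
    callee = CALLEE_SAVES[called_type]
    return {
        "vector": ALL_VECTOR - callee["vector"],
        "pred": ALL_PRED - callee["pred"],
    }


regular_call = caller_saves_for_call("regular")
sve_call = caller_saves_for_call("sve")
```

In this model a call to a regular function leaves every predicate register and every vector slice except the bottom 64 bits of v8-v15 in the caller-save set, while a call to an sve function leaves only z0-z7, z24-z31 and p0-p3.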
> All registers not in {bottom 64-bits of v8-v15} *
I assume that includes p0-p15 (might be better to clarify).
I've edited my * comment to "In practice this means all SVE state including predicate registers". Hopefully that should cover it.
> I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.
Yes, I realized it and have updated https://github.com/ARM-software/abi-aa/issues/266#issuecomment-2177309054 accordingly.
In aapcs64.rst
In both cases, in `it must ensure that`, it is not clear whether `it` refers to the caller or the callee.
E.g., if `it` is the callee then the wording should be `the subroutine must ensure that`.
This wording caused issues when designing SVE support for .NET.