Closed jnk0le closed 1 year ago
you can implement whatever you like in your microarchitecture, providing it is interrupt safe.
Checked Cortex-M0, and its `pop {pc}` doesn't modify the lr.
Reading the ARMv7-M doc:
Pop Multiple Registers loads a subset (or possibly all) of the general-purpose registers R0-R12 and the PC or the LR from the stack
If the PC is specified in the register list, the instruction causes a branch to the address (data) loaded into the PC
I have the impression that it performs the loads as explicitly stated in the instruction: load to LR OR to PC, not both (I guess that could be due to legacy baggage). Hence it's equivalent to: "content of ra prior to executing `popret[z]`".
Back to riscv:
you can implement whatever you like in your microarchitecture, providing it is interrupt safe.
The instruction itself states that ra is loaded: `cm.popretz {ra,s0-s11}, 96`
2.4.2 explicitly divides it into a load of registers and a `ret` instruction:
For POPRET once the stack pointer adjustment has been committed the ret must execute
Appears to software as: [...] lw ra, 12(sp) [...] ret
Destroy stack frame: load ra and 0 to 12 saved registers from the stack frame, deallocate the stack frame, (move zero into a0,) return to ra.
This instruction pops (loads) the registers in reg_list from stack memory, adjusts the stack pointer by stack_adj, moves zero into a0 and then returns to ra
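To make the two readings concrete, here is a minimal Python sketch of a toy RV32 register/stack model. It is not the spec's pseudocode: the function `popret` and the `bypass_ra` knob are hypothetical names, and the frame layout (highest-numbered register at the highest address) is assumed from the "appears to software as" sequence quoted above.

```python
# Toy model, not the spec's pseudocode. `bypass_ra` is a hypothetical knob for
# a core that forwards the stacked return address straight to pc.

def popret(regs, mem, saved, stack_adj, bypass_ra=False):
    """Model cm.popret on RV32 and return (next_pc, new_regs).

    regs:  dict of register name -> value (must contain "sp" and "ra")
    mem:   dict of byte address -> 32-bit word
    saved: register names in reg_list order, e.g. ["ra", "s0", "s1"]
    """
    regs = dict(regs)                      # do not mutate the caller's state
    addr = regs["sp"] + stack_adj - 4      # assumed: last register at the top
    stacked_ra = None
    for name in reversed(saved):
        value = mem[addr]
        if name == "ra":
            stacked_ra = value
            if not bypass_ra:
                regs["ra"] = value         # straightforward reading: lw ra, off(sp)
        else:
            regs[name] = value
        addr -= 4
    regs["sp"] += stack_adj                # addi sp, sp, stack_adj
    # Either way, control goes to the stacked return address.
    next_pc = stacked_ra if bypass_ra else regs["ra"]
    return next_pc, regs


regs = {"sp": 0x1000, "ra": 0xDEAD, "s0": 0, "s1": 0}
mem = {0x1004: 0x8000, 0x1008: 0x11, 0x100C: 0x22}   # ra, s0, s1; stack_adj = 16

pc_a, regs_a = popret(regs, mem, ["ra", "s0", "s1"], 16)
pc_b, regs_b = popret(regs, mem, ["ra", "s0", "s1"], 16, bypass_ra=True)
print(hex(pc_a), hex(regs_a["ra"]))
print(hex(pc_b), hex(regs_b["ra"]))
```

Both variants return to the same address; the only software-visible difference in this model is whether ra holds the stacked value or its old content afterwards.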
2.9 and 2.10 also map the `reg_list` to `xreg_list` (interpretable as "the actual registers"), which contains `x1` aka `ra`:
case 8: {reg_list="ra, s0-s3"; xreg_list="x1, x8-x9, x18-x19";}
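As a rough illustration of that mapping, the sketch below derives the x-register numbers from an rlist encoding. Only case 8 is quoted in this thread; the rest of the table (rlist 4 is `{ra}`, 15 covers `s0-s11`) is my reading of the spec, and the names `ABI_TO_X` and `xreg_list` are made up for this example.

```python
# ABI name -> x-register number for the registers Zcmp can save/restore.
ABI_TO_X = {"ra": 1, "s0": 8, "s1": 9}
ABI_TO_X.update({f"s{i}": 16 + i for i in range(2, 12)})  # s2..s11 -> x18..x27


def xreg_list(rlist):
    """Return the x-register numbers for a Zcmp rlist encoding (4..15)."""
    if not 4 <= rlist <= 15:
        raise ValueError("rlist must be in 4..15")
    # rlist 4 is {ra}; each step adds one s-register, except 15, which
    # covers s0-s11 (assumed: there is no encoding for {ra, s0-s10}).
    n_sregs = 12 if rlist == 15 else rlist - 4
    names = ["ra"] + [f"s{i}" for i in range(n_sregs)]
    return [ABI_TO_X[name] for name in names]


print(xreg_list(8))
```

For rlist 8 this yields x1, x8, x9, x18, x19, matching the quoted case; x1 is present for every valid encoding.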
The 2.9 and 2.10 pseudocode, similarly to the one in 2.4.2, also does an `lw` to `ra` and a separate `ret`, but in this case there is none of the "example" or "appears to be" kind of wording in those sections.
Well, I guess it actually could be interpreted so loosely:
ra
previously"
But will it still be considered Zcmp compliant? What about the verification stuff like golden models, compliance/regression tests etc.? (Those will probably implement a straightforward interpretation.) There is also a dangerous precedent in implementing things based on this kind of spec interpretation.
Also, the "abuse of observed behaviour" could happen in both scenarios:
- `popret` (assembly to assembly only)
- `lui`/`auipc` skip to do e.g. ARM-like pc-relative loads of constants using the ra after a function call
I think that explicitly stating whether a uarch is allowed to implement any of those approaches or "must load to `ra`" is a backwards-compatible change (wrt. hardware designs) and could easily be handled by the public review process.
`popret[z]` replaces always loads to ra `pop` circuit
But I can also imagine cores that might want to finish `popret[z]` a cycle earlier and e.g. somehow load the stacked ra directly to pc in parallel with the `addi`.
There is also the possibility of abuse of observed behaviour, so I think it needs to be explicitly stated what can happen to ra after `popret[z]`.
`ra` prior to executing `popret[z]`
(probably not)