Open hikari-no-yume opened 1 year ago
A little update to this is that we moved the frame pointer from the zero page to the return stack. It sits on top of the return address.
| | Old | New |
|---|---|---|
| Initialization | `\|0000 @rbp $2` | `LIT2r 0000` |
| Fetch | `.rbp LDZ2` | `STH2rk` |
| Start of function | `.rbp LDZ2 #0004 SUB2 .rbp STZ2` | `OVR2r LIT2r 0004 SUB2r` |
| End of function | `.rbp LDZ2 #0004 ADD2 .rbp STZ2 JMP2r` | `POP2r JMP2r` |
So this conflicts a little with our other plan of using the top of the return stack as a register and `STH2rk` to fetch it. But still, having two "register" shorts on top of the return stack, one of which is the frame pointer, is reasonable.
A little cost comparison of operations (if we optimize well)…
| | Frame pointer | Register | Local |
|---|---|---|---|
| Get | `STH2rk` | `OVR2r STH2r` | `STH2rk #offset ADD2 LDA2` |
| Set | `POP2r STH2` | `NIP2r STH2 SWP2r` | `STH2rk #offset ADD2 STA2` |
| Increment | `INC2r` | `SWP2r INC2r SWP2r` | `STH2rk #offset ADD2 LDA2k INC2 SWP2 STA2` |
I guess a second short on the return stack would only make sense for loop variables. If we can optimize really well, maybe it also works for other things where all the arithmetic can be done directly on the return stack.
| | Register | Register 2 | Local |
|---|---|---|---|
| Get | `OVR2r STH2r` | `ROT2kr NIP2r NIP2r STH2r` | `STH2rk #offset ADD2 LDA2` |
| Set | `NIP2r STH2 SWP2r` | `ROT2r POP2r STH2 ROT2r ROT2r` | `STH2rk #offset ADD2 STA2` |
| Increment | `SWP2r INC2r SWP2r` | `ROT2r INC2r ROT2r ROT2r` | `STH2rk #offset ADD2 LDA2k INC2 SWP2 STA2` |
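
To sanity-check the register sequences, here is a rough C model of the two stacks. It is only a sketch under some assumptions: every name in it is made up for illustration, each stack slot holds one 16-bit short (real uxn stacks are byte-based), and the return stack is assumed to hold, bottom to top, register 2, register 1, and the frame pointer. The "Set" sequence for register 2 is written with `STH2` (working stack to return stack), since that is the direction a store needs.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical, not real uxn): a stack of shorts, top at s[n-1]. */
typedef struct { uint16_t s[16]; int n; } Stack;

void push(Stack *k, uint16_t v) { assert(k->n < 16); k->s[k->n++] = v; }
uint16_t pop(Stack *k) { assert(k->n > 0); return k->s[--k->n]; }

/* Short-mode return-stack operations used in the tables. */
void pop2r(Stack *r) { pop(r); }                                  /* a -- */
void inc2r(Stack *r) { push(r, (uint16_t)(pop(r) + 1)); }         /* a -- a+1 */
void nip2r(Stack *r) { uint16_t b = pop(r); pop(r); push(r, b); } /* a b -- b */
void swp2r(Stack *r) {                                            /* a b -- b a */
    uint16_t b = pop(r);
    uint16_t a = pop(r);
    push(r, b); push(r, a);
}
void ovr2r(Stack *r) {                                            /* a b -- a b a */
    uint16_t b = pop(r);
    uint16_t a = pop(r);
    push(r, a); push(r, b); push(r, a);
}
void rot2r(Stack *r) {                                            /* a b c -- b c a */
    uint16_t c = pop(r);
    uint16_t b = pop(r);
    uint16_t a = pop(r);
    push(r, b); push(r, c); push(r, a);
}
void rot2kr(Stack *r) {                                           /* a b c -- a b c b c a */
    assert(r->n >= 3);
    uint16_t a = r->s[r->n - 3];
    uint16_t b = r->s[r->n - 2];
    uint16_t c = r->s[r->n - 1];
    push(r, b); push(r, c); push(r, a);
}
void sth2(Stack *w, Stack *r)  { push(r, pop(w)); }  /* working -> return */
void sth2r(Stack *w, Stack *r) { push(w, pop(r)); }  /* return -> working */

/* The sequences from the table; return stack is: reg2, reg1, fp (top). */
void get_reg1(Stack *w, Stack *r) { ovr2r(r); sth2r(w, r); }
void set_reg1(Stack *w, Stack *r) { nip2r(r); sth2(w, r); swp2r(r); }
void inc_reg1(Stack *r)           { swp2r(r); inc2r(r); swp2r(r); }
void get_reg2(Stack *w, Stack *r) { rot2kr(r); nip2r(r); nip2r(r); sth2r(w, r); }
void set_reg2(Stack *w, Stack *r) { rot2r(r); pop2r(r); sth2(w, r); rot2r(r); rot2r(r); }
void inc_reg2(Stack *r)           { rot2r(r); inc2r(r); rot2r(r); rot2r(r); }
```

Each sequence should leave the frame pointer back on top with the other register untouched, which is easy to check by pushing three known shorts onto the model's return stack and running a get, set, and increment.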
Oh, if you have one `register` variable then another one of your non-`register` variables becomes cheaper: instead of `STH2rk #0000 ADD2 LDA2`, it's just `STH2rk LDA2` for that local. So a function like `int foo(register int a, int b)` could be quite efficient.
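
For illustration, such a function might look like the sketch below. The body and the `bump` helper are invented for the example; only the signature comes from the comment above. `a` never has its address taken, so it is a candidate for living on the return stack, while `b` is passed by pointer and therefore has to stay in memory.

```c
/* Hypothetical helper that takes a pointer, forcing its argument into memory. */
void bump(int *p) { *p += 1; }

int foo(register int a, int b) {
    bump(&b);        /* fine: b is an ordinary local with an address */
    /* bump(&a); */  /* not allowed in C: cannot take the address of a register variable */
    while (b > 0) {
        a += b;      /* all accesses to a are plain reads and writes */
        b--;
    }
    return a;
}
```

Because C forbids `&a` on a `register` variable, the compiler can decide where `a` lives without any alias analysis.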
Currently all local variables and function arguments get written to and read from an in-memory stack (`@rbp`). This makes for pretty ugly and inefficient assembly. In many cases such variables never have their address taken, so it would be possible to skip this. Any modern compiler would optimise this away with a series of passes that do alias analysis etc, but we want to keep this compiler simple and single-pass-ish, so let's do things the old-fashioned way: implement the `register` keyword!
I was thinking of putting `register` variables on the uxn working stack, but after hearing my concerns about how many operations might be needed to access something buried deep in the stack by temporaries, lynn pointed out you can also use the uxn return stack to store data, so that might be worth trying instead. :3
For function arguments annotated with `register`, it would be nice to have this work together with https://github.com/lynn/chibicc/issues/12 so they don't need copying within the callee at all.