lynn / chibicc

A small C compiler… for uxn
MIT License

Implement the `register` keyword #13

Open hikari-no-yume opened 1 year ago

hikari-no-yume commented 1 year ago

Currently all local variables and function arguments get written to and read from an in-memory stack (`@rbp`). This makes for pretty ugly and inefficient assembly. In many cases such variables never have their address taken, so the round trip through memory could be skipped. Any modern compiler would optimise this away with a series of passes that do alias analysis etc., but we want to keep this compiler simple and single-pass-ish, so let's do things the old-fashioned way: implement the `register` keyword!
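For reference, a minimal sketch in standard C of the kind of code this would help. The function name `sum_to` is made up for illustration. Because both locals are declared `register`, their address can never be taken, so a compiler is free to keep them out of the in-memory stack.

```c
/* Hypothetical example: neither local ever has its address taken,
   so neither needs a slot in the in-memory stack frame. */
static int sum_to(int n)
{
    register int total = 0;
    register int i;

    for (i = 1; i <= n; i++) {
        total += i;
    }
    /* Writing `&total` or `&i` here would be rejected by a conforming C compiler. */
    return total;
}
```

Calling `sum_to(10)` adds up 1 through 10.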

I was thinking of putting `register` variables on the uxn working stack, but I was concerned about how many operations might be needed to reach something buried deep in the stack under temporaries. lynn pointed out that you can also use the uxn return stack to store data, so that might be worth trying instead. :3

For function arguments annotated with `register`, it would be nice to have this work together with https://github.com/lynn/chibicc/issues/12 so they don't need copying within the callee at all.

lynn commented 1 year ago

A little update: we moved the frame pointer from the zero page to the return stack. It now sits on top of the return address.

| | Old | New |
| --- | --- | --- |
| Initialization | `\|0000 @rbp $2` | `LIT2r 0000` |
| Fetch | `.rbp LDZ2` | `STH2rk` |
| Start of function | `.rbp LDZ2 #0004 SUB2 .rbp STZ2` | `OVR2r LIT2r 0004 SUB2r` |
| End of function | `.rbp LDZ2 #0004 ADD2 .rbp STZ2 JMP2r` | `POP2r JMP2r` |

So this conflicts a little with our other plan of using the top of the return stack as a register and `STH2rk` to fetch it. But still, having two "register" shorts on top of the return stack, one of which is the frame pointer, is reasonable.
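To make the new scheme concrete, here is a rough C model of the return stack during a call. It is only a sketch: the `Stack` type and the helpers `enter` and `leave` are invented for illustration and are not chibicc or uxn emulator code, and it assumes the caller's `JSR2` has pushed a 16-bit return address.

```c
#include <stdint.h>

/* Toy model of the uxn return stack, shorts only (illustrative, not real emulator code). */
typedef struct {
    uint16_t s[64];
    int n; /* number of shorts currently on the stack */
} Stack;

static void lit2r(Stack *r, uint16_t v) { r->s[r->n++] = v; }  /* LIT2r v: -- v */
static void pop2r(Stack *r) { r->n--; }                        /* POP2r:   a -- */

static void ovr2r(Stack *r)                                    /* OVR2r: a b -- a b a */
{
    r->s[r->n] = r->s[r->n - 2];
    r->n++;
}

static void sub2r(Stack *r)                                    /* SUB2r: a b -- a-b */
{
    r->s[r->n - 2] = (uint16_t)(r->s[r->n - 2] - r->s[r->n - 1]);
    r->n--;
}

/* STH2rk copies the top of the return stack to the working stack;
   here the copy is simply returned. */
static uint16_t sth2rk(const Stack *r) { return r->s[r->n - 1]; }

/* Hypothetical helper: the call plus "Start of function". */
static void enter(Stack *r, uint16_t ret_addr, uint16_t frame_size)
{
    lit2r(r, ret_addr);   /* what JSR2 leaves behind: fp ret */
    ovr2r(r);             /* fp ret fp */
    lit2r(r, frame_size); /* fp ret fp size */
    sub2r(r);             /* fp ret fp-size */
}

/* Hypothetical helper: "End of function". Returns the address JMP2r jumps to. */
static uint16_t leave(Stack *r)
{
    pop2r(r);                 /* drop the callee's frame pointer */
    return r->s[--r->n];      /* JMP2r consumes the return address */
}
```

Starting from `LIT2r 0000` and entering a function with a 4-byte frame, the new frame pointer wraps around to `0xfffc` and sits on top of the return address; after `leave`, the caller's frame pointer is back on top.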

A little cost comparison of operations (if we optimize well)…

| | Frame pointer | Register | Local |
| --- | --- | --- | --- |
| Get | `STH2rk` | `OVR2r STH2r` | `STH2rk #offset ADD2 LDA2` |
| Set | `POP2r STH2` | `NIP2r STH2 SWP2r` | `STH2rk #offset ADD2 STA2` |
| Increment | `INC2r` | `SWP2r INC2r SWP2r` | `STH2rk #offset ADD2 LDA2k INC2 SWP2 STA2` |

hikari-no-yume commented 1 year ago

I guess a second short on the return stack would only make sense for loop variables. If we can optimize really well, maybe it also works for other things where all the arithmetic can be done directly on the return stack.

| | Register | Register 2 | Local |
| --- | --- | --- | --- |
| Get | `OVR2r STH2r` | `ROT2kr NIP2r NIP2r STH2r` | `STH2rk #offset ADD2 LDA2` |
| Set | `NIP2r STH2 SWP2r` | `ROT2r POP2r STH2 ROT2r ROT2r` | `STH2rk #offset ADD2 STA2` |
| Increment | `SWP2r INC2r SWP2r` | `ROT2r INC2r ROT2r ROT2r` | `STH2rk #offset ADD2 LDA2k INC2 SWP2 STA2` |

hikari-no-yume commented 1 year ago

Oh, if you have one register variable then another one of your non-register variables becomes cheaper: instead of `STH2rk #0000 ADD2 LDA2`, it's just `STH2rk LDA2` for that local. So a function like `int foo(register int a, int b)` could be quite efficient.
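The return-stack sequences discussed in this thread can be sanity-checked with a small C model of the two stacks. This is a sketch under assumptions: the return stack layout is `reg2 reg1 fp` with the frame pointer on top, every helper name is made up here, and the second-register store uses `STH2` (working stack to return stack) to bring in the new value.

```c
#include <stdint.h>

/* Toy model of a uxn stack, shorts only (illustrative, not real emulator code). */
typedef struct { uint16_t s[64]; int n; } Stack;

static void push(Stack *k, uint16_t v) { k->s[k->n++] = v; }
static uint16_t pop(Stack *k) { return k->s[--k->n]; }

/* Return-stack opcodes, with their stack effects. */
static void pop2r(Stack *r) { pop(r); }                                  /* a -- */
static void inc2r(Stack *r) { push(r, (uint16_t)(pop(r) + 1)); }         /* a -- a+1 */
static void nip2r(Stack *r) { uint16_t b = pop(r); pop(r); push(r, b); } /* a b -- b */

static void swp2r(Stack *r)                                              /* a b -- b a */
{
    uint16_t b = pop(r), a = pop(r);
    push(r, b); push(r, a);
}

static void ovr2r(Stack *r)                                              /* a b -- a b a */
{
    uint16_t b = pop(r), a = pop(r);
    push(r, a); push(r, b); push(r, a);
}

static void rot2r(Stack *r)                                              /* a b c -- b c a */
{
    uint16_t c = pop(r), b = pop(r), a = pop(r);
    push(r, b); push(r, c); push(r, a);
}

static void rot2kr(Stack *r)                                             /* a b c -- a b c b c a */
{
    uint16_t c = r->s[r->n - 1], b = r->s[r->n - 2], a = r->s[r->n - 3];
    push(r, b); push(r, c); push(r, a);
}

static void sth2(Stack *w, Stack *r)  { push(r, pop(w)); } /* STH2:  working -> return */
static void sth2r(Stack *w, Stack *r) { push(w, pop(r)); } /* STH2r: return -> working */

/* First register: sits directly under the frame pointer. */
static void reg1_get(Stack *w, Stack *r) { ovr2r(r); sth2r(w, r); }
static void reg1_set(Stack *w, Stack *r) { nip2r(r); sth2(w, r); swp2r(r); }
static void reg1_inc(Stack *r)           { swp2r(r); inc2r(r); swp2r(r); }

/* Second register: sits under the first one. */
static void reg2_get(Stack *w, Stack *r) { rot2kr(r); nip2r(r); nip2r(r); sth2r(w, r); }
static void reg2_set(Stack *w, Stack *r) { rot2r(r); pop2r(r); sth2(w, r); rot2r(r); rot2r(r); }
static void reg2_inc(Stack *r)           { rot2r(r); inc2r(r); rot2r(r); rot2r(r); }
```

Each `get` leaves the return stack unchanged and pushes the register's value onto the working stack `w`; each `set` consumes one short from `w`; each `inc` touches only the return stack `r`.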