Closed ProxyPlayerHD closed 9 months ago
The compiler passes only one parameter in a register. XC can be used, and it is useful to have at least one register free (Y in this case) as a scratch register when setting up the stack frame.
During early development I allowed two register parameters in X and D, but I ran into some nasty corner cases and ended up going for a single parameter.
In the example you show, X is not a parameter; it is used as a place to load the constant to put into _Dp, which is the parameter.
The third parameter is passed on the stack, and it can be argued that this is a bug; logically it should probably have used _Dp+4 for it. On the other hand, I think it is intentional. The stack is decent on the 65816, and a 16-bit value in most data models is an integer. I prefer to get pointers into the _Dp registers, and the compiler will always try to put two pointer parameters there, but it will leave the scratch location _Dp+4 free for the called function (it does not put an integer value there).
If you wonder why it does not use _Dp+2 and _Dp+6, that is intentional. The register model I use regards _Dp+0 to _Dp+3 as a 32-bit register; it will always put a 16-bit value in the lower part, and the same goes for a 24-bit value. I want to ensure that conversions (casts) between differently sized _Dp registers do not need to move the value around (which would require a CPU register). It makes the register model simpler, and I prefer it that way.
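As a rough illustration of why keeping every value in the low part of a 32-bit slot makes casts free, here is a small C model. It is only a sketch: the `dp` array and the `dp_store`/`dp_load` helpers are hypothetical names invented for this example and are not part of the compiler or its runtime. It assumes the little-endian byte order of the 65816.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the eight pseudo-register bytes _Dp+0 .. _Dp+7. */
uint8_t dp[8];

/* Store the low width_bytes bytes of value at the start of a slot
   (offset 0 or 4), least significant byte first as on the 65816. */
void dp_store(unsigned slot_offset, uint32_t value, unsigned width_bytes) {
    for (unsigned i = 0; i < width_bytes; i++)
        dp[slot_offset + i] = (uint8_t)(value >> (8 * i));
}

/* Read width_bytes bytes from the start of a slot. */
uint32_t dp_load(unsigned slot_offset, unsigned width_bytes) {
    uint32_t value = 0;
    for (unsigned i = 0; i < width_bytes; i++)
        value |= (uint32_t)dp[slot_offset + i] << (8 * i);
    return value;
}

/* A 32-bit value narrowed to 24 or 16 bits is read from the same
   address; nothing has to be copied to another location. */
void narrowing_example(void) {
    dp_store(0, 0x00123456u, 4);
    assert(dp_load(0, 3) == 0x123456u);
    assert(dp_load(0, 2) == 0x3456u);
}
```

If a 16-bit value could instead live at _Dp+2, narrowing the 32-bit value at _Dp+0 into it would mean copying two bytes, which on the 65816 goes through a CPU register.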
I plan to introduce an alternative, simpler register model which will probably keep one CPU register and push the rest on the stack, but I have not gotten around to it yet.
I hope this clarifies some of the confusion you got from it, otherwise ask more.
The LDA (1,S),Y should work. Can you share a snippet that uses it, so I can see exactly how you do it? Feel free to open another ticket for it. I use that addressing mode in the compiler, and the assembler has a test example with it.
alright, so: first parameter goes into A (or A+X in case it's 32-bit) if it fits, second into _Dp, and everything beyond that onto the Stack. did i get that right?
I plan to introduce an alternative, simpler register model which will probably keep one CPU register and push the rest on the stack, but I have not gotten around to it yet.
that would be nice actually, since then you'd avoid the constant loading from deeper in the stack to put on the _Dp before a function call. plus you'd probably be able to make good use of the PEI instruction to directly put things onto the stack without using any registers.
The LDA (1,S),Y should work, can you share a snippet of using that, so I can see exactly how you do it?
it's written exactly like that in my code, i'll post the whole file so maybe you can try to recreate the issue.
the command i use is simple:
AS65816 --code-model large --data-model large --list-file Output\setPixel.lst -o Temp\setPixel.o setPixel.s
; ---------------------------------------------------------------------------
; setPixel.s
; ---------------------------------------------------------------------------
;
; Draws a Pixel at the specified coordinates with "color"
;
.rtmodel version,"1"
.rtmodel codeModel,"large"
.rtmodel dataModel,"large"
.rtmodel core,"65816"
.rtmodel huge,"0"
.extern _Dp
; ---------------------------------------------------------------------------
; void setPixel(uint16_t PX, uint16_t PY, uint8_t color);
; PX Coordinate: A
; PY Coordinate: _Dp[0-1]
; Color: Stack+4
; _Dp+0 - PY (low)
; _Dp+1 - PY (high)
; _Dp+2 - temp byte
PY .equ _Dp
tmp .equ _Dp+2
.section farcode, text
.public setPixel
setPixel:
TAY ; Save PX
AND ##0x0007
TAX ; Save the bottom 3 bits of PX
TYA ; Get the original PX back
LSR A
LSR A
LSR A
PHA ; Put upper 13 bits of PX onto the Stack
LDA dp:.tiny(PY)
ASL A
ASL A
ASL A
ASL A
CLC
ADC 1,S ; (PX >> 3) + (PY * 16)
STA 1,S
LDA dp:.tiny(PY)
ASL A
ASL A
ASL A
ASL A
ASL A
ASL A
CLC
ADC 1,S ; (PX >> 3) + (PY * 16) + (PY * 64)
TAY ; Save the Calculated Address into Y
PLA ; Pull the Incomplete Address from the Stack
PHB ; Save the Data Bank
PEA #0x8000 ; Push the bottom 16-bits of the VRAM Base Address
PEA #0xFFFF ; Push the upper 16-bits of the VRAM Base Address
PLB
PLB ; Pull Data Bank twice
LDA ##0x0000 ; Clear A
SEP #0b00100000 ; 8-bit Accumulator
LDA (1,S),Y ; Get a Byte from the calculated Address
STA dp:.tiny(tmp) ; And store it for now
LDA .word0 _bitmask,X ; Get a bitmask from the Table corresponding to the selected bit
BEQ 1$ ; If Color != 0
ORA dp:.tiny(tmp) ; Apply the Mask to the read Byte
STA (1,S),Y ; And put it back into VRAM
BRA 2$
1$: ; If Color == 0
EOR #0xFF ; Invert the Mask
AND dp:.tiny(tmp) ; Apply the Inverted Mask to the Byte
STA (1,S),Y ; And put it back into VRAM
2$: REP #0b00100000 ; 16-bit Accumulator
PLA ; Remove the Base Address from the Stack
PLB ; Restore the Data Bank
RTL ; And Return
.section cfar, rodata
.pubweak _bitmask
_bitmask:
.byte 0b10000000, 0b01000000, 0b00100000, 0b00010000, 0b00001000, 0b00000100, 0b00000010, 0b00000001
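For readers following the arithmetic, the address and mask calculation in the listing above can be modelled in C. This is a sketch of what the comments in the listing describe (80 bytes per row, most significant bit first, a non-zero color sets the bit), not a verified translation of the routine. The names `pixel_offset`, `pixel_mask` and `set_pixel_model` are made up for this example, and `vram` stands in for the memory at the VRAM base address.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per row: the listing adds PY * 16 and PY * 64, i.e. PY * 80. */
enum { ROW_BYTES = 80 };

/* Byte offset of the pixel from the VRAM base. The cast keeps the
   16-bit wraparound of the 65816 additions. */
uint16_t pixel_offset(uint16_t px, uint16_t py) {
    return (uint16_t)((px >> 3) + (uint32_t)py * ROW_BYTES);
}

/* Same values as the _bitmask table: bit 7 is the leftmost pixel. */
uint8_t pixel_mask(uint16_t px) {
    return (uint8_t)(0x80u >> (px & 7u));
}

/* Set the pixel when color is non-zero, clear it otherwise. */
void set_pixel_model(uint8_t *vram, uint16_t px, uint16_t py, uint8_t color) {
    uint16_t offset = pixel_offset(px, py);
    uint8_t mask = pixel_mask(px);
    if (color != 0)
        vram[offset] = (uint8_t)(vram[offset] | mask);
    else
        vram[offset] = (uint8_t)(vram[offset] & (uint8_t)~mask);
}
```

A model like this can be handy for checking the assembly version against known inputs on a host machine before running it on the hardware.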
alright, so: first parameter goes into A (or A+X in case it's 32-bit) if it fits, second into _Dp, and everything beyond that onto the Stack. did i get that right?
That is correct for 16-bit integer types, in all data models except the small one. Pointer types and 32-bit types will make use of _Dp+4 to _Dp+7.
I have put the simplified calling convention up for the next major release, though I have to see whether I can get it in there. The current calling convention is too complicated on the 65816 for assembly use.
it's written exactly like that in my code, i'll post the whole file so maybe you can try to recreate the issue. the command i use is simple:
AS65816 --code-model large --data-model large --list-file Output\setPixel.lst -o Temp\setPixel.o setPixel.s
This is a bug. The s is actually not parsed as a register but as a literal internally, and it is incorrectly not treated as case insensitive. Make the s lower case as a workaround for now.
Version 5.1 provides a simplified calling convention which can be applied to functions. Use the __simple_call keyword or the __attribute__((simple_call)) attribute.
The bug with the S register is also fixed in 5.1.
I will close this as fixed.
i'm planning on making a graphics library since i recently made a VGA Card for my 65816 SBC. most functions will be in assembly for increased speed and reduced code size. so i'm trying to better understand the calling convention to know how to write the functions
the User Guide says:
which is pretty clear so far, but also raises the question why isn't the Y Index Register used for parameter passing as well? i've noticed that the compiler rarely seems to use the Y Register in general.
anyways, so for my setPixel function, which takes 3 parameters. 16-bit "X" and "Y" Coordinates, and an 8-bit "Color" in that order. so i assumed the parameters would be assigned like this:
X Coordinate -> Accumulator
Y Coordinate -> X Index Register
Color -> _Dp[0-1]
but when making a dummy function in C, calling it, and then looking at the generated .lst file, it's slightly different.
X Coordinate -> Accumulator
Y Coordinate -> X Index Register (and _Dp[0-1] for some reason)
Color -> Stack (as a 16-bit word)
the exact code it generated is here:
this is confusing because the Y Coordinate is stored in both X and _Dp, and why would the 3rd parameter be stored to the stack when _Dp[2-3], _Dp[4-5] are all still available. (and i assume also _Dp[6-7] even though the user guide doesn't list that pseudo register, but higher up it says "_Dp[0-7]" are used as pseudo register)
overall i'd just like some clarification for how the parameter passing works exactly, cause this is just breaking my brain
on a side note, why is LDA (1,S),Y causing an "invalid operand field" error? it's a valid addressing mode for the 65816 that's formatted exactly like that.