Closed: suborb closed this issue 3 years ago
Re banking. It looks like sdcc supports #pragma bank NN, which sets the code segment to _CODE_NN and leaves the bss, data and rodata sections as usual.
Within z88dk we use BANK_NN as the section name for the SMS target and I've copied that over to the Gameboy target. These are targeted with the appropriate pragma to change the section names. This suggests that sccz80 should support #pragma bank as a shortcut for setting code and rodata sections.
I think placing all z88dk library code within the always paged in bank makes life easier so we'll only end up with user code being manually placed within banks.
Conventionally, GBDK used the __banked annotation to support banked calls; this resulted in the following code being emitted:
call banked_call
defw [function address]
defw [bank]
Where banked_call switches pages and offsets the stack; this stack offset can be handled by zsdcc and sccz80 with the annotation __z88dk_params_offset(X).
Function address can be populated using a regular z80asm patch expression. The bank could be populated by an appmake stage as follows:
1) Find banked_call in the .map file: XXYY
2) Search the binary for CDYYXX [AA BB] [00 00] where BBAA will be the address of a function.
3) Look up BBAA in the .map file, find the section name, parse and deduce the bank.
This does, however, feel fragile - I'd be worried about functions having clashing addresses in different banks (which will happen for functions at the start of a bank unless we fudge the first usable address within a bank).
Thus, I think we will need z80asm support for this - effectively a way of parsing the section name of a symbol and allowing that to be used within source code. Thoughts, @pauloscustodio?
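As a rough illustration of the appmake idea described above, here is a minimal Python sketch of the patch pass. The function name patch_banks, the in-memory image, and the address-to-bank dictionary (standing in for the parsed .map file) are all assumptions made for illustration; nothing here is existing appmake code. Keying the lookup on a bare 16-bit address is exactly where the clashing-address problem shows up.

```python
# Hypothetical sketch only: not appmake code. The symbols dict stands in
# for data parsed out of the .map file (function address -> bank number).
def patch_banks(image: bytearray, banked_call_addr: int, symbols: dict) -> int:
    """Fill in the bank word after each `call banked_call / defw addr / defw 0`.

    Returns the number of call sites patched.
    """
    # `call nn` encodes as CD, low byte, high byte
    pattern = bytes([0xCD, banked_call_addr & 0xFF, banked_call_addr >> 8])
    patched = 0
    pos = image.find(pattern)
    while pos != -1:
        if pos + 7 <= len(image):
            # defw [function address], little-endian
            addr = image[pos + 3] | (image[pos + 4] << 8)
            # defw [bank] is expected to still be the 00 00 placeholder
            bank_is_blank = image[pos + 5] == 0 and image[pos + 6] == 0
            # Two functions at the same address in different banks would
            # collide in this lookup: the fragility noted in the thread.
            if bank_is_blank and addr in symbols:
                bank = symbols[addr]
                image[pos + 5] = bank & 0xFF
                image[pos + 6] = bank >> 8
                patched += 1
        pos = image.find(pattern, pos + 1)
    return patched
```

A byte sequence that merely looks like a call to banked_call (for example inside data) would also be patched, which is a second reason to prefer assembler support.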
Re Banking.
This matches the way that it is done on YAZ180 with the __call_far function. Though I use call, defw, defb (signed relative), for the bank call.
Using a signed bank definition allows easy relative bank calls (i.e. call a bank above, or a bank below the current bank), and zero refers to the current bank. But this is not baked in concrete. Happy to align to whatever becomes the standard way to do this.
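To make the signed relative bank idea concrete, a small Python sketch follows. The function resolve_bank and the bank_count parameter are hypothetical names, and rejecting out-of-range targets (rather than wrapping) is an assumption; the thread does not say how YAZ180 handles that case.

```python
def resolve_bank(current_bank: int, rel: int, bank_count: int) -> int:
    """Resolve the target bank from the raw defb byte following the call.

    rel is an unsigned byte value (0..255) read from the code stream and
    interpreted as a signed two's complement offset; 0 means "this bank".
    """
    signed = rel - 256 if rel >= 128 else rel
    target = current_bank + signed
    # Assumption: an out-of-range bank is an error rather than wrapping.
    if not 0 <= target < bank_count:
        raise ValueError("relative bank offset leaves the bank range")
    return target
```

With this encoding, code that is relocated as a group to a different run of banks keeps working without repatching the bank bytes.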
Would it be sensible to make the addressing somewhat linear with a defq address definition, where the lower 16 bits are addressing and the upper 16 are either bank identifier or linear address space, depending on the platform implementation?
The patching mechanism for banks looks quite like the REL format, with the bitmap attached to indicate items (call, jp, etc) to be patched.
Re Banking.
Two options below. Please comment or suggest alternatives.
1) Extend z80asm to handle 24- or 32-bit addresses, where the lower 16 bits are the address seen by the CPU, and the upper 8 or 16 bits are the bank id (platform dependent, e.g. value to be written to the bank register). CALL, JP, ... need to accept the 32-bit address and ignore the upper 16 bits. For the Spectrum 128K (which I know better than the GameBoy), one could write
section xxx1 ; name is not relevant
org $00C000 ; select page 0 at address $C000
public func1
func1: ...
section xxx2
org $01C000 ; select page 1 at address $C000
extern func1
...
call banked_call ; platform-dependent function that switches banks and calls the function
defp func1 ; patch in a 24-bit address (Spectrum 128k); use defq for a 32-bit address
This solution is easy to implement in z80asm; all the infrastructure is in place, we just need to handle the 32-bit addresses gracefully.
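The address layout in option 1 can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes defp stores its 24-bit value little-endian like the other z80asm data directives; the banked_call address used in the usage note is made up.

```python
def split_address(addr32: int):
    """Split a wide address into (bank id, 16-bit CPU address)."""
    return (addr32 >> 16) & 0xFFFF, addr32 & 0xFFFF


def banked_call_bytes(banked_call_addr: int, target: int) -> bytes:
    """Bytes for `call banked_call` followed by `defp target`."""
    return bytes([
        0xCD,                            # call nn opcode
        banked_call_addr & 0xFF,         # trampoline address, low byte
        (banked_call_addr >> 8) & 0xFF,  # trampoline address, high byte
        target & 0xFF,                   # defp: CPU address, low byte
        (target >> 8) & 0xFF,            # defp: CPU address, high byte
        (target >> 16) & 0xFF,           # defp: bank id
    ])
```

For func1 in the listing above, split_address(0x00C000) gives bank 0 at $C000, and a function placed in the second section would carry bank 1 in the upper byte.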
2) Let the linker automatically resolve banked calls.
Same as above, use the upper 16 bits of a 32-bit address as the bank id. Create new opcodes for banked calls that reserve space in the object code for the call to the banked_call function (platform dependent function, part of library) and the 24- or 32-bit address (call24, call32).
At link time, include either the call to banked_call and a defp/defq with the called address, or, if the target is in the same bank, a regular call followed by 3 or 4 nop.
section xxx1 ; name is not relevant
org $00C000 ; select page 0 at address $C000
public func1
func1: ...
section xxx2
org $01C000 ; select page 1 at address $C000
extern func1
func2: ...
...
call24 func1 ; assembled & linked as: call banked_call : defp func1
call24 func2 ; assembled & linked as: call func2 : nop : nop : nop
This solution is a bit more complex, but gives additional flexibility in arranging the code in banks.
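The link-time decision in option 2 could look roughly like the Python sketch below. link_call24 is a hypothetical name, not a z80asm function, and the little-endian defp layout is an assumption. Both forms occupy six bytes, so the choice does not move any other symbol.

```python
def link_call24(current_bank: int, target: int, banked_call_addr: int) -> bytes:
    """Emit the six bytes reserved for `call24 target`.

    target is a 24-bit address: bank id in the top byte, CPU address below.
    """
    bank, addr = (target >> 16) & 0xFF, target & 0xFFFF
    if bank == current_bank:
        # Same bank: plain call followed by three nops as padding.
        return bytes([0xCD, addr & 0xFF, addr >> 8, 0x00, 0x00, 0x00])
    # Different bank: call the trampoline, then the 24-bit target (defp).
    return bytes([
        0xCD, banked_call_addr & 0xFF, (banked_call_addr >> 8) & 0xFF,
        addr & 0xFF, addr >> 8, bank,
    ])
```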
I quite like option 1 since it offers the compiler (or assembler author) more control as to how to invoke a function - I do suspect there may well be many ways of actually invoking the trampoline (for example, going via a rst is an obvious option) so the pseudo op-code doesn't feel quite right.
The automatic conversion in this case:
call24 func2 ; assembled & linked as: call func2 : nop : nop : nop
would be incorrect if func2 was a C function with parameters (the parameters wouldn't be at the expected stack offset).
Option 1 is implemented in the z80asm_32bit_addresses branch. Please let me know of any issues.
Thank you Paulo, I'll give it a try this evening I hope.
The z80asm changes work for my requirements - I've successfully had a banked call execute and return a value.
I don't think losing the range checking is too much bother so please feel free to merge.
| It looks like sdcc supports #pragma bank NN which sets the code segment to _CODE_NN
That’s correct. -bo also generates _CODE_N and -ba generates _DATA_N (SRAM banks).
Inside the linker _CODE_N becomes 0xN4000 and _DATA_N becomes 0xNA000, which get mapped to the real ROM addresses when the IHX gets created.
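A minimal Python sketch of that mapping for the code banks, assuming the usual Game Boy layout of 16 KiB ROM banks switched in at 0x4000-0x7FFF and stored consecutively in the ROM file. The function name rom_offset is hypothetical; this is an illustration of the address arithmetic, not SDCC's actual conversion code.

```python
ROM_BANK_SIZE = 0x4000  # assumed: 16 KiB switchable ROM banks


def rom_offset(linker_addr: int) -> int:
    """Map a linker address of the form 0xN4000+offset to a ROM file offset."""
    bank = linker_addr >> 16        # N: the bank number in the upper bits
    cpu_addr = linker_addr & 0xFFFF  # address as seen by the CPU
    if not 0x4000 <= cpu_addr < 0x8000:
        raise ValueError("not inside the switchable ROM window")
    return bank * ROM_BANK_SIZE + (cpu_addr - 0x4000)
```

Under these assumptions, bank 2 at linker address 0x24000 lands at file offset 0x8000.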
Adding bank and address after the call became --legacy-banking in SDCC trunk.
EDIT:
| Fixup return values to match SDCC abi
What does that mean? gbdk-n followed the return convention (e, de, hlde) that SDCC uses on gbz80.
| Fixup return values to match SDCC abi
| What does that mean? gbdk-n followed the return in e, de, hlde SDCC uses on gbz80
The z88dk libraries and the z80 targets use l, hl, dehl, so for the libraries to work with zsdcc (gbz80), 8/16-bit functions have the return value in both de and hl.
Does that also mean that you have __z88dk_fastcall on gbz80 now?
We haven’t changed anything in sdcc regarding gbz80. However, sccz80 supports fastcall and callee functions for gbz80.
Well, if __z88dk_fastcall was to be implemented in sdcc, it should be compatible enough to call functions compiled with z88dk.
I was nearly done with implementing __z88dk_fastcall for gbz80 so that parameters are put into the registers used for return values. But I’ll try to implement a different kind of fastcall then; l for 8-bit parameters and return values is quite inconvenient if it’s paired with bigger functions, which have variables on the stack.
Well, back to this.
Last week I finally understood that z88dk has its own compiler, assembler etc. besides its patched sdcc.
Loading values into four registers for an 8-bit return value is indeed quite undesirable.
I suspect __smallc is solely for SCCZ80 compatibility?
I could try to implement __smallc with return registers hl, hlde how SCCZ80 generates them. (Even though __smallc gets accepted, it does not push 8b as 16b currently)
And __z88dk_fastcall using return registers as parameters:
- __z88dk_fastcall using e, de, hlde
- __z88dk_fastcall __smallc using hl, dehl
I'm all over the place at the moment, working on far too many things at the same time, so I've not had a chance to fix the other issue - apologies.
Yes, __smallc is purely for sccz80 compatibility (likewise sccz80 has __z88dk_sdccdecl for sdcc compatibility).
There are two cases to consider for mixing/matching: libraries and user code. Although it's theoretically possible to mix-and-match compilers for user code in classic, it's not particularly well tested and there are caveats (there's a wiki page somewhere but I can't find it at the moment).
Getting library interop working is important though. To get sdcc to work together with the libraries I had to make the following modifications:
As an explanatory note, for library routines, sdcc enters via the labels _strlen and sccz80 via strlen, which does allow the library to handle the register requirements, but fixing everything up is extremely tedious, so the fewer times we need to do that the better.
My feeling is that the priority order is:
1) Fixing up __smallc with return registers in hl, dehl - that will allow the library routines that return a long value to work correctly with sdcc without needing any workarounds and allows the long functions to work.
2) The input registers for __z88dk_fastcall. For library work (which is in asm) the input registers could be worked around with an ex de, hl equivalent, which isn't particularly onerous/expensive.
3) The promotion from char to int on library functions is suboptimal but is an easy workaround, so comes last.
| I'm all over the place at the moment, working on far too many things at the same time, so I've not had a chance to fix the other issue - apologies.
No problem, it's not urgent, I was just playing around a bit with it and noticed that stuff.
| The libraries never take a char parameter - everything got promoted up to an int
Is that a general rule for sdcc or for gbz80 specifically?
3 is part of 1: the sdcc user guide explicitly says that __smallc is left to right and that 1 byte arguments are passed as 2 bytes, with the value in the lower byte. Though it does not say anything about the return value. Do they never return <2B?
Do I have to care about the upper byte of chars or can I just push trash into them?
The equivalent to ex de, hl is
ld a, d
ld d, h
ld h, a
ld a, e
ld e, l
ld l, a
which are 6 bytes and 24 cycles wasted. Or if you go for size and do
push de
push hl
pop de
pop hl
it would be 4 bytes and 56 cycles wasted (two pushes at 16 cycles plus two pops at 12). Fetching a 16b argument from stack is just 5b and 32c: (+1b 16c for the push)
ldhl sp, #2
ld a, (hl+)
ld h, (hl)
ld l, a
My main interest is indeed to have an e, de, hlde __z88dk_fastcall, but I also want z88dk to work somehow.
And this is probably a bug https://github.com/z88dk/z88dk/blob/bd1442a514df74e946d87227d90ac8d0d5616dfa/libsrc/target/gb/gbdk/mode.asm#L26-L31
| The libraries never take a char parameter - everything got promoted up to an int
| Is that a general rule for sdcc or for gbz80 specifically?
It's not desirable but came out of necessity for the interop - the z80 1b pushing was only fixed a couple of years ago.
| 3 is part of 1, sdcc user guide explicitly says that __smallc is left to right and that 1 byte arguments are passed as 2 bytes, with the value in the lower byte. Though it does not say anything about the return value. Do they never return <2B?
I suspect that everything in the libraries is rounded up to be 2b return value (of which only 1b is of significance).
| The equivalent to ex de, hl is...
Yes, it's not pretty; for single parameter entry it's just 3 bytes since we just need to do ld hl, de. I was just mentioning it since the solution can be staged and can get value without having to do everything all at once.
| And this is probably a bug
Oh yes, thank you.
It looks like __smallc already successfully pushes 1B values as 2B, I just did not recognize them.
It does
dec sp
ld a, #0x04
push af
inc sp
which is a weird way of doing
ld e, #0x04
push de
So it really only needs different return registers.
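To check that the two push sequences really leave the same thing on the stack, here is a small Python model. Memory is a dictionary and the function names are made up for illustration; the register values passed in are arbitrary. In both cases the byte at SP is the value and the byte above it is whatever happened to be there, which matches the "value in the lower byte" rule.

```python
def push_byte_via_af(mem: dict, sp: int, a: int, f: int) -> int:
    """Model of: dec sp / ld a,#val / push af / inc sp."""
    sp -= 1          # dec sp: reserve the (unwritten) high byte
    sp -= 1          # push af stores the high register (A) first ...
    mem[sp] = a
    sp -= 1          # ... then the low register (F)
    mem[sp] = f
    sp += 1          # inc sp: discard F, leaving A as the low byte
    return sp


def push_byte_via_de(mem: dict, sp: int, d: int, e: int) -> int:
    """Model of: ld e,#val / push de."""
    sp -= 1
    mem[sp] = d      # high byte: whatever D held
    sp -= 1
    mem[sp] = e      # low byte: the value
    return sp
```

The only visible difference in the model is that the first sequence also writes F one byte below the final stack pointer.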
Implementing fastcall would be trivial; the return would probably only need to care about __smallc.
It's not designed for changing registers of calling conventions dynamically, but I can maybe treat them as completely new calling conventions (smallc_return, smallc_fastcall).
I've done as much as I'm ever going to do. The remainder isn't important so closing.
The following need to be completed for a full target:
- joystick() maps into GBDK joypad code
- CRT_FONT
- ioctl()
The following should be done: