Closed asiekierka closed 1 year ago
I think this is a place where a liberal application of macro and linker shenanigans may help. For example, if the bank were statically known, `SET_CHR_BANK(7)` could expand to `__set_chr_bank_7()`, which could be implemented as `1: lda #7, sta 1b+1` using inline assembly, so that the store targets the `lda` operand byte, which already holds 7. You could also include up to 256 versions of a `set_chr_bank_<limit>(uint8_t n)` that used a table of size `limit`, then use a linker script to alias `set_chr_bank(n)` to the correct version based on the size symbols. Options along these lines would need further experimentation; this is just me spit-balling, so I don't know whether or not these would actually work.
How about this: one would define, in the `nes` target itself, a helper function `__rom_write_safe(uint8_t value);`, which is then aliased using the linker script to `__rom_write_safe_<limit>` with its corresponding table, and the table is placed in a dedicated section so that the linker can place it in the right area of the ROM's address space. It's essentially the second proposal - I like it, it seems reasonable and workable for all targets; and `__rom_write_safe` can then be aliased to a `__rom_write_unsafe` without a table for a mapper with no bus conflicts (as some, like UNROM-512, have both conflicting and non-bus-conflicting variants) to avoid the table entirely.
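A rough host-side sketch of that split follows, under the same assumed AND model of a bus conflict. Every name is hypothetical and written without leading underscores. A preprocessor switch stands in for the linker-script alias, and the limit of 16 is arbitrary; none of this is the SDK's actual implementation.

```c
#include <stdint.h>

/* Host-side model only: bank register and a flag for the board variant. */
static uint8_t mapper_latch;
static uint8_t mapper_has_bus_conflicts = 1;

/* Assumed model: with conflicts the mapper sees cpu_value AND the ROM byte. */
static void bus_write(const uint8_t *rom_byte, uint8_t cpu_value) {
    mapper_latch = mapper_has_bus_conflicts
                       ? (uint8_t)(cpu_value & *rom_byte)
                       : cpu_value;
}

/* Identity table for limit = 16: entry i holds i. On the target this would
   live in a dedicated section placed by the linker script. */
static const uint8_t rom_identity_16[16] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

/* Safe variant: write the value onto the ROM byte that equals it.
   The caller must keep value below the table size. */
static void rom_write_safe_16(uint8_t value) {
    bus_write(&rom_identity_16[value], value);
}

/* Unsafe variant: no table, writes over whatever ROM byte is handy. */
static const uint8_t arbitrary_rom_byte = 0x4C;
static void rom_write_unsafe(uint8_t value) {
    bus_write(&arbitrary_rom_byte, value);
}

/* Stand-in for the linker-script alias. */
#ifdef MAPPER_NO_BUS_CONFLICTS
#define rom_write_safe rom_write_unsafe
#else
#define rom_write_safe rom_write_safe_16
#endif
```

With conflicts modelled, `rom_write_safe(9)` latches 9, whereas `rom_write_unsafe(9)` latches `9 & 0x4C`, which is 8. On a board without conflicts the unsafe variant is sufficient and the table can be dropped.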
The only issue I have with this approach is that I'm not sure how it will interact with LTO - from my UNROM-512 experiments, it is rather beneficial for the optimizer to be able to reason about all the ROM writing down to the actual memory pokes.
Huh, color me surprised that all of the mappers on the wiki with bus conflicts have just one mutable ROM register; guess it comes with the territory. So yeah, having a common implementation in `nes.ld` would work on that front.
I did a quick walk of the linker code; it's clever enough to mark symbols used in the linker script as retained, and it may be technically possible for it to forward those assignments over to the LTO link (the linker script is fully parsed and all symbols resolved before LTO occurs), but right now it doesn't.
So, at least for now, if we wanted to avoid the function call penalty, we'd have to have different linker scripts to include for different sizes; these could INPUT() a different file into the LTO link with a different table definition. That's a bit worse, and I've generally tried to avoid it, but IIRC MMC3 is already like this, so it's not horrific.
The other thing we could do is decrease the JSR penalty; small routines like this really should be marked with a calling convention that preserves everything except AXY (see llvm-mos/llvm-mos#229). In that case, the overhead should be relatively minimal.
A more general option would be to get interprocedural register allocation working, but that's a much, much bigger project, since those code paths are not very well maintained in LLVM.
Ah, a better solution here might be to do the linker script redirection in terms of the ROM identity array rather than the functions that use it; there's nothing that the code generator would be able to do with the contents of the array, since they don't matter from a code generation standpoint; their values just control which values the bus is driven to.
Unfortunately, linker script redirection (of the `symbol = condition ? symbol1 : symbol2;` type) won't work - the linker won't GC any symbol involved in such an equation.
But I have another (much worse) idea...
Some mappers supported by llvm-mos (CNROM, some variants of UNROM, some variants of UNROM-512) feature bus conflicts:
`set_chr_bank` routine, instead expecting the user to provide their own with the following pattern for each bank used:
`$FF` ROM location every time, which often does - but is not guaranteed to - lead to writing the correct value.

The correct solution is to find a way to make the CNROM mapper's solution more generally applicable - but can this be done without a 256-byte table forced on every such mapper?
Ideas/suggestions welcome!