llvm-mos / llvm-mos-sdk

SDK for developing with the llvm-mos compiler
https://www.llvm-mos.org

Add safe bus conflict handling. #194

Closed asiekierka closed 11 months ago

asiekierka commented 12 months ago

Fixes #192, or "How I Learned To Stop Worrying And Love The Linker".

Also adds SDK-side get_chr_bank/set_chr_bank helpers to CNROM, now that this is actually possible to do safely; as well as an UNROM 512 macro to enable a "self-flashing" board, now that this is no longer the default.

(Other INES macros will be added via #193)

I'm giving this one a longer write-up because I think the level of hacks in the PR necessitates an explanation.

The Problem

The NES features mapper bus conflicts. Simpler mappers, based around discrete logic chips, do not prevent the ROM chip from being enabled when the CPU is writing to the cartridge's data bus - this creates a conflict, as both the CPU and ROM are outputting values on the same bus at the same time.

The correct way to work around this is to ensure that the value output by the CPU is the same as the value output by the ROM. This is typically done by creating a byte array A where, for any given value N, A[N] == N; then, whenever one wants to write N to ROM, the write goes to the address of A[N], so both sides drive the same byte.
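To make the trick concrete, here is a minimal host-side sketch in C. It assumes that a conflicting write settles on the bitwise AND of the two driven values, which is a common simplification rather than a guarantee for every board. The names `identity_table`, `conflicting_write`, `safe_write` and `check_all_values` are illustrative, not SDK symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Identity table: after init, identity_table[n] == n for n in 0..255.
 * On a cartridge this would be constant data in ROM; it is filled at
 * run time here only to keep the sketch short. */
static uint8_t identity_table[256];

static void init_identity_table(void) {
    for (int n = 0; n < 256; ++n) {
        identity_table[n] = (uint8_t)n;
    }
}

/* ASSUMPTION: the bus settles on the bitwise AND of both drivers. */
static uint8_t conflicting_write(uint8_t cpu_value, uint8_t rom_value) {
    return (uint8_t)(cpu_value & rom_value);
}

/* Write n "to" the table entry that already holds n. */
static uint8_t safe_write(uint8_t n) {
    return conflicting_write(n, identity_table[n]);
}

/* Every value survives the write when it goes through the table. */
static void check_all_values(void) {
    for (int n = 0; n < 256; ++n) {
        assert(safe_write((uint8_t)n) == n);
    }
}
```

Under this model, a write of 0x0F over a ROM byte holding 0x03 would latch 0x03, while the same write routed through the table latches 0x0F.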

However, this byte array would normally take up 256 bytes of the 16384 bytes typically present in the "fixed" ROM - this level of wastefulness is unacceptable. Many mappers only require a far smaller table; for example, an UNROM cartridge typically only has 16 banks, so the only values one would need to write to the ROM bus will be between 0 and 15. We'd like the LLVM-MOS SDK to be able to dynamically allocate, at link time, the right-sized array - of M bytes - based on the INES header configuration (the bank sizes, etc).

The Answer

Naturally, it'd be nice to have a unified solution for all the affected mappers. By sheer coincidence, all the discrete mappers with bus conflicts which llvm-mos-sdk wishes to target have just one writable memory-mapped port, which is always reachable from the "fixed" ROM space. Nice! As such, a rompoke.h header is defined with the function void rom_poke_safe(char value);, which writes value to the address of A[value], relying on A[N] == N.
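A rough sketch of what such a function could look like, written so it can be checked on a host. Only the signature comes from the text above; the body, the 16-entry `poke_table`, and the `bus_store` recorder are assumptions made for illustration and are not the SDK's implementation.

```c
#include <assert.h>

/* Hypothetical 16-entry table, sized for a 16-bank UNROM-style board. */
static const unsigned char poke_table[16] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

/* Records the last store so the sketch can be checked on a host.
 * On hardware this would be a single volatile store into ROM space. */
static const unsigned char *last_address;
static unsigned char last_value;

static void bus_store(const unsigned char *address, unsigned char value) {
    last_address = address;
    last_value = value;
}

/* Same shape as the declaration in rompoke.h; the body is an assumption. */
static void rom_poke_safe_sketch(char value) {
    unsigned char n = (unsigned char)value;
    assert(n < sizeof poke_table); /* values beyond the table are not safe */
    bus_store(&poke_table[n], n);
}
```

The property that matters is that the byte already stored at the target address equals the byte being written, so the CPU and the ROM never disagree.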

Next, we need to create the dynamically allocated table.

The first idea was to generate one table, with a distinct section and symbol, per table size from 0 to 256 inclusive (mappers with bus conflicts often have non-conflicting variants, for which we can then avoid paying the table's size cost entirely), like so:

__rom_poke_table = big... ? __rom_poke_table_256 : (less_big ? __rom_poke_table_128 : ...)

Unfortunately, LLD will retain in ROM all symbols used in such an equation - which is worse than just having one table in the first place.

The Terrible, Terrible Answer

Something we can dynamically control, though, is section sizes; we can also control the byte value used to fill such a section.

So, technically speaking, one could define 256 sections - one for each byte of A - and simply set the first M sections to a size of 1 byte, with the remaining 256 - M sections set to a size of 0 bytes.

But who would ever do such a thing? Anyhow, here's the pull request.
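For the curious, the 256-section layout can be modelled in a few lines of C. This is only a simulation of what the linker ends up emitting, assuming sections are laid out in index order and section i uses fill byte i; `section_size` and `layout_sections` are hypothetical names, not part of the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of section i when the table should hold m bytes: 1 or 0. */
static size_t section_size(unsigned i, size_t m) {
    return (i < m) ? 1u : 0u;
}

/* Concatenate all 256 sections in order, as a linker would lay them out.
 * Section i is filled with byte value i. Returns the total size. */
static size_t layout_sections(size_t m, uint8_t out[256]) {
    size_t offset = 0;
    assert(m <= 256);
    for (unsigned i = 0; i < 256; ++i) {
        for (size_t k = 0; k < section_size(i, m); ++k) {
            out[offset++] = (uint8_t)i; /* fill byte of section i */
        }
    }
    return offset;
}
```

With m set to 16 the result is a 16-byte table holding 0 through 15, and with m set to 0 nothing is emitted at all, which is the case for the non-conflicting mapper variants.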

The Less Terrible But Actually No Good Answer

That's when I found that GNU's linker, which LLD claims drop-in compatibility with, states the following in its manual:

If the fill expression is a simple hex number, ie. a string of hex digit starting with ‘0x’ and without a trailing ‘k’ or ‘M’, then an arbitrarily long sequence of hex digits can be used to specify the fill pattern; Leading zeros become part of the pattern too.

This isn't that well-known - older versions of ld were limited to just two- or four-byte fill patterns.

Either way, that's it! I can have just one section which holds an arbitrary number of bytes based on a fill pattern, then set the section's size using . = . + __rom_poke_table_size;! This is great! It will certainly-

ld.lld: error: [...]/mos-platform/nes/lib/rompoke-table.ld:4: malformed number: 0x000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff

... If this works on the GNU linker (I checked with binutils 2.39), does this technically constitute an LLVM bug?
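For reference, the result the fill-pattern approach was aiming for can be modelled on a host. The sketch below assumes the pattern repeats from the start of the filled region; `fill_section` and `make_identity_pattern` are illustrative names and do not correspond to any linker API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `size` bytes by repeating a `pattern_len`-byte pattern, modelling
 * a section grown by a fixed size under a multi-byte fill expression. */
static void fill_section(uint8_t *out, size_t size,
                         const uint8_t *pattern, size_t pattern_len) {
    assert(pattern_len > 0);
    for (size_t i = 0; i < size; ++i) {
        out[i] = pattern[i % pattern_len];
    }
}

/* Build the 256-byte pattern 00 01 02 ... ff seen in the error message. */
static void make_identity_pattern(uint8_t pattern[256]) {
    for (unsigned i = 0; i < 256; ++i) {
        pattern[i] = (uint8_t)i;
    }
}
```

Truncating the filled region to the table size keeps only the first entries of the pattern, which is exactly the right-sized identity table.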

The Summary

The PR is a terrible hack, but it does get the job done and it seemingly passes all the tests. Please be merciful.