mgba-emu / mgba

mGBA Game Boy Advance Emulator
https://mgba.io/
Mozilla Public License 2.0

Writes to PalRAM that cross the HBlank boundary behave inconsistently with hardware #1996

Open lifning opened 3 years ago

lifning commented 3 years ago

build: mGBA-Qt 0.9.6724-17253f576 os: Void Linux ppc64le cpu: POWER9 (32) @ 3.800GHz gpu: AMD ATI Radeon RX 5700

So I'm working on a demo that, among other things, copies precomputed BG palettes from ROM a number of cycles after the VCount interrupt, in order to simulate some relatively simple faux-alpha and overlay/hardlight/multiply type color blending effects.

This works as intended on everything I have that can be called "a real GBA" in some sense (GBA, GBASP, GCN GBP, DSLite in GBA mode; X-ROM 512M, M3SD, and EZF3in1; open_agb_firm on a New3DSXL), but fails in similar ways on the emulators I've tried so far (primarily mGBA, as it's the one I usually test with, but also somewhat surprisingly higan, and somewhat unsurprisingly VBA-M). The symptom is some terrible full-screen flickering; I'm assuming either subsequent VCount/Timer interrupts are being missed, or the PPU is locking out further writes, or... well, I'm not sure what the whole possibility-space looks like.

Here is a recording of it working as intended on hardware: https://cybre.space/@lifning/105461789035311128

Here's the built ROM itself: suzanne_ve.gba.gz (If the source code would be useful for tracking down the issue, let me know and I'll give you access to the repo - it's currently private as I'm figuring out what the heck to do about licensing. Edit: here it is)

Here's what it looks like as recorded by mGBA's own A/V recording feature. :warning: Photosensitivity warning! :warning: These are very blinky! bugged_no_textbox.mp4 | bugged_textbox.mp4

Here's the napkin-math that got me into this mess to begin with:

      |<40>|<-- 160px -->|<40>|<-68->|
     _ ______________________________
    ^ |hud |gameplay area|hud |hblank|
    | |<------200px----->|<----148px-@  {200*4 = 800cyc after VCount hits to start copy}
    | @--->|             |    |      |  {148*4 = 592cyc to copy blends to PalRAM,
160px |    |             |<----------@   (40*2)*4 = 320cyc of which have an addn'l waitstate}
    | @---------------456px----------@
    | @--->|   textbox   |    |      |  {456*4 = 1,824cyc to copy textbox to PalRAM,
    | |    |=============|    |      |   (240+40*2)*4 = 1,280cyc of which have addn'l waitstate}
    v_|____|_____________|____| __ __|
    ^ |                              |  {copy y=0 blend into PalRAM and second palette to IWRAM
68px| | 83,776cyc of vblank for game |   before end of vblank}
    v_|______________________________|
  • on VCount interrupt (at x=0), set a timer IRQ to overflow in 800-n cycles, where n is the number of cycles it takes to handle the VCount interrupt and set the timer registers
  • on Timer interrupt (at x=200ish), start copying blend colors from ROM to PalRAM.
    • 320c of 5c copies (amortized 2c/word read from ROM + 2c/word write to PalRAM + 1 waitstate)
    • 320/5 = 64 words = 128 colors (minus overhead < 8 palette lines)
    • 272c of 4c copies (amortized 2c/word read from ROM + 2c/word write to PalRAM)
    • 272/4 = 68 words = 136 colors (minus overhead < 8.5 palette lines)
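
For anyone who wants to re-run the budget, here is a minimal sketch that recomputes it from the dot counts in the diagram. The constants (4 CPU cycles per dot, 308 dots per line, 5c and 4c per-word copy costs) are taken from the post as-is and are not independently measured. The function names `copy_budget` and `timer_reload` are hypothetical, and the reload helper assumes a 16-bit up-counting timer running at one tick per CPU cycle.

```python
# Re-derivation of the napkin math. Constants come from the post;
# all names here are hypothetical helpers, not part of any library.

CYCLES_PER_DOT = 4
DOTS_PER_LINE = 40 + 160 + 40 + 68   # hud + gameplay + hud + hblank = 308
VBLANK_LINES = 68


def copy_budget(window_dots, contended_dots, slow_cost=5, fast_cost=4):
    """Return (words, colors) that fit in a window of `window_dots` dots.

    `contended_dots` is the part of the window where each word costs
    `slow_cost` cycles (the post's extra PalRAM waitstate); the rest of
    the window costs `fast_cost` cycles per word. One 32-bit word holds
    two 16-bit colors. Loop and interrupt overhead is ignored.
    """
    total = window_dots * CYCLES_PER_DOT
    slow = contended_dots * CYCLES_PER_DOT
    fast = total - slow
    words = slow // slow_cost + fast // fast_cost
    return words, words * 2


def timer_reload(delay_cycles, handler_overhead):
    """Reload value so the timer overflows after delay - overhead cycles
    (the post's "800-n"), assuming a 16-bit timer at 1 tick per cycle."""
    ticks = delay_cycles - handler_overhead
    if not 0 < ticks <= 0x10000:
        raise ValueError("delay does not fit a 16-bit timer")
    return 0x10000 - ticks


blend_words, blend_colors = copy_budget(148, 40 * 2)
vblank_cycles = VBLANK_LINES * DOTS_PER_LINE * CYCLES_PER_DOT

print(blend_words, blend_colors)      # words and colors for the blend copy
print(vblank_cycles)                  # cycles available during vblank
print(hex(timer_reload(800, 60)))     # n = 60 is an arbitrary example value
```

With these inputs the blend window comes out at 132 words (264 colors) before overhead, which is consistent with the "200-ish colors" estimate, and the vblank figure matches the 83,776 cycles in the diagram.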

tl;dr I believe this should be enough to CPU-copy (ldmia/stmia) 200-ish colors into BG Pal with a WAITCNT allowing for 1-cycle sequential halfword reads from ROM. But it seems in emulators, when my BG palette is larger than 4 lines / 64 colors / 32 words, everything becomes that flickery nightmare.

If it's an architectural limitation of emulating the GBA at reasonable speeds, then any advice on how I could achieve this same effect in a way that doesn't make current emulators explode would also be welcome :)

Cheers

endrift commented 3 years ago

Well...there's already basic BG blending. But yeah the reason this isn't supported is because I draw the scanline all at once and I don't support changes made in the middle of the scanline. Until recently I thought these changes would be ignored anyway, but I keep finding more and more counter-examples (I also made a demo that exploited one of these counter-examples a few years ago, but never really released it). I've been planning the ability to split scanlines like I do in the GB renderer, but I haven't gotten to it yet.
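
To illustrate the difference between drawing a scanline all at once and splitting it at mid-line writes, here is a toy model. It is not mGBA's renderer: the function names are hypothetical, a "write" is just a `(dot, palette_index, color)` tuple, and the whole-line version assumes the line is drawn from the palette as it stood at the start of the line (the thread does not say which moment is actually sampled).

```python
# Toy model only. A scanline is a list of palette indices, one per dot;
# `writes` is a list of (dot, palette_index, color) palette updates.

def render_whole_line(indices, palette, writes):
    """Draw the entire line from a single palette snapshot.

    Mid-line writes are applied afterwards, so they only become visible
    on later lines.
    """
    snapshot = dict(palette)
    out = [snapshot[i] for i in indices]
    for _, index, color in writes:
        palette[index] = color
    return out


def render_split_line(indices, palette, writes):
    """Draw dot by dot, applying each write when its dot is reached."""
    out = []
    pending = sorted(writes, key=lambda w: w[0])  # stable: keeps write order
    for dot, idx in enumerate(indices):
        while pending and pending[0][0] <= dot:
            _, index, color = pending.pop(0)
            palette[index] = color
        out.append(palette[idx])
    for _, index, color in pending:   # writes that land after the last dot
        palette[index] = color
    return out


line = [0] * 8
writes = [(4, 0, "blue")]
print(render_whole_line(line, {0: "red"}, writes))
print(render_split_line(line, {0: "red"}, writes))
```

In the whole-line model the write at dot 4 has no effect on the current line, while the split model switches color partway through. A demo that rewrites palette entries during the visible part of a line depends on the second behaviour.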

Also I'm amazed anyone is running mGBA on ppc64el. I only recently got that working at all; it had some weird issues with a handful of things, and I had to ask a friend to set up a VM on her POWER7 for me to test with.

lifning commented 3 years ago

It was broken earlier this year, but it's been working ever since 0a06af1aa66ad999c2e022b86768fdb66769f58a :)

We've also got a 32-bit little-endian PPC userland now, which is a very cursed new ABI being worked on by some talented folks in the voidlinux-ppc community - happy to report that just now I compiled mGBA-SDL in the ppcle chroot and it seems to run Sonic Advance 2 just fine ;)

endrift commented 3 years ago

There were some other bugs with (rarely used) functions that I fixed later on. The biggest notable impact was that it would never properly detect the Game Boy Player logo so GBP-based rumble would never work.

benderscruffy commented 2 years ago

@lifning can you release the source code please