Closed DaveTCode closed 2 years ago
IRQ line going high IE=VBL,HBL,VCount IF=VBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
IRQ line going high IE=VBL,HBL,VCount IF=HBL
Those are the interrupt logs from the section where the failure happens.
VBL, HBL and VCount IRQs are all enabled, and the specific interrupt which is "last" is just an HBL one, same as all the others, so it is presumably going through much the same code.
We can work out the exact scanline this is happening on by just counting things. There were 14 HBL interrupts after the VBL one. VBlank triggers on line 160 and HBlank happens later on the same line. There are 68 VBlank lines, so this is just "somewhere in VBlank", not line 0 or anything interesting like that.
Notably, disabling HBlank IRQ firing (by commenting out the relevant PPU line) makes this issue go away, although the ship then goes unbearably slowly.
Next question is: "what is hblank DMA actually doing?" Might need to do some disassembling to find out!
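The counting argument can be sketched like this. It assumes the first HBL interrupt after the VBL one belongs to line 160 itself and that every later scanline contributes exactly one HBL interrupt; the names are made up for illustration and are not from the emulator.

```python
VISIBLE_LINES = 160                          # VBlank IRQ is raised on line 160
VBLANK_LINES = 68                            # lines 160..227
TOTAL_LINES = VISIBLE_LINES + VBLANK_LINES   # 228 scanlines per frame


def scanline_of_nth_hblank_after_vblank(n: int) -> int:
    """Scanline of the n-th HBlank IRQ seen after the VBlank IRQ (n >= 1).

    Assumption: HBlank number 1 happens on line 160, the same line
    on which VBlank was raised.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (VISIBLE_LINES + n - 1) % TOTAL_LINES


line = scanline_of_nth_hblank_after_vblank(14)
in_vblank = VISIBLE_LINES <= line < TOTAL_LINES
print(line, in_vblank)  # prints: 173 True
```

Under those assumptions the last logged HBL lands on line 173, which is well inside VBlank.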
Asking the question "when is R14 set to the RAM value that causes the loop" gives me:
r0:6000003F r1:00000001 r2:00012007 r3:04000200
r4:03004410 r5:03000070 r6:00000009 r7:00000000
r8:03000028 r9:00000000 r10:00000000 r11:00000000
r12:0805DCFA r13:03007F88 r14:03000904 r15:00000018
cpsr: 00000092 -----I- Arm Irq
spsr: 00000092 -----I- Arm Irq
Cycle: 129600697
Pipeline not refilled
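For reference, the cpsr value in that dump can be decoded with a few lines of Python. This is only a sketch using the standard ARM7TDMI CPSR layout (mode in bits 0-4, T in bit 5, F in bit 6, I in bit 7); the helper name is invented here. Note also that r15 is 0x18, the ARM IRQ exception vector, which fits a dump taken right at IRQ entry.

```python
# ARM7TDMI processor modes, keyed by CPSR bits 0-4.
MODES = {
    0x10: "User",
    0x11: "Fiq",
    0x12: "Irq",
    0x13: "Supervisor",
    0x17: "Abort",
    0x1B: "Undefined",
    0x1F: "System",
}


def decode_cpsr(cpsr: int) -> dict:
    """Split a CPSR value into the fields relevant to this bug."""
    return {
        "mode": MODES.get(cpsr & 0x1F, "Unknown"),
        "thumb": bool(cpsr & (1 << 5)),         # T bit: 0 = ARM state
        "fiq_disabled": bool(cpsr & (1 << 6)),  # F bit
        "irq_disabled": bool(cpsr & (1 << 7)),  # I bit
    }


print(decode_cpsr(0x00000092))
```

For 0x92 this gives IRQ mode, ARM state and the I bit set, matching the "-----I- Arm Irq" annotation in the dump.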
Which is really weird because I think it means we have reentrant IRQs, except that the IRQ disable flag should be set for the duration of IRQ handling in Gradius (as far as I can tell).
Ooh, this might be a genuinely interesting bug. I've got a breakpoint on my "service interrupt" routine which fires when the return address is 0x0300_0904 (that dodgy R14). When that fires, it's clear that it comes from the IRQ servicing routine and that, on the cycle it fires, the CPSR IRQ disable bit is set.
I was trying to nail down the implementation of the IRQ synchronizer in #55, and as part of that there's a 3 cycle IRQ sync delay. In my implementation the IRQ is only cancelled if the IRQ disable bit is set during the first 2 of those cycles.
Testing what happens if we allow the IRQ disable bit to cancel the IRQ all the way through to the last cycle and TADA, Gradius gets in game.
Timer Countup and irq tests both still pass the same number of cases (#55) after that change, which is not surprising. However, Fleroviux's irq delay test has now stopped showing values identical to hardware.
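A toy model of the difference between the two behaviours, purely as a sketch: it assumes the CPSR I bit is sampled once on each of the 3 sync cycles, and `cancel_window` is a made-up parameter for how many of the leading cycles are allowed to cancel the IRQ. It says nothing about what real hardware latches.

```python
SYNC_DELAY_CYCLES = 3  # delay described above; hardware details unconfirmed


def irq_taken(i_bit_per_cycle: list, cancel_window: int) -> bool:
    """Return True if a pending IRQ survives the sync delay and is serviced.

    i_bit_per_cycle: I bit (IRQ disable) sampled on each delay cycle.
    cancel_window:   number of leading cycles in which a set I bit
                     cancels the IRQ (hypothetical knob for this sketch).
    """
    if len(i_bit_per_cycle) != SYNC_DELAY_CYCLES:
        raise ValueError("expected one sample per sync cycle")
    return not any(i_bit_per_cycle[:cancel_window])


# The I bit becomes set only on the final sync cycle.
samples = [False, False, True]

old_behaviour = irq_taken(samples, cancel_window=2)  # last cycle ignored
new_behaviour = irq_taken(samples, cancel_window=3)  # last cycle checked
print(old_behaviour, new_behaviour)  # prints: True False
```

With a window of 2 the IRQ is serviced even though the I bit is already set, which is the reentrant case seen in the register dump; with a window of 3 it is cancelled.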
I still don't think I have a really good handle on what happens during the sync process and what latching occurs for various IRQ registers so I'm not exactly sure what to make of this.
This is pretty much the only image I can find describing an IRQ synchronizer (and it's for the Arm Cortex).
Dumping out the core after 1500 frames I see it in a tight loop in IRQ mode running code in 0x0300_xxxx region doing two lines:
So we're stuck in a daft loop with IRQs disabled, storing r1 (1) into 04000208 (IME) to enable interrupts.
At this point PPU interrupts are enabled and all are active for some reason, presumably because the code to clear them hasn't run.
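As a reminder of how those flags interact, here is a small sketch of the GBA interrupt registers: IE and IF use bit 0 for VBlank, bit 1 for HBlank and bit 2 for VCount, writing a 1 to an IF bit acknowledges (clears) it, and the IRQ line is only high when IME is set and some enabled flag is pending. The function names are invented for this example.

```python
VBLANK = 1 << 0
HBLANK = 1 << 1
VCOUNT = 1 << 2


def acknowledge(if_reg: int, written: int) -> int:
    """Model a write to IF: every 1 bit written clears that pending flag."""
    return if_reg & ~written & 0x3FFF  # IF has 14 interrupt bits


def irq_line(ime: int, ie: int, if_reg: int) -> bool:
    """IRQ line is high when IME bit 0 is set and an enabled IRQ is pending."""
    return bool(ime & 1) and (ie & if_reg) != 0


ie = VBLANK | HBLANK | VCOUNT
if_reg = VBLANK | HBLANK | VCOUNT       # all three pending, as in the stuck state

print(irq_line(1, ie, if_reg))          # line high while nothing is acknowledged
if_reg = acknowledge(if_reg, HBLANK)    # handler acknowledges HBlank only
print(bin(if_reg))
if_reg = acknowledge(if_reg, VBLANK | VCOUNT)
print(irq_line(1, ie, if_reg))          # nothing pending any more
```

If the handler never reaches its acknowledge write, all three flags stay set, which is consistent with what the dump shows.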
Presumably LR is wrong here. We'd expect that this code is part of the ISR and that it returns to BIOS with BX LR at the end of executing. It does the BX LR, but LR just points at an earlier instruction! So what splatted LR??