ekeeke / Genesis-Plus-GX

An enhanced port of Genesis Plus - accurate & portable Sega 8/16 bit emulator

[MCD] 2M graphics operation should not be able to write to Word RAM if the MCD does not have access #461

Closed DevsArchive closed 1 year ago

DevsArchive commented 1 year ago

It's been a while since I last tested this, but I believe if a graphics operation attempts to write to Word RAM in 2M mode, and it doesn't have access, it'll just wait until it has access to continue.

A few years ago, I wrote a small demo that erroneously gave the Main CPU Word RAM access after starting a graphics operation, and the result was said operation being really slow on hardware. The operation appeared to be allowed to continue, because I was also periodically checking if the operation was done via a command sent to the Sub CPU from the Main CPU, and the command handler I had at the time gave Sub CPU access to Word RAM every command.

ekeeke commented 1 year ago

Yes, this is not emulated because temporarily locking the 68k CPU is not easy to emulate with the current codebase, and it was not needed by any commercial game. It appears it is needed by one game though, for which there is an issue logged here, so I will eventually add it some day. I have some working code in my private development repository, but it still needs some work as it breaks other things later in this game (and maybe in other games).

I am closing this as it is already addressed by this issue (among many other things, this test ROM verifies access to word RAM is frozen on sub-cpu side if access is given to main-cpu)

ekeeke commented 1 year ago

Reopened, as this one is a separate issue and is more related to graphics operation access (not tested by the mcd verificator test ROM) than to Sub CPU access.

ekeeke commented 1 year ago

> A few years ago, I wrote a small demo that erroneously gave the Main CPU Word RAM access after starting a graphics operation, and the result was said operation being really slow on hardware.

Do you still have that demo? I have implemented some small changes to handle this and would like to verify that it works as expected.

DevsArchive commented 1 year ago

Here. Don't mind the cheesy text and all that stuff, it was a birthday gift for a friend from then. They've been okay with this being shared around.

I may as well show you the offending code, since I still have the source code on me. On the Sub CPU side, this was my command handler:

.WaitCommand:
    moveq   #0,d0
    move.b  GA_MAIN_FLAG.w,d0       ; Get command ID
    beq.s   .WaitCommand            ; Wait if the ID is 0

.WaitWRAM:
    btst    #1,GA_MEM_MODE+1.w      ; Wait for Word RAM access
    beq.s   .WaitWRAM

    move.b  #"B",GA_SUB_FLAG.w      ; Mark as busy

.WaitMain3:
    tst.b   GA_MAIN_FLAG.w          ; Is the Main CPU ready to send commands?
    bne.s   .WaitMain3          ; If not, branch

    add.w   d0,d0               ; Go to command
    add.w   d0,d0
    jsr     .Commands-4(pc,d0.w)

.SendWRAM2:                 ; Give the Main CPU Word RAM
    bset    #1,GA_MEM_MODE+1.w
    beq.s   .SendWRAM2

    move.b  #"R",GA_SUB_FLAG.w      ; Mark as ready

    bra.s   .WaitCommand            ; Loop

As you can see, it waits for Word RAM access, runs the command, then immediately gives Word RAM back to the Main CPU, which is not ideal for the graphics operation.

On the Main CPU side, the graphics operation is started and checked via this function:

StoneyBG:
    moveq   #5,d0
    bsr.w   SubCPUCmd
    tst.w   GA_STAT_0
    bne.w   .Skip
    tst.b   r_Stone_Art.w
    beq.s   .Do
    clr.b   r_Stone_Art.w
    dma68k  WORDRAM_2M+$20002, $20, $6400, VRAM

.Do:
    move.w  #2,GA_CMD_0
    move.w  #$10000/4,GA_CMD_2
    move.w  #256,GA_CMD_4
    move.w  #192,GA_CMD_6
    move.w  #$20000/4,GA_CMD_8
    moveq   #3,d0
    bsr.w   SubCPUCmd

    movea.l r_Script.w,a0
    tst.w   (a0)
    bpl.s   .NoReset
    lea     StoneyScr(pc),a0

.NoReset:
    move.w  #$30000/4,GA_CMD_0
    move.w  (a0)+,GA_CMD_2
    move.w  (a0)+,GA_CMD_4
    move.w  r_Angle.w,GA_CMD_6
    move.w  #0,GA_CMD_8
    move.w  #0,GA_CMD_A
    move.w  #224/2,GA_CMD_C
    move.w  #192/2,GA_CMD_E
    moveq   #4,d0
    bsr.w   SubCPUCmd

    move.l  a0,r_Script.w

    st      r_Stone_Art.w
    addq.w  #2,r_Angle.w

.Skip:
    rts

This is run every frame. Sub CPU command ID 5 checks if the graphics operation is done, and is what is really allowing the graphics operation to continue, albeit very slowly, instead of just straight up halting. The rest is just copying the rendered graphics and setting up the next operation.
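To illustrate why a brief per-frame window of Word RAM access lets the operation crawl forward instead of halting, here is a minimal C sketch. It is not code from the demo or from the emulator: the names (`gfx_op`, `gfx_step`, `frames_to_finish`), the slicing of a frame into equal time slices, and the per-slice pixel budget are all assumptions made for illustration, and real hardware timings are not modelled.

```c
#include <stdint.h>

/* Who currently owns Word RAM in 2M mode. */
enum wram_owner { WRAM_MAIN, WRAM_SUB };

/* Hypothetical graphics-operation state: pixels still to be written. */
struct gfx_op {
    uint32_t pixels_left;
};

/* Advance the operation by up to `budget` pixels.
   While the Main CPU owns Word RAM the operation stalls and makes
   no progress. Returns the number of pixels actually processed. */
static uint32_t gfx_step(struct gfx_op *op, enum wram_owner owner,
                         uint32_t budget)
{
    if (owner != WRAM_SUB || op->pixels_left == 0)
        return 0;
    uint32_t done = budget < op->pixels_left ? budget : op->pixels_left;
    op->pixels_left -= done;
    return done;
}

/* Count the frames needed to finish `pixels` when the Sub CPU side
   holds Word RAM for only the first `window` of `slices` time slices
   in each frame (assumed layout). Returns 0 if there is nothing to do
   or if the operation can never make progress. */
static uint32_t frames_to_finish(uint32_t pixels, uint32_t slices,
                                 uint32_t window, uint32_t budget_per_slice)
{
    struct gfx_op op = { pixels };
    uint32_t frames = 0;

    if (slices == 0 || window == 0 || budget_per_slice == 0)
        return 0;

    while (op.pixels_left > 0) {
        for (uint32_t s = 0; s < slices; s++) {
            enum wram_owner owner = (s < window) ? WRAM_SUB : WRAM_MAIN;
            gfx_step(&op, owner, budget_per_slice);
        }
        frames++;
    }
    return frames;
}
```

With arbitrary numbers, `frames_to_finish(49152, 100, 100, 100)` models uninterrupted access, while `frames_to_finish(49152, 100, 1, 100)` models access being granted for a single slice per frame, so the same operation takes roughly a hundred times as many frames.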

ekeeke commented 1 year ago

This is really curious, because the following code from above

.SendWRAM2:                 ; Give the Main CPU Word RAM
    bset    #1,GA_MEM_MODE+1.w
    beq.s   .SendWRAM2

actually does NOT return Word RAM to Main CPU

Indeed, this code sets bit 1, i.e. the DMNA bit of the memory mode register, which is not writable by the Sub CPU (if you want to return Word RAM to the Main CPU, you need to set bit 0, i.e. the RET bit).
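A small C sketch of the point above, modelling only the bit writability of the register's low byte as seen from the Sub CPU. The helper name `sub_bset` and the starting register value are assumptions for illustration. The sketch does not model the Word RAM swap itself or any other bits of the register.

```c
#include <stdint.h>

#define MEM_MODE_RET  0x01u  /* bit 0: set by the Sub CPU to return Word RAM */
#define MEM_MODE_DMNA 0x02u  /* bit 1: not writable from the Sub CPU side */

/* Hypothetical model of a Sub CPU `bset #bit` on the low byte of the
   memory mode register. Returns the resulting Z flag: 1 if the bit
   read as 0 before the write, 0 otherwise. */
static int sub_bset(uint8_t *reg, unsigned bit)
{
    uint8_t mask = (uint8_t)(1u << bit);
    int z = (*reg & mask) == 0;

    /* A write to DMNA from the Sub CPU side is dropped. */
    if (mask != MEM_MODE_DMNA)
        *reg |= mask;

    return z;
}
```

Starting from an assumed state where DMNA already reads 1 (which is what `.WaitWRAM` waits for), `sub_bset(&reg, 1)` leaves the register unchanged and returns Z = 0, so the `beq.s .SendWRAM2` loop falls through immediately without returning anything. `sub_bset(&reg, 0)` is the write that actually sets RET.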

This is confirmed by running your demo in emulator because I never see Word RAM being returned to Main CPU and graphics operation is running normally despite my changes to address this issue.

This means that the slowdown on real hardware is not caused by Word RAM being returned to Main CPU during graphics operation but by DMA accessing Word RAM while it is assigned to Sub CPU. I would have thought that in this case, reading Word RAM would return garbage (or open bus) but it seems that on real hardware, Main CPU side (which includes VDP DMA) can still access Word RAM when it is assigned to Sub CPU, and if Sub CPU side (which includes graphics operation) also tries to access Word RAM at the same time, it will be delayed. This is actually different (and more complex to emulate) from graphics operation (or Sub CPU) being halted when accessing word RAM while it is assigned to Main CPU.

DevsArchive commented 1 year ago

Ack, curse you, old noob code. I missed that tidbit. That's super interesting, though.

ekeeke commented 1 year ago

Are you sure that the linked demo runs on real hardware? There is one test in mcd verificator that verifies that writes to Word RAM from the Main CPU do nothing when it is assigned to the Sub CPU (it does not verify the effect of VDP DMA or Main CPU reads, though).

I don't have a Mega CD unit to test it, only an Everdrive Pro flashcart (but this particular demo halts at the Sega logo, only playing the audio track, not sure if it is a bug on demo side or emulation inaccuracy on flashcart side)

DevsArchive commented 1 year ago

Do forgive me for any hassle and my overall hastiness, as I've been trying to remember things from 4 years ago, and evidently, I am no good at that. I digress, though. I did give it a test, and indeed, it doesn't boot up on an actual MCD.

However, I also now do remember fixing the boot issue and encountering this one, so I went and applied it and ran a new test, and got this, which is exactly as expected: corrupted data.

I must have also fixed the Word RAM access bug, too. I went ahead and did that and got that tested, and ended up with this and this, which is exactly the behavior that I remember seeing and what I was reporting here. Of course, you never saw this when testing your new code as the version I sent had the bugged Word RAM access set code.

Here's a link to the "fixed" demo.

Definitely would've been a lot of help on my end to actually remember things correctly! @.@

ekeeke commented 1 year ago

No problem, thanks for confirming the expected behavior for reads as well. I will test this new demo to see how it behaves in emulator.

Out of curiosity, do you remember what caused the demo to hang at the Sega logo? It does not hang in the emulator, so I'm wondering what other hardware behavior could be unemulated or misemulated.

DevsArchive commented 1 year ago

The cause was the Sub CPU trying to load the file into Word RAM during the Sega logo screen in an attempt to speed up the boot time. Think that's just an access rights issue causing the Sub CPU to lock up during CDCTRN.

ekeeke commented 1 year ago

https://github.com/ekeeke/Genesis-Plus-GX/commit/ea8d2991233b1eb41159eaaa955d736f5fa55752 committed to emulate the GFX processing halt while Word RAM is allocated to the Main CPU in 2M mode. Here is a short video capture showing the slowdown with the latest demo posted above:

https://user-images.githubusercontent.com/717091/194745250-2c085b40-d696-47f0-b000-a7a45bd9ed9b.mp4

DevsArchive commented 1 year ago

Yup, that's generally how it should behave, though it does appear to be quite a bit too fast, but I am aware of the situation with the codebase and timing issues, so I won't hold it against you. All in all, thanks!

ekeeke commented 1 year ago

Yes, GFX processing timings are only approximated for the moment, as this part is pretty much undocumented, so they are probably slightly faster than on real CD hardware.