atari800 / atari800

Atari 8-bit computer and 5200 console emulator
https://atari800.github.io/
GNU General Public License v2.0
349 stars 96 forks source link

DLI timing problem #79

Closed Sasatari closed 5 months ago

Sasatari commented 4 years ago

At AtariAge, someone mentioned, that the demo "Second Life Syndrome" (xex-file zipped and attached) is not running on Atari800MacX and atari800 (which is the base for Atari800MacX). The demo runs fine on real hardware and on the Altirra-Emulator by phaeron (Windows only).

phaeron seems to have found the issue:

"Under normal circumstances, a DLI handler would try to begin executing at cycle 10, two cycles after the end of the LMS fetches, whenever the next instruction border occurs. However, if a taken branch is right before that, it delays interrupt acknowledgment and in this case it allows the DEC $1827 instruction that dmsc noted above to sneak in. This instruction is critical in this case because (1) the DLI routine is too long and keeps interrupting itself and (2) the DLI is also responsible for revectoring the NMI vector to the VBI handler based on this counter. With that instruction being skipped, the DLI code doesn't ever branch to the code to set up the VBI. The CPU does eventually run all of the decrements when the stack unwinds at the end of the frame when the DLIs stop, but all of the decrements run back-to-back without the checks in between, so the X=$00 condition is never seen by the branch."

Find the full topic there: https://atariage.com/forums/topic/300257-second-life-demo-lamers-on-atari800mac-501/ LMS.XEX.zip

dmsc commented 4 years ago

The attached is a little test program, assemble with MADS, after removing the ".txt" extension: int-tst.asm.txt

This is a picture of my NTSC Atari 800XL: atari-ntsc

And this is what is produced by current atari800: atari001

Notice that the last DLI is displaced in comparison with the original, as the DLI fires too earlier.

Altirra produces a picture that is identical with my Atari: altirra000

dmsc commented 4 years ago

Here is also the ATR and XEX for above program, in case you want to do more testing: int-test.zip

By tracing the execution, this is the cycles for each DLI as seen by atari800:

Scan Line Cycle Before DLI Cycle In DLI BPLs before
79 14 21 7
82 14 21 7
95 13 20 7
104 12 19 6

While this is the info from Altirra:

Scan Line Cycle Before DLI Cycle In DLI BPLs before
79 12 19 7
82 12 19 7
95 11 18 7
104 13 21 7

Cycle Before DLI is the cycle counter after the last instruction executed before the DLI, Cycle In DLI is the cycle when first DLI instruction begins and BPL before is the number of BPL instructions executed before the DLI on each line

As you see, the difference is that Atari800 executed one BPL less before the DLI, executing the DLI 3 CPU cycles earlier.

mikrosk commented 5 months ago

It took only slightly more than four years to get this one fixed. :-)

With my fix @dmsc's test program's output looks like this:

nmi

So it's not identical to real hardware yet but it's definitely not worse. And most importantly, it runs @Sasatari's Second Life Syndrome!

mikrosk commented 5 months ago

Damn, it took only two days to find a regression. Rotten Juice demo's intro logo (M.E.C.) is broken after my commit. Will have to investigate.

mikrosk commented 5 months ago

I don't have good news. It seems that this line (with newcycleexact) https://github.com/atari800/atari800/blob/f1386be491e506de1e6393e587cc236025fbe466/src/antic.c#L3023 and this line (without) https://github.com/atari800/atari800/blob/f1386be491e506de1e6393e587cc236025fbe466/src/antic.c#L3038

are the culprit. Unfortunately, when changing the value to FALSE, Rotten Juice's logo looks OK but Second Life Syndrome doesn't run anymore.

This is how the corrupted logo looks like (note the two horizontal lines): mec

I don't the knowledge to debug this deeper (it looks somewhat similar to #163 but it's certainly something unrelated) so unless our experts like @pfusik or @kr0tki decide to take a look, I have to ask you @joysfera what to do with this one before the release. I would argue it's better to have a non-working demo fixed but knowingly introducing a regression sounds really awful.

pfusik commented 5 months ago

Special-casing branches only looks bad. It probably affects all instructions with variable execution time, so indexed modes that cross page boundary as well.

I am NOT back to Atari800 development, so don't count on me. :)

dmsc commented 5 months ago

Hi!

Please, try my branch at https://github.com/dmsc/atari800/tree/try-fix-79

I think the problem with your approach is that you always add one instruction before a NMI, but you should only add one cycle before the NMI.

So, my patch at https://github.com/dmsc/atari800/commit/1f515707588f920f5965e7e02da9c4114a7e9c90 undoes a lot of the logic in your patch and simply adds the cycle in the CPU_GO call before the CPU_NMI.

Have Fun!

mikrosk commented 5 months ago

It seems you are completely right! I even copy&pasted the paragraph from @phaeron1024's hardware guide into my commit: https://github.com/atari800/atari800/blob/f1386be491e506de1e6393e587cc236025fbe466/src/cpu.c#L285-L289 which clearly says the delay is one cycle, resulting in the processing the next instruction, not the other way around.

Clearly, I should have included you in the list of local experts! ;) I'll test it tonight and merge your (much more elegant!) change.

mikrosk commented 5 months ago

@dmsc nearly there. ;) Rotten Juice and the test program works however the reason of this bug report - the S.L.S. demo - doesn't. I think it is because your approach isn't 100% valid: you add one cycle to CPU_GO's limit however this limit is applied only on the next instruction after NMI.

Strangely, my attempts at calling CPU_GO one more time before CPU_NMI haven't brought desired results yet (either SLS doesn't work or R.J. doesn't...).

dmsc commented 5 months ago

Ok, try again :-) https://github.com/atari800/atari800/compare/master...dmsc:atari800:try-fix-79

I tested, and seems to work:

Have Fun!

mikrosk commented 5 months ago

Thanks @dmsc, I'll review & test it tonight. If I confirm your implementation, I'm thinking to adapt it similarly as done for this one: https://github.com/atari800/atari800/blob/f1386be491e506de1e6393e587cc236025fbe466/src/cpu.c#L565-L596 i.e. we will have both delays at one place.

Btw my experiments were quite similar but perhaps I was too tired and put ANTIC_xpos + 1 instead of ANTIC_xpos_limit + 1 as the limit so many thanks for your patch.

dmsc commented 5 months ago

Hi!

I don't think it could be integrated inside the CPU_GO() easily, as you don't control the NMI logic from inside the CPU code, and sharing another state there will be more work (slowing down and complicating the code). As it is now, I think the CPU code is already a mess 😛

After this is solved, I have another proposal: to join the CPU_GO() + CPU_NMI() in a single call, passing the target cycle to the CPU_NMI() function, like this:

/* Triggers a Non-Maskable Interrupt at cycle "limit" */
void CPU_NMI(int limit)
{
    UBYTE S;
    UBYTE data;

    /* Advance the CPU until the limit */
    CPU_GO(limit);

    /* Advance the CPU until the limit */
    CPU_GO(limit);

    /* If last instruction was a taken branch that did not cross a page,
     * we need to delay exactly 1 more cycle for the NMI */
    if(CPU_delay_nmi)
        CPU_GO(limit + 1);

    /* Process the NMI now */
    S = CPU_regS;
    PHW(CPU_regPC);
    PHPB0;
    CPU_SetI;
    CPU_regPC = MEMORY_dGetWordAligned(0xfffa);
    CPU_regS = S;
    ANTIC_xpos += 7; /* handling an interrupt by 6502 takes 7 cycles */
    INC_RET_NESTING;
}

This requieres no modification to any outside code, and will always work independent of the NMI source. With this, you simplify all CPU_NMI() callers by passing the cycle count directly.

Have Fun!

mikrosk commented 5 months ago

@dmsc I have slightly adjusted your patch & fixed the Falcon code. I also like your idea with CPU_NMI, so that is included as well. Once again - thanks for looking into it, now this issue can (hopefully) finally rest.