MiSTer-devel / C64_MiSTer

112 stars 56 forks source link

C64 - timing(?) problem in The Space is Broken demo #160

Open F-RX opened 10 months ago

F-RX commented 10 months ago

I found a C64 core synchronization problem in Space is Broken - great demo by FAIRLIGHT. In some scenes, artifacts appear that are not visible on real hardware or in the VICE emulator. https://csdb.dk/release/?id=234768 https://www.youtube.com/watch?v=M2fCRbLwU74 On C128 the demo works fine.

1 2 3 4 5

gyurco commented 9 months ago

https://github.com/mist-devel/c64/commit/fe99275603ce1f7dfdb5d733893297128b711bfb

sorgelig commented 9 months ago

@gyurco thanks!

sorgelig commented 9 months ago

should be fixed in upcoming unstable/nightly build

F-RX commented 9 months ago

Thanx a lot :)

paich64 commented 9 months ago

@gyurco @sorgelig @F-RX I was testing an our established list of demos before considering pushing this fix in the Mega65 C64 core (which is a port of the mister core). We have a few demos having visual glitches (the same as on mister). So i compared the behviour with and without this fix on mister, and it obviously introduces a regression in at least one demo :

https://csdb.dk/release/?id=198283

in this XMAS mega demo we have a visual glitch on both Mister and Mega65 C64 core which you can see here :

https://www.youtube.com/watch?v=OUUgmUE7avs&t=17s

in the bottom right corner 4 black pixels are showing up during a few seconds, while the expected behaviour is the following :

https://www.youtube.com/watch?v=p0ddV07WC8o&t=561s

With your new fix, we now get an even more unexpected behaviour : The top and bottom borders which are supposed to be opened, now get closed and the Star at the top of the screen gets now half hidden :

hidden_start

Can you guess what could be the link between your fix and this unexpected behaviour ?

Thanks, Olivier.

gyurco commented 9 months ago

Try this: https://github.com/mist-devel/c64/commit/17c452b94a37c4eda3d4080c0daf018c6a1a022f

paich64 commented 9 months ago

@sorgelig could you backport to mister so that i can test on mister ? thanks !

makigumo commented 9 months ago

Thanks, this seems to fix the XMAS demo issue. Below is the patch adapted to mister. Apply with git am 17c452b94a37c4eda3d4080c0daf018c6a1a022f.patch.

17c452b94a37c4eda3d4080c0daf018c6a1a022f.patch

paich64 commented 9 months ago

@makigumo many thanks will ask the person generating the core for me to provide a fixed build

sorgelig commented 9 months ago

i've pushed the changes

paich64 commented 9 months ago

@sorgelig @gyurco @makigumo Many thanks, Xmas Megademo 2020 is back to its original state (the little glitch is still here but top/bottom borders are now again opened) and Fairlight demo is still fixed.

I will keep testing demos on the Mister and report if i find a demo with regressions.

paich64 commented 9 months ago

@gyurco @sorgelig @F-RX @sy2002 while the fix seems to have no side defect on a few other tested demo, I have to report this regression :

Plush 1997 +H2K (https://csdb.dk/release/?id=11755) => while this demo has allready known issues :

The fix introduced another slight glitch :

h2k+

On the picture, the 2 circled rectangles will shortly blink while it does not happen without the fix.

I know that this demo is old, has already known glitches, but it's an additional one, and even it's not the best demo out there, we can't garantee for sure that it will no impact other demos. I also recognize it's probably very complex to figure out why we have another additional slight glitch.

In such a situation it's quite complex to decide what to do and i won't have time to run our full 180 demos benchmark. This will be done when we are done with our own C64 core refactoring on Mega65 [we first want to ensure our refactoring doesn't introduce new issues and then only we will consider this fix and run of course the whole 180 demos benchmar].

If you have any idea about what could produce this additional glitch ...

Olivier.

paich64 commented 9 months ago

@sorgelig @gyurco @makigumo @sy2002 To be as much clear as possible :

Then +H2K was broken by FIX1.

though it was not fixed by FIX2 as Xmas Mega demo was.

Also, while so far i only tested our 180 demos bench on the Mega65 C64 core i will probably consider at some point to perform the same tests with Mister Core each time you provide a Fix. I just have to include this in my schedule.

To be as much clear as possible : We have a huge milestone in the pipe for the Mega65 C64 core. We are currently refactoring it and we want to be sure that in case of regression it's only related to our refactoring and not related to this addional Fix you have provided. When the refactoring is done and we know we did not introduce regressions, then we will obviously test this Fix with the MEGA65 C64 core. I will have to run again an inspect each and every single demo of this 180 demos bench.

I will do this for the Mega65 C64 core and the C64 Mister core.

Olivier.

gyurco commented 9 months ago

This should fix the regression: https://github.com/mist-devel/c64/commit/a04e0d9fa1a241def0957be007ce161963ca0f25

The shaking at the beginning is a problem of a stable raster, I'm not sure how to fix it without breaking everything else (if the raster interrupt is fired one cycle earlier or later, then the scene is fixed, but a lot more is broken).

sorgelig commented 9 months ago

This should fix the regression: mist-devel/c64@a04e0d9

pushed

paich64 commented 9 months ago

@gyurco @sorgelig Many thanks, will ask my team mate to generate the core and will test.

paich64 commented 9 months ago

@sorgelig obviously there have been 2 consecutive commits (1hour interval between both) generated. I guess i have to use the very last one ? Thanks Olivier.

sorgelig commented 9 months ago

the last one is about NTSC sprites fix (for mario bros). previous one is what mentioned here.

paich64 commented 9 months ago

@gyurco @sorgelig @sy2002 I got the core generated and H2K+ glitch introduced by previous fix is gone. Surprisingly (And frankly i don't have any clue), but the disk swap is now working.

Basically here are the results of this "fix on top of fix"

Fairlight 2023 The Space is broken => Fixed and no regression Bonzai 2020 Christmas Megademo (coop demo) => Fixed and no regression Plush 1997 +H2K => Fixed and no regression Reflex 1994 Mathematica => No regression Triad 2013 Revolved => No regression Censor Design 2016 Wonderland XIII => No regression
Censor Design 2013 Wonderland XII => No regression

to as honest as possible, this last fix has probably amplified a bit the first glitch of +H2K, but this glitch has always existed, so i would say, no big deal. So far and considering This nice demo from fairlight is fixed and also other demos are not suffering from it, i think we can forget about the already existing glitches in +H2K (which are anyway hard to fix without impacting other demos).

Because The core is pretty much really good when it comes to Demo accuracy, i would like to keep testing other demos with this very last fix and ensure they are not impacted.

So @sorgelig @gyurco are you ok if 1) we do not close this issue yet. 2) I keep posting regression tests results with demos from my 180 demos bench 3) We do not attempt to introduce other fixes that could impact demos stabilities untill i have completed these tests

?

If you are Ok, i will keep posting the results over time in this issue. When i'm done we can consider the issue is fixed (If you are ok of course).

Testing demos is quite time consuming (as i really watch them carefully from beginning to end) and if i suspect something is wrong i re-run it with the previous core to ensure what i see also existed in previous core.

But if you are ok i will keep posting here on a regular basis.

Olivier.

makigumo commented 9 months ago

On the topic of vic-ii inaccuracies. Has it been considered to use or check against Randy Rossi’s VICII-Kawari implementation? His implementation should be on par with VICE's. Which should fix some issues.

paich64 commented 9 months ago

@makigumo this is where it becomes also tricky : Randy Rossi’s [VICII-Kawari] implementation is strictly an implementation of VIC-II 6567/6569 chips and not 8562/8565. I has asked him about why it was not available for 8562/8565 and he had told me that the timings were different and that it would have required to re-do lots of tests. Now i don't know which VIC model is implemented in Mist/Mister core. and i don't know which exact timings differences he was talking about. Because if we look at VICII main document http://www.zimmers.net/cbmpics/cbm/c64/vic-ii.txt I d'ont really see where timings are different. It may not be VICII accuracy itself, but the timings as a whole. The interrupts system has been reworked in 2022 by @gyurco and it has drastically improved the accuracy (there are a few demos which have been drastically improved thanks to his work). But maybe there are still edge cases where we can't get the whole timings accurate.

I also have to admit that my knowledge when it comes to this whole topic is limited. A friend of mine has confirmed his VICII-Kawari is really accurate, but again, it's just a VICII implementation on top of the real hardware which in itself is perfectly timed. In the C64 core it's more that the VICII itself.

Last but not the least, among these 180 demos i'm talking about a huge majority works perfectly. So i tend to believe that if the VICII fpga implementation was not accurate then we would have way more glitches than what we have now. That's why i'm talking about timings as a "whole", not from a VICII perspective only, but from a C64 perspective.

makigumo commented 9 months ago

The timing differences are probably only an issue on the hardware side, e.g. speaking to the other chips. So my uneducated guess would be that they don't really matter for our use-case.

The current core itself is currently only switchable between PAL/NTSC (6569/6567R8) with no mention or support of the 8562/8565 variants. So we wouldn't lose any functionality.

Testing the Kawari implementation might reveal if the (very few!) remaining graphical issues with the core are with the current VICII implementation or lie elsewhere.

eriks5 commented 9 months ago

The timing differences are visual. This Vice test shows them: https://sourceforge.net/p/vice-emu/code/HEAD/tree/testprogs/VICII/vicii_timing/

Compare vicii_reg_timing.prg.png with vicii_reg_timing.prg-8565.png and you can easily spot them.

As mentioned by the OP the C128 core didn't have the original issue probably because of the timing changes I made. I looked into these timing changes for the C128 core because the C128's VIC is an 856x and the changes were necessary to get the C128 demo, Risen from Oblivion, working, but there's still one small visual timing related glitch with that demo in the C128 core.

gyurco's fix is different from mine, and better, so I'll update the C128 core's VIC with these changes soon.

paich64 commented 9 months ago

@eriks5 thanks for the insights and the clarification about the timing differences between both models. Do you think the glitches are only related to the VICII timings implementations ? Is my uneducated guess about the interrupts dispatcher which might not be 100% accurate valid or not relevant at all ?

eriks5 commented 9 months ago

My (mostly) uneducated guess is that the timing changes are related to the change from NMOS to HMOS-II technology of the chips, so all transistors are of a different type with different timing characteristics.

I read somewhere that the gray bar colour glitch you can see in the 856x reference pictures is temperature related. Some chips don't show them when they're hot, while other chips always show them, so to me this seems to be a transistor-speed related thing.

gyurco commented 9 months ago

The one-pixel glitches are not due to CPU, but the VIC-II has plenty of possibilities to use specific registers at the specific (pixel) time, which is shown by the vicII-reg-timing tests (in some videos, Bil Herd and Al Charpentier even said there are a lot of timing violations inside the chip, so a lot of signal delays are not even intentional). I looked into the Kawari-II implementation a while ago, but it's not a drop-in replacement into the FPGA64 design, so I didn't check it further.

The glitch at the beginning of the demo is a problem of the raster interrupt. The demo uses the double raster interrupt method to have a stable raster, the one described here: https://codebase64.org/doku.php?id=base:making_stable_raster_routines

As written: "In the place where the second raster interrupt will occur, there will be 2-byte instructions in the first interrupt routine. In this way, the beginning of the next raster interrupt will be off at most by one cycle. Some coders might not care about this one cycle, but if you can do it right, why wouldn't you do it right until the end? "

So at the end there are two possible outcome of the second interrupt, however only one of them makes the demo good, the other one introduces the glitch. I wonder why doesn't it happen on real HW.

eriks5 commented 9 months ago

The gray bar colour glitches showing in the 856x timing reference picture I linked above are a perfect example of those internal timing violations: I suspect the colours are fetched from the databus while the internal bus is still high, causing the pixel to be colour 15 (gray) until the databus latches are stable one pixel later. I had to implement this glitch in the C128 because the Risen from Oblivion demo uses it for the starfield part.

paich64 commented 9 months ago

@gyurco It's stable on Real hardware, on Ultimate 64 (closed source fpga implementation), so i suspect the raster interrupt is in any case stabilized, otherwise don't you think it would glitch on real hardware ? What makes you think the code is not stable in itself ? But indeed this demo is really strange when it comes to its behaviour (like a few other ones). It's really surprising as it glitches a lot while many other "heavy" demos all relying a very precise timings perfectly work. It could also be related to the demo interracting with the 1541 and the loader used in it. If for any reason it's very sensible to 1541 behaviour that could explain. I've seen demos completely failing on very old 1541 true device (glitch, demo broken) simply because of it's sensitivity to 1541 device which of course changes over time when the hardware gets old. If the code cycle accuracy is dependant on 1541 device access then it can be a reason for the demo misbehaving. It's an old demo, and i guess unless we have the source code to understand what it's doing, it's going to be really hard to fix. I would say that we should not intend to have it work at any price. Meaning, what matters is the global stability and accuracy of implementation.

gyurco commented 9 months ago

I didn't imply the code is bad and not stable, but I just don't understand why does it work: The second interrupt handler until the final D012 check (CMP $D012) is 55 cycles. The interrupt handling takes 7 cycles. The initial delay of the interrupt is 2 or 3 cycles (depend on which phase of the NOP instruction the IRQ arrives). Thats 55+7+2(3) = 64(65) cycles. One line is 63 cycles, so the CMP $D012 always checks the next line - useless. Even C64 debugger shows the end of the CMP at X=$8 or X=$10, the next line, yet the next jump acts on the value of the previous line when X was $8. However delaying rasterY to be returned when $D012 is read breaks a lot of things. CMP reads $D012 at its last cycle for sure, that's good in T65. So I simply have no idea what's wrong.

If somebody with more C64 expertise can explain it, then it would be possible to fix.

Here's the second interrupt handler code, from $afd0-$afe5 = 55 cycles ($afd5-$afd6 loop is 19 cycles). image

paich64 commented 9 months ago

@gyurco i do not fully get you. Regarding the timings 55+7+2(3), it's ok 👍

Instr Cyc TotCyc pla 4 4 pla 4 8 pla 4 12 ldx 2 14 X=4
dex 2 16 X=3
bne 2+1 19 dex 2 21 X=2
bne 2+1 24 dex 2 26 X=1 bne 2+1 29 dex 2 31 X=0
bne 2 33 nop 2 35 nop 2 37 lda 2 39 sta 4 43 sta 4 47 lda 4 51 cmp 4 55 => so indeed considering the handling (7 cycles) + 2(3) cycles , we are at 64(65) cycles when cmp completes. However, just before CMP operates we are at (64-4)=60 or (65-4)=61, so $d012 has not changed yet isn't ? and we are still not on next raster line when at cycle 60. Are we sure CMP reads $d012 at its last cycle ? Because to compare it needs to first read it isn't it ?

paich64 commented 9 months ago

@gyurco I meant :

just before CMP operates we are at (64-4)=60 or (65-4)=61

if we are at (64-4)=60 then we are at cycle 64(0) when CMP completes and $d012 has just changed (but was this new value read by CMP ?) and if we are at (65-4)=61 then we are at cycle 65 (1) when CMP completes, $d012 has changed at cycle 0 and here indeed we can assume the new value for $d012 has been read.

paich64 commented 9 months ago

@gyurco you are right $d012 is read during last cycle of CMP so as you said, obviously useless here.

So now the thing which is not clear to me : Can you please elaborate further on what you are seeing and which is not ok to you ? I'm not sure i got "yet the next jump acts on the value of the previous line when X was $8". what is the behaviour you are seeing and what should be the expected behaviour ? I can ask C64 guru demo coders from the scene (I know a few of them) but i would need further details to do so. Thanks

gyurco commented 9 months ago

yet the next jump acts on the value of the previous line when X was $8 In the debugger, stepping by instruction $afe8 stops when X=$8, yet the next jump takes 3 cycles as the end of the previous CMP happened at the end of the previous line. When $afe8 happens at X=$10, then the jump takes 2 cycles. CMP's last cycle seems to be always at cycle 1 or 2 of the next line (when D012 is actually read).

(The whole point of the stabilization is to insert 1 extra cycle by the BEQ when D012 read at the end of the previous line - equal what read by LDA -, so the next instruction will always happen at the same time).

gyurco commented 9 months ago

However one interesting thing is that this stabilization routine runs when the Y counter resets to 0 ($137->0), maybe the transition have an effect on reading $D012? That should be documented somewhere.

paich64 commented 9 months ago

@gyurco sorry for asking for confirmation. When you say X=$10 are you talking about X register or X raster beam position ? I guess Y counter refers to Raster beam Y position also ?

gyurco commented 9 months ago

X raster beam position (8 = end of the first cycle, $10 = end of the second cycle, and so on...at least how I interpret the debugger display).

paich64 commented 9 months ago

@gyurco could this impact or maybe already implemented in the core code ? :
from http://www.zimmers.net/cbmpics/cbm/c64/vic-ii.txt :

"The following table describes the four interrupt sources and their bits in the latch and enable registers:" ......

The test for reaching the interrupt raster line is done in cycle 0 of every line (for line 0, in cycle 1).

gyurco commented 9 months ago

Ok, so that would be the missing piece, I'm implementing this ASAP.

gyurco commented 9 months ago

BTW, "The test for reaching the interrupt raster line is done in cycle 0 of every line (for line 0, in cycle 1)." doesn't help much, as the interrupt happens one line before. I think there must be also something with reading the rasterY value, too.

gyurco commented 9 months ago

Here's a patch to test:

diff --git a/rtl/video_vicII_656x_a.vhd b/rtl/video_vicII_656x_a.vhd
index 72a818a..7ac6741 100644
--- a/rtl/video_vicII_656x_a.vhd
+++ b/rtl/video_vicII_656x_a.vhd
@@ -142,6 +142,7 @@ architecture rtl of video_vicii_656x is
 -- Raster counters
        signal rasterX : unsigned(9 downto 0) := (others => '0');
        signal rasterY : unsigned(8 downto 0) := (others => '0');
+       signal rasterY_prev : unsigned(8 downto 0) := (others => '0');
        signal rasterY_next : unsigned(8 downto 0);
        signal cycleLast : boolean;
        signal rasterXDelay : unsigned(9 downto 0);
@@ -738,6 +739,7 @@ rasterCounters: process(clk, rasterX, rasterXDelay)
                        if phi = '1'
                        and enaData = '1'
                        and baSync = '0' then
+                               rasterY_prev <= rasterY;
                                if cycleLast then
                                        rasterY <= rasterY_next;
                                end if;
@@ -1648,7 +1650,7 @@ readRegisters: process(clk)
                                do <= MX(7)(8) & MX(6)(8) & MX(5)(8) & MX(4)(8)
                                & MX(3)(8) & MX(2)(8) & MX(1)(8) & MX(0)(8);
                        when "010001" => do <= rasterY(8) & ECM & BMM & DEN & RSEL & yscroll;
-                       when "010010" => do <= rasterY(7 downto 0);
+                       when "010010" => if rasterY = 0 then do <= rasterY_prev(7 downto 0); else do <= rasterY(7 downto 0); end if;
                        when "010011" => do <= lpX;
                        when "010100" => do <= lpY;
                        when "010101" => do <= ME;
paich64 commented 9 months ago

Thanks @gyurco i guess this is for mist ? Should i ask @makigumo to generate the patch for mister or ask @sorgelig to push it ? anyway i'll have to ask my team mate to generate the core as i'm not familiar with this :)

gyurco commented 9 months ago

Doesn't matter, the VIC-II's are mainly the same. However I doubt it's the correct way. Maybe rasterY should be reset to 0 one cycle later instead.

paich64 commented 9 months ago

If it rasterY resets to 0 one cycle later it means it keeps to max value one more cycle. Isn't this likely to have unexpected side effects ?

gyurco commented 9 months ago

If it rasterY resets to 0 one cycle later it means it keeps to max value one more cycle. Exactly that's what expected by the demo.

paich64 commented 9 months ago

@gyurco may i suggest we first test with your initial idea and if it solves the issue so that we stick to the expected true vic behaviour ?

gyurco commented 9 months ago

My initial idea works, but I'm not sure if it's the original workings. Delaying rasterY reset also solves the test for raster interrupt one cycle later at line 0, but messing with $D012 reading is not.

paich64 commented 9 months ago

what do you mean by "but messing with $D012 reading is not." ?

gyurco commented 9 months ago

My first patch doesn't solve the interrupt timing, it just change what a read of D012 will return.

paich64 commented 9 months ago

Ah ok, while your second idea is working as expected ?

gyurco commented 9 months ago

Works so far in this demo: https://github.com/mist-devel/c64/commit/0ffaba208a5beb9a2da7ebf0badd391dacdf51c8 But regression testing of other demos would be welcome.