MJoergen / C64MEGA65

Commodore 64 core for the MEGA65 based on the MiSTer FPGA C64 core
GNU General Public License v3.0

Graphics glitch (related to REU?): Attack of the PETSCII Robots #86

Open Schwefelholz opened 10 months ago

Schwefelholz commented 10 months ago

I guess I found a graphics glitch in C64MEGA65 V5.

Playing the REU version of David Murray's "Attack of the PETSCII Robots", sooner or later the vertical sliding doors develop a graphics glitch. I was able to reproduce this with the provided D64 images as well as with the game on physical 3.5" and 5.25" floppies.

On an Ultimate-64 I cannot reproduce the glitch, so I assume it is actually related to the C64MEGA65 core.

Unfortunately, I don't have any additional information to provide. If further information is needed, just let me know.

sy2002 commented 10 months ago

Thank you, @Schwefelholz for reporting this issue. Highly appreciated. For sure an interesting one, since it seems to be related to our simulation of the REU which indeed might not be 100% timing accurate given the HyperRAM's quirks. It will take us a while until we can start debugging this issue (probably not before next year) so I would like to preserve as much information as possible about it here in this issue so that as soon as we start looking into it, we have everything handy. So here are a few questions:

  1. You activated the "Simulate 1750 REU 512 KB" setting in the Help menu?

  2. When you used physical media, I assume you used the core's IEC capability and switched on "IEC: Use hardware port"?

  3. Can you use your mobile phone to make a short video and upload it to this issue? (Or share a YouTube link that leads to the glitch?)

  4. Does the glitch only happen in the REU version / REU mode?

  5. Are other REU enabled games working for you, for example "Sonic the Hedgehog" and "Super Mario C64" in the respective REU mode of the game?

  6. We are aware that our REU/HyperRAM implementation leads to strange effects for some people while it does not for others. Just to double-check if you are affected by a glitch that might look unrelated but maybe is not (I am talking of this one here: https://github.com/MJoergen/C64MEGA65/issues/55): Can you please run the demo "Treu Love": https://csdb.dk/release/?id=144105 (make sure you use the file called TreuLove_ForReal1750Reu.d64)

  7. Do you by chance own a MiSTer or do you know anybody who owns a MiSTer and can you check (or make someone check) if the issue happens there, too?

sy2002 commented 10 months ago

@AmokPhaze101: Just to double-check if it is not an issue that happens on certain machines (due to HyperRAM tolerances etc.) while it does not on others: Can you try to reproduce on your MEGA65?

Schwefelholz commented 10 months ago

> 1. You activated the "Simulate 1750 REU 512 KB" setting in the Help menu?

Yes, correct.

> 2. When you used physical media, I assume you used the core's IEC capability and switched on "IEC: Use hardware port"?

True, as well. I'd like to point out that the glitch is also there using D64 disk images of the game. Not related to using real floppy hardware.

> 3. Can you use your mobile phone to make a short video and upload it to this issue? (Or share a YouTube link that leads to the glitch?)

See attachments. https://github.com/MJoergen/C64MEGA65/assets/6893444/e2453d88-2d08-4e9e-8e0f-d633aea79db7 https://github.com/MJoergen/C64MEGA65/assets/6893444/8772f52c-c53b-46fb-af83-2e49ec5d14f5

> 4. Does the glitch only happen in the REU version / REU mode?

Yes, I think so. Haven't seen it on the non-REU version, yet.

> 5. Are other REU enabled games working for you, for example "Sonic the Hedgehog" and "Super Mario C64" in the respective REU mode of the game?

Haven't tried any, yet. Will try to give them a shot.

> 6. We are aware that our REU/HyperRAM implementation leads to strange effects for some people while it does not for others. Just to double-check if you are affected by a glitch that might look unrelated but maybe is not (I am talking of this one here: [Very rare REU/HyperRAM issues: TreuLove #55](https://github.com/MJoergen/C64MEGA65/issues/55)): Can you please run the demo "Treu Love": https://csdb.dk/release/?id=144105 (make sure you use the file called `TreuLove_ForReal1750Reu.d64`)

I'm not sure what to look for here. Generally, the demo is working. I see a few artefacts, though. Will try to compare to it running on my Ultimate-64.

> 7. Do you by chance own a MiSTer or do you know anybody who owns a MiSTer and can you check (or make someone check) if the issue happens there, too?

I don't own a Mister, and don't know anybody who does. What I have is a SiDi by Manuferhi. I will check out there as well (if it supports an REU).

sy2002 commented 10 months ago

@Schwefelholz Thank you for investing time to help, for all your feedback, and also for the very enlightening videos showing the glitch. As you confirmed that it does not happen in the "non-REU" mode, we will assume that this is a REU related glitch. Looking forward to learning if Sonic and Mario work for you and looking forward to @AmokPhaze101's feedback as he has a "known good" HyperRAM/REU in his MEGA65. (Disclaimer: If it works for AmokPhaze101 using his "known good" HyperRAM this would not mean that your HyperRAM is bad from a hardware perspective. It would be more likely that we need to fine-tune our implementation of the HyperRAM controller or the REU.)

Schwefelholz commented 10 months ago

> 6. We are aware that our REU/HyperRAM implementation leads to strange effects for some people while it does not for others. Just to double-check if you are affected by a glitch that might look unrelated but maybe is not (I am talking of this one here: Very rare REU/HyperRAM issues: TreuLove #55): Can you please run the demo "Treu Love": https://csdb.dk/release/?id=144105 (make sure you use the file called TreuLove_ForReal1750Reu.d64)
>
> I'm not sure what to look for here. Generally, the demo is working. I see a few artefacts, though. Will try to compare to it running on my Ultimate-64.

Actually, the demo on the C64MEGA65 does not differ much from what I see on my Ultimate-64.

Schwefelholz commented 10 months ago

On the SiDi (which seems to be closely related to the MiST), this glitch does not seem to be present. Not sure if this is of any help.

paich64 commented 10 months ago

@Schwefelholz, AmokPhaze101 here, I will try to reproduce this evening. Thanks for the videos, it's clear what is wrong. Any estimate of how long you have to play the game before these glitches start? Thanks.

Schwefelholz commented 10 months ago

@paich64 , on the occasion of the two videos, it was right from the start. Sometimes it takes a few minutes.

paich64 commented 10 months ago

@sy2002 @Schwefelholz I have bought the REU version as well as the non-REU version. I have played a few games, each for at least 15 minutes, specifically with the REU version, and I'm afraid I can't reproduce this glitch with my MEGA65. @sy2002 it looks like we are once again hitting an edge case. I will give it additional tries later today.

sy2002 commented 10 months ago

@paich64 and @Schwefelholz thank you both for your support. Then my initial theory, that our REU/HyperRAM code is not stable on all MEGA65 revisions (given different HyperRAM revisions inside these machines) might be true and this issue indeed might be related to this issue: https://github.com/MJoergen/C64MEGA65/issues/55

As soon as we start to tackle this issue in future (as written above, might be well into next year), it would be awesome @Schwefelholz if we could approach you to run debug and test versions as it will be very likely that neither I nor @MJoergen will be able to reproduce this on our machines as our machines are also "known good HyperRAM machines" (in the above-mentioned sense).

sy2002 commented 10 months ago

And just in case @paich64 and @Schwefelholz can and want to invest the time right now so that we collect a bit more data already right now (but this is not mission critical but more nice to have so that we have a well-rounded data and analysis package in this issue): It would be great if you uploaded a short video here in this issue, showing the so called "HAQ": This Discord conversation has a download link to a special HyperRAM debug version of the C64 core and also a video that shows how to use it. Some short videos showing the HyperRAM Access Quotient (HAQ) on @Schwefelholz 's machine while experiencing the glitch in PETSCII Robots and @paich64 's machine while the glitch is not happening would be helpful so that in future, when we start to tackle the problem, we can use the HAQ as a starting point of our investigations: https://discord.com/channels/719326990221574164/794775503818588200/1087000778041991178

Schwefelholz commented 10 months ago

@sy2002 @paich64 Thanks for your efforts so far. For sure I'll be available for further testing once the time comes. I'll also try to provide some figures using the HyperRAM debug core mentioned above.

Schwefelholz commented 10 months ago

The HyperRAM debug core does not seem to work for me. I actually see numbers in idle state that are much higher than the ones shown by @sy2002 on Discord and they don't change too much under load, but I may be misinterpreting things here... https://drive.google.com/file/d/11WWwsmwHT_1NZ3OmaE9h9wiulAMFhWLd/view

paich64 commented 10 months ago

@Schwefelholz I can't comment on the values you see, but watching your video I can confirm that the glitches you're having between 1:11 and 1:18 are the ones a few guys had before @sy2002 and @MJoergen reworked the REU implementation. I guess you don't have them when running V5final. I will re-test core VA11 and will post a video with the values I get.

sy2002 commented 10 months ago

@Schwefelholz and @paich64 : Thank you - for now, we have all info we need. Indeed I remember that @sho3string and other people on Discord who had the "bad HyperRAM" had output similar to what @Schwefelholz has shown here, so my working hypothesis for now: The Attack of the PETSCII Robots glitch is a problem in our MiSTer2MEGA65 HyperRAM implementation: It does not work glitch-free on newer ("Revision D") HyperRAMs.

This working hypothesis is also bolstered by the fact that @paich64 cannot reproduce and he has a "known good HyperRAM".

So: Fixing this issue here will very likely fix issue https://github.com/MJoergen/C64MEGA65/issues/55 and vice versa.

Again disclaimer @Schwefelholz : Your HyperRAM is very likely to be 100% fine. "Good" and "bad" refer to our HyperRAM driver's ability to cope with certain chip revisions of the HyperRAM.

Schwefelholz commented 10 months ago

Understood. Thanks for looking into this!

sy2002 commented 8 months ago

@Schwefelholz and @paich64 : Can you gentlemen please re-test if the attached Alpha 3 version of the upcoming V5.1 core fixes the Attack of the PETSCII Robots graphics glitch? c64v51a3.zip

paich64 commented 8 months ago

@sy2002 Tested again with v5.1 alpha 3, i can't reproduce the glitch.

sy2002 commented 8 months ago

Awesome news @paich64 - THANK YOU :-) I will close this as fixed. @Schwefelholz you can use the core from the ZIP file in the message above (https://github.com/MJoergen/C64MEGA65/issues/86#issuecomment-1791530578) to play the game without a glitch (it is Alpha 3 for the upcoming V5.1 core release) - OR - you can wait until the official V5.1 will be released (probably Jan/Feb 2024).

Schwefelholz commented 8 months ago

@sy2002 I'm sorry, but I must say that the glitch is still there for me in 5.1A3. Took a little longer until it came around, but it is still there.

sy2002 commented 8 months ago

Thank you @Schwefelholz for testing - I will re-open the issue. @MJoergen: The 256 => 128 burst length change did fix the TreuLove issues that muse had, but it did not fix the PETSCII Robots issue. So this seems to be a different thing.

@Schwefelholz: Just to make it easier for us and @paich64 to reproduce in future: Can you share what "a little longer" means and what one needs to do to reproduce? (Assuming you play the REU version of the game). And: Is the effect still what is shown in the videos you uploaded: https://github.com/MJoergen/C64MEGA65/assets/6893444/e2453d88-2d08-4e9e-8e0f-d633aea79db7 https://github.com/MJoergen/C64MEGA65/assets/6893444/8772f52c-c53b-46fb-af83-2e49ec5d14f5

And @Schwefelholz : Do you by chance own a MiSTer? I would like to understand better, if the bug is related to our HyperRAM-based REU simulation on the MEGA65 or to the underlying MiSTer code that we ported to the MEGA65. Also @paich64 : I know you own a MiSTer. As soon as you can reproduce it, too, on the MEGA65 using 5.1A3, you might double-check on MiSTer.

I feel a next step here includes contacting the author of "Attack of the PETSCII Robots" and asking him for a bit of guidance on what his code is doing, so that we better understand its timing sensitivities and what a real C64 or the Ultimate does differently from us. We will need to kind of "single-step-debug" this issue on a code level.

Schwefelholz commented 8 months ago

"A little longer" just means maybe a minute or so. Actually, I made exactly the same door shown in my video open/close a couple of times until the issue became apparent again. This also answers your second question: Yes, the effect is exactly the same as before.

Unfortunately, I do not own a MiSTer. I do own a SiDi, but I don't know how closely related its C64 core might be to the MiSTer one. I can give it a try, though...

sy2002 commented 8 months ago

@Schwefelholz Thank you for the fast feedback. @paich64 Would be cool if you could come back with feedback (see above).

paich64 commented 8 months ago

@sy2002 ok will test again on my Mega65 and on my Mister to see if i can reproduce or not

Schwefelholz commented 8 months ago

Just tried on my SiDi and the glitch doesn't seem to appear there (however helpful this might be).

Also, I tried again on the MEGA65 and almost thought I had to revoke my comments above. It took quite some time now (maybe five minutes) constantly opening/closing the door until an issue appeared, which even looked a bit different than what I saw before. I've attached a new video.

https://github.com/MJoergen/C64MEGA65/assets/6893444/a2edf517-d934-4934-9b6f-d0596f231d12

sy2002 commented 8 months ago

Thank you, valuable feedback @Schwefelholz . Since SiDi is a MiST clone with a widely different code base than MiSTer, I am still eager to learn more from @paich64 about the MiSTer - which is the code base we use.

sy2002 commented 8 months ago

@Schwefelholz : Regardless of whether you use the HDMI or the VGA connector, the core is always generating an HDMI signal, which utilizes HyperRAM bandwidth to do scaling and stuff. The REU simulation also uses HyperRAM. Since HyperRAM has a very high latency, it might be that REU and HDMI ("ASCAL") clash. Can you tell me: Which video mode is the HDMI set to? 720p @ 50 Hz or 60 Hz? Or 576p? In case it is set to 720p, can you try if 576p still reproduces the issue?

====

Technical background info / hypothesis, in case you are interested in this kind of stuff - but as said: Just a hypothesis. Might be completely wrong. But the test above with different HDMI resolutions (as said, regardless of whether you use HDMI at all) would be helpful.

The (minimum) HyperRAM bandwidth required by the ascaler can be calculated as follows:

The ascaler first writes the entire frame (at the core frame rate), then reads the entire frame (at the HDMI output frame rate).

In the case of the democore, the frame size is 720 x 576 x 3 bytes (24 bits per pixel), i.e. 1.24 MB. This must be written 50 times a second and read 60 times a second, i.e. a total of 110 times a second. The total memory bandwidth required is therefore: 1.24 MB x (50 + 60) = 137 MB/s. The total available HyperRAM bandwidth is 200 MB/s. The utilization is therefore 137/200 = 68%.

In practice there is some overhead (the ascaler actually stores a bit more data, and there is a transaction overhead with the HyperRAM). So in practice the HyperRAM is approx 75% busy in this configuration. This is an average amount, and the HyperRAM accesses can/will/do overlap occasionally. I'm guessing this explains the glitches.
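
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the numbers are the ones quoted in this comment, while the constant and function names are made up for illustration and do not come from the core's source code. It ignores the transaction overhead mentioned above.

```python
# Rough estimate of the ascaler's HyperRAM bandwidth demand.
# All figures are taken from the comment above; all names are illustrative.

FRAME_WIDTH = 720
FRAME_HEIGHT = 576
BYTES_PER_PIXEL = 3          # 24 bits per pixel
WRITE_FPS = 50               # core frame rate (frame written once per core frame)
READ_FPS = 60                # HDMI output frame rate (frame read once per output frame)
HYPERRAM_BANDWIDTH = 200e6   # bytes per second, as quoted above


def ascaler_utilization(width, height, bytes_per_pixel, write_fps, read_fps, available):
    """Return (required bandwidth in bytes/s, fraction of the available bandwidth)."""
    frame_bytes = width * height * bytes_per_pixel
    required = frame_bytes * (write_fps + read_fps)
    return required, required / available


required, utilization = ascaler_utilization(
    FRAME_WIDTH, FRAME_HEIGHT, BYTES_PER_PIXEL, WRITE_FPS, READ_FPS, HYPERRAM_BANDWIDTH
)
print(f"frame size:  {FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PIXEL / 1e6:.2f} MB")
print(f"required:    {required / 1e6:.0f} MB/s")
print(f"utilization: {utilization:.0%}")
```

Changing `WRITE_FPS`, `READ_FPS` or the frame dimensions shows how the demand would shift for other video modes, under the same simplifying assumptions.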

Schwefelholz commented 8 months ago

@sy2002 I am indeed using HDMI. The monitor says it's on 576p@50 Hz.

At least, once the issue is visible, it does not make any difference switching to other HDMI modes or even VGA. Not sure if it would make sense to start the game directly in each video mode and see if the issue occurs independently...

sy2002 commented 8 months ago

@Schwefelholz Thank you - and no: No tests with restarting necessary, everything is in hardware, i.e. in real-time :-) Your feedback means that our hypothesis about HyperRAM bandwidth was probably wrong. As strange as this might sound: It is good news! Reason: We need to live with the HyperRAM and therefore it is good that the bandwidth is not the limiting factor in this case.

paich64 commented 7 months ago

@sy2002 I have been trying to replicate on MiSTer using the latest core from May 2023, but I can't reproduce :( Going to give it another try on the MEGA65 C64 core. (Attached photo: 20231108_224113)

paich64 commented 7 months ago

@Schwefelholz while i'm trying to reproduce, in order to ensure we are testing under the exact same conditions, would you mind making an image of your SD card using https://sourceforge.net/projects/win32diskimager/ , zip the .img file and upload it somewhere so that i can retrieve it, flash it to my own SD card and give it a try ? Alternatively, if i make an image of my own SD card and upload it somewhere, would you be ok to retrieve it, flash it to your own SD card (using Balena Etcher https://etcher.balena.io/ ) and try with it ?

paich64 commented 7 months ago

@sy2002 I have been doing my best to stress test the game on the opening/closing of the door and I'm sorry i just can't reproduce.

sy2002 commented 7 months ago

@paich64 I highly appreciate your efforts!

Schwefelholz commented 7 months ago

> @sy2002 I have been doing my best to stress test the game on the opening/closing of the door and I'm sorry i just can't reproduce.

@paich64 Hmmm, this is indeed strange. Last time I tried yesterday, the issue was coming up almost immediately.

I'll see to get my SD card content to you. Hopefully, I'll remember to do so once I'm back home from work. I should also be able to try with your image, no problem.

paich64 commented 7 months ago

> > @sy2002 I have been doing my best to stress test the game on the opening/closing of the door and I'm sorry i just can't reproduce.
>
> @paich64 Hmmm, this is indeed strange. Last time I tried yesterday, the issue was coming up almost immediately.
>
> I'll see to get my SD card content to you. Hopefully, I'll remember to do so once I'm back home from work. I should also be able to try with your image, no problem.

Actually I've never had your issue. If you can generate and upload an image of your SD card i will be able to test with it.

Schwefelholz commented 7 months ago

@paich64 I can share my SD image with you now on Google Drive. Are you available on the MEGA65 Discord? Which user name? I wouldn't want to spread the link too widely.

paich64 commented 7 months ago

@Schwefelholz send me the link to olivier.simplelife@gmail.com

paich64 commented 7 months ago

Otherwise I'm AmokPhaze101 on the Mega65 discord ;)

Schwefelholz commented 7 months ago

@paich64 You've got a PM on Discord from Senfsosse.

paich64 commented 7 months ago

@sy2002 I have flashed @Schwefelholz's image to a 16 GB SD card and I can't reproduce after more than 20 minutes of opening/closing the door. So it's definitely not related to the SD card content, unfortunately.

sy2002 commented 7 months ago

OK @paich64 : Maybe it is related to the Phase of the HyperRAM. Stay tuned.

MJoergen commented 5 months ago

@Schwefelholz Is the glitch visible on the VGA output as well ?

Schwefelholz commented 5 months ago

@MJoergen Latest testing on C64 MEGA65 5.1A11 does show the glitch on VGA as well. Will report back once I tested with A12.

Schwefelholz commented 5 months ago

@MJoergen "Alpha 12" does not work for me, as described in https://github.com/MJoergen/C64MEGA65/issues/127#issuecomment-1923565723

MJoergen commented 5 months ago

Going back to A11 (on the R5 board), here are some observations I've made, using extra debug signals. The picture below shows some of the signalling occurring in the M2M framework:

This is a total of 92 clock cycles @ 100 MHz, i.e. almost one microsecond. Breaking this time down into parts:

This gives a total of 92 clock cycles.

Perhaps some of these extra delays can be reduced, e.g. either the 4-clock-cycle delay at 4094-4097, or either of the 10-clock-cycle delays waiting for response.

image

MJoergen commented 5 months ago

Here is a similar diagram, this time in the Core clock domain:

Here we see a delay of 40 clock cycles @ 32 MHz, i.e. 1.25 microseconds.

image
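
For reference, the two delay figures quoted in the comments above follow from dividing the cycle count by the clock frequency. A minimal Python sketch of that conversion, using only the numbers given in this thread (the function and variable names are made up for illustration):

```python
def cycles_to_microseconds(cycles, clock_hz):
    """Convert a number of clock cycles at clock_hz into microseconds."""
    return cycles / clock_hz * 1e6


# Delay measured in the M2M framework clock domain: 92 cycles @ 100 MHz.
m2m_delay_us = cycles_to_microseconds(92, 100e6)

# Delay measured in the core clock domain: 40 cycles @ 32 MHz.
core_delay_us = cycles_to_microseconds(40, 32e6)

print(f"M2M framework: {m2m_delay_us:.2f} us")
print(f"Core domain:   {core_delay_us:.2f} us")
```

This gives roughly 0.92 and 1.25 microseconds, matching "almost one microsecond" and "1.25 microseconds" above.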

MJoergen commented 5 months ago

@Schwefelholz Please try again with V14 of the core: C64MEGA65-V5.1A14.zip

Schwefelholz commented 5 months ago

@MJoergen Alpha 14 shows the same behavior as Alpha 10 or Alpha 12. No obvious difference.

MJoergen commented 5 months ago

@Schwefelholz Thank you for helping out by trying all my various (failed) attempts. I will now go into a deep think ....

MJoergen commented 4 months ago

@Schwefelholz I've made a new attempt. Will you please try it, and report back here. C64MEGA65-V5.1A15.zip