randyrossi / bmc64

A bare metal Commodore 64 emulator for the Raspberry Pi with true 50hz/60hz smooth scrolling, low input latency and better audio/video sync.
GNU General Public License v3.0
473 stars 56 forks source link

Screen corruption with Super Snapshot #175

Open GaborC64 opened 3 years ago

GaborC64 commented 3 years ago

Characters on the screen occasionally turn into blocks during/after disk operations with the Super Snapshot cartridge image. Restore button or repeated disk operations may bring them back. The same SS .crt works fine in TheC64, Vice on a PC and Vice in RetroPie on a RPi-3B.

To replicate in BMC64 (on a RPi-3B and RPi-1Br1.2):

  1. Load Super Snapshot 5.22 v2 (.crt) (https://rr.pokefinder.org/wiki/Super_Snapshot)
  2. Load any disk image (.d64)
  3. Press $ or @ repeatedly (disk operations)

I've tried different combinations of REU on/off, drive models, True Emulation, IEC Filesystem, HDMI settings, in BMC64-3.6 and 05/05/21 master build, but nothing helps.

Screenshot: https://postimg.cc/VdxbBJcF

rhester72 commented 3 years ago

It's pretty likely this is a bug in the underlying VICE core. Have you tried searching the VICE issues log to see if it was previously reported/fixed? Backporting a fix may not require a lot of effort if it can be identified.

GaborC64 commented 3 years ago

I cannot find any Vice issues describing the same problem. The fact that the bug is only on BMC64 and not on Windows Vice or TheC64 suggests that this is a BMC64 specific issue. The latest 3.7 has not fixed the bug.

randyrossi commented 3 years ago

I can reproduce this in Vice 3.4 on my Linux desktop build. Sometimes the characters turn to solid blocks simply using $ for a dir listing. It's random.

(Make sure you are using x64 and not x64sc to compare)

It looks like a VICE problem to me.

On Mon, Jun 7, 2021 at 1:40 PM GaborC64 @.***> wrote:

I cannot find any Vice issues describing the same problem. The fact that the bug is only on BMC64 and not on Windows Vice or TheC64 suggests that this is a BMC64 specific issue. The latest 3.7 has not fixed the bug.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/randyrossi/bmc64/issues/175#issuecomment-856133891, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI3HKCK4NSLJTKULRSRU6TTRUABTANCNFSM45ZX2TLA .

-- Randy Rossi

GaborC64 commented 3 years ago

I get it, this glitch manifests on BMC64 because it uses an older Vice version with a bug. The x64 is a good lead so I'll keep looking, but as it is deprecated, the chances are low for a bugfix. Any tips on where to look for the Vice bug would be most welcome.

GaborC64 commented 3 years ago

I've just reinstalled RetroPi 4.7.1 on the RPi-3B to check. After installing Vice from binary, it uses vice-x64 and doesn't have the glitch with Super Snapshot. Does this give us hope?

randyrossi commented 3 years ago

I checked the fetch of char pointers and they seem to be fine even when the blocks are shown. Could be in the chargen lookup or maybe drawing. But it might be a lot of work to track down.

rhester72 commented 2 years ago

I finally got a chance to take a better look at this - even on 3.3 (the version of VICE BMC64 is currently based on), the cycle-exact emulator does not exhibit the problem, suggesting this cartridge requires cycle-exact to function properly (something BMC64 can never achieve). It doesn't appear to be a VICE bug as such, just one of those rare pieces of software that needs cycle-exact behavior.

Eddepad commented 2 years ago

Thanks for looking into this (I experienced the same glitch). Maybe sometime, a future Raspberry PI5 or 6 could have enough power to cycle-exact BMC64.

GaborC64 commented 2 years ago

Thanks for taking a look. The problem is gone in Vice 3.5 x64 (non-cycle-exact) installed from binary today inside RetroPie 4.7.1. The stable BMC64 is based on Vice 3.3, the master build is based on 3.4 (afaik). Are you planning to have a future BMC64 based on 3.5 x64 which should be free from this bug?

rhester72 commented 2 years ago

Are you sure RetroPie has x64 and not x64sc, just renamed or softlinked? I ask because x64 was removed entirely from the Makefile pre-3.5:

https://fossies.org/linux/misc/vice-3.5.tar.gz/vice-3.5/Makefile.am

and is no longer supported at all after 3.4.

Master is still VICE 3.3 - I took a swing at bringing it up to 3.4 about a year ago, but got a bit lost in the 64 filter code and managed to badly misinterpret new keyboard input parameters, so it stalled out for want of some of Randy's time because he's very focused on the VIC II replacement. The code still exists and was preserved, but if you're right and this was a post-3.4 change that made something work better, that wouldn't help much anyway.

It's VERY regrettable that pokefinder shut down the nightly WIndows builds, because that was the fastest way to narrow down very specific changes that altered behavior. Without those, it's down to hand-compiling build after build after build, and I no longer even have a working Windows mingw build environment (and haven't for years).

If anyone's able to a) confirm that a manual compile of x64 (it just needs to be added back to the Makefile) on the 3.5 release code base does indeed fix the problem, and b) is willing to then do several dozen builds and tests to isolate the exact commit that fixed it, I'd be delighted to integrate the required changes once identified into BMC64. I was going to go after pokefinder over the holiday break to see if I could figure it out that way, but natch...it is no more. :/

Compyx commented 2 years ago

VICE still provides development Windows builds, just not via pokefinder anymore, they're on our SVN mirror. Those snapshots are built after each commit, rather than once per day. They do however not contain the old x64 emulator.

As for Super Snapshot 5, I noticed some fixes in r39834, which might be the stuff you're after. I've checked current x64 (on Linux) with SS 5.22 and failed to trigger any screen corruption while repeatedly using '$\<enter>' or '@\<enter>', so you could look into applying those specific fixes in your code.

There are detailed instructions on setting up msys2 to build VICE for Windows here, should you want to build VICE yourself. Though in my experience on Windows it's actually faster to use a Linux VM to do multiple compiles to narrow down a specific revision that fixed or broke something since Windows is terrible at process spawning, something configure and make do a lot.

GaborC64 commented 2 years ago

(rhester72): Are you sure RetroPie has x64 and not x64sc, just renamed or softlinked?

RetroPie 4.7.1 seems to have x64. Having installed Vice in RetroPie from binary or source (these are menu options), /opt/retropie/emulators/vice/bin has x64, x64dtv, x64sc, x128, etc. and vice-x64 can be selected from a menu (see https://retropie.org.uk/docs/Commodore-64-VIC-20-PET/). The Help/About in Vice shows Version 3.5 rev 40565 SDL2 and it does not have the glitch with Super Snapshot.

Unfortunately I don't have the skills to work out exactly what makes the glitch go away in this Vice revision and then inject that into BMC64.

Compyx, thanks for the info, that might help someone who can build Vice, but again I'm just an amateur.

Merry Christmas!

rhester72 commented 2 years ago

@Compyx Good to see you - thank you! I was after older pokefinder snapshots specifically because I was hoping to identify the 'fixed build' before x64 was dumped, since it happened in the transition between 3.4 and 3.5.

Unfortunately, reverse-applying r39834 did not provide relief. It does seem to take longer to trigger in BMC64, but I think that's entirely coincidental.

Agree that Linux would be a better platform for debugging, but unfortunately all my Linux boxes are headless, so msys2 it is. I'll start with a full 3.5 release build of x64 to confirm the problem truly doesn't reproduce there (trust, then verify! ;), then work my way backwards on half-cuts of commits until I can at least get the timeframe, then go from there.

I was REALLY hoping for a Christmas miracle release of BMC64 tonight that contained the fix...sorry, @GaborC64!

rhester72 commented 2 years ago

x64 in 3.5 really is OK, so something got fixed somewhere. :)

I can't get 3.4 to build at all on msys2, things blow up royally with hvsc. @Compyx , any thoughts?

Compyx commented 2 years ago

You can run into the occasional revision that doesn't compile, but I'd expect a point release to compile at least. It could be it compiled with the msys2 packages that were current back then but no longer compiles with current msys2, it could also depend on the configure switches used.

Can you post the error message with some lines of context? And your configure line?

rhester72 commented 2 years ago

Agree with all your points...a bit confused by this one.

Once it starts compiling hvsc, there's about 10,000 lines like this, all complaining about the same function redef:

C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/11.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../src/arch/gtk3/libarch.a(uimedia.o):C:\msys64\home\rhest\vice-3.4\src\arch\gtk3/../../../src/arch/gtk3/widgets/base/carthelpers.h:40: multiple definition of `carthelpers_can_flush_func'; ../src/arch/gtk3/libarch.a(uicart.o):C:\msys64\home\rhest\vice-3.4\src\arch\gtk3/../../../src/arch/gtk3/widgets/base/carthelpers.h:40: first defined here

Configure is boilerplate:

./configure -C --enable-native-gtk3ui --disable-arch

Compyx commented 2 years ago

It's not actually the hvsc directory, but the linking stage, the 'hvsc' dir happens to be the last directory the compiler visits before make starts linking. After way too much time spent on trying to fix the build system in old revisions, I found the reason linking fails: GCC starting with 10.0 uses -fno-common, meaning any declaration of an object in a header file missing the extern keyword will result in multiple definitions, making linking fail. (see https://gcc.gnu.org/gcc-10/porting_to.html)

So instead of having to fix the missing extern keywords in various files, we have to pass -fcommon to GCC in the configure line:

./configure CFLAGS="-fcommon" --enable-native-gtk3ui --enable-x64

(you need --enable-x64 to actually build x64)

With that out of the way, I finally managed to build x64 from r37296 (3.4 point release), and indeed SS5 randomly freezes the machine. Even got it to freeze once while showing its boot screen :)

rhester72 commented 2 years ago

hahahaha I wish I'd looked at the configure script, I was hacking Makefiles by hand to get x64 ;)

Thanks much - this was the last piece of the puzzle I needed! Plan from here is pretty rote:

Maddeningly time-consuming, but the most logical approach I've got, and I'm annoyed enough from this by now that I'm going to see it through for the sheer satisfaction of doing so. laughs

Thanks again!

Compyx commented 2 years ago

You're welcome :)

And that approach is the same I do, "binary search" for a revision that either broke or fixed something.

I noticed some revisions will not accept --enable-x64, in that case you can omit it, that revision will still build x64 by default. So far r38000 appears to work correctly with SS 5.22, couldn't get it to glitch, but you may want to confirm this. r35500 definitely glitched/froze. So you might at least have a starting point saving you a few builds.

If you do find the revision that fixed SS5, keep in mind there might be previous commits involved in fixing the bug, so backporting a single commit might not fix the issue in BMC64. But it'll at least be a start.

GaborC64 commented 2 years ago

Thank you both, it's wonderful to see this Christmas miracle in the "making"! @rhester72 do let me know if I can help from the sideline with testing etc., I keep an eye on this thread/emails with anticipation.

rhester72 commented 2 years ago

I've gone as low as r37561 so far and it's fine, so the fix was shortly after 3.4 was released.

Continuing the hunt tomorrow...

Kugelblitz360 commented 2 years ago

Well, the release notes for 3.6 (!) also mentions:

rhester72 commented 2 years ago

@Kugelblitz360 I think that was https://sourceforge.net/p/vice-emu/code/39834/ that @Compyx referred to earlier. I've applied it internally (and will keep it), but it made no difference here.

GaborC64 commented 2 years ago

The glithcy blocks on the screen follow a nice numerically determined pattern as described at https://www.lemon64.com/forum/viewtopic.php?t=70498&start=1756. Once it's found, it would be interesting to see how the offending code causes such a regular pattern.

Compyx commented 2 years ago

Well, the release notes for 3.6 (!) also mentions:

* Super Snapshot V5 fixes.
  I went down the last eight weeks of commits but can't find it. Why does Sourceforge have no good search functionality?

Because SourceForge sucks. And there's a year of commits between 3.5 and 3.6, almost 2000 of them ;)

rhester72 commented 2 years ago

Closer. r37428 fails. Within 135 commits, but progress will be slow due to family demands on my time :)

GaborC64 commented 2 years ago

@rhester72 family is first.

The glitch is that certain characters are replaced by blocks and others by space, but this is only true on a fresh start (BMC64 3.8). After loading and playing Beamrider (.d64 disk image), the directory listing (F3 is the SS shortcut for $ [enter]) glitches to something else, showing parts of the program: IMG_1394 Pressing Restore brings back the original listing: IMG_1395

rhester72 commented 2 years ago

The range continues to narrow:

r37561 OK r37527 FAIL

rhester72 commented 2 years ago

r37552 OK r37548 FAIL

rhester72 commented 2 years ago

And given what we know about it working on the cycle-exact core, my danger money is that it's this commit:

[r37549] by gpz this commit adds the missing dummy writes to the RMW instructions on the non-sc 6502 core. this makes the core even more sc-like, the access patterns are now equal to the sc core (but not the timing within instructions). in case of weird problems, revert this commit and see if that helps.

There's also the matter that all the other commits between that one and the known-working one were entirely cosmetic :)

Unfortunately, I won't have time to look at integrating this until this afternoon.

rhester72 commented 2 years ago

OK, the 'forklift upgrade' approach (just replace every changed file starting with 6510core.c from revision 37549 until all the dependencies are satisfied) turned into the expected trainwreck, so this will have to be a little more nuanced.

I've got two big concerns:

Will keep plugging away at this over the next several days, but once I've got a valid build I wouldn't count on it being usable due to point 2 above. This may be an unsolvable problem for BMC64 after all.

GaborC64 commented 2 years ago

It may well be an insurmountable challenge but in parallel with the "forklift upgrade" approach we could look for the specific cause of the bug in the current BMC64 code. With a focused fix of the bug we shouldn't lose the ability to run BMC64 on Pi0/1. Here are some of my observations which might give a clue to an experienced Vice programmer where to look in the code. In fact @randyrossi's first response in June was to look into the cause along these lines.

I think the specific cause of the bug is the unintended/uncontrolled switching of the character definition pointer between ROM/RAM. With Beamrider loaded (see above), the glitch looks different from blocks/spaces and using the Super Snapshot Char Set Monitor (in the Freeze menu) it's clear that the bug points the character definition at $1000-$1FFF in the RAM. Indeed, I can overwrite $1000-$1007 and this changes how @ (the first character) looks on the glithed screen. In fact $1000 is the default pointer, but normally the character definition comes from the character ROM ($1000 in the RAM does not show character information just blocks of 64 $00s and $FFs alternating) and the VIC-II must be doing its magic switching between ROM/RAM when needed and I think this is what our bug messes up.

I spent some time now with this bug yesterday (very early today) so here are some further observations for completeness. The glitch is triggered by dollarsign[enter] (or its shortcut: F3), @[enter], Super Snapshot's Freeze, or leaving the freeze menu (7. Resume). The glitch is random, and often reversible, so when it happens it can be just for a flash and then the normal characters return. When the glitch sticks after 'dollarsign' or @ (illegible characters remain on the screen), what brings it back is either a next command which triggered the bug ($ or @), pressing Restore, or Commodore+Shift to toggle another chunk of the character set (upper/lower case).

Interestingly, I see the exact same glitch for a flash in Vice 3.5 x64 (RetroPie) when doing a cartridge Freeze and then again when Resuming, but there's no glitch at all during $ or @. I have seen the same glitch with the Action Replay cartridge (and possibly others), wait for it in the animated gif after LOADER: https://rr.pokefinder.org/wiki/Action_Replay. The page mentions software bugs so there may be some clues over there.

PS: This is now the longest thread in Issues with 32 posts! Thanks for the stubborn perseverance to all involved. PPS: I had to write out 'dollarsign' otherwise I get a linebreak and strange formatting... hehe, I'm cursed with these glitches.

Kugelblitz360 commented 2 years ago

Question: 3.6 still comes with X64, it is just not compiled by default. Did anybody try that one? https://vice-emu.sourceforge.io/vice_2.html#SEC6 says that is simply not in the binaries anymore but not that it is unsupported. 3.6 seems to have changes snapshot formats (again), I've seen reports that 3.5 snapshots do not load correctly - that might be a gamechanger. Maybe this should just be closed as an incompatiblity.

GaborC64 commented 2 years ago

Based on rhester72's earlier comment that BMC64 is still on Vice 3.3 and he couldn't bring it up to 3.4, I suspect that 3.6 would be too big a jump. My understanding is that he was going to identify the bad code and modify the Vice 3.3 underlying BMC64. @rhester72 , do you think it's worth exploring a BMC64 based on x64 in Vice 3.6?

rhester72 commented 2 years ago

Alright - I've been looking at this for WAY too long, so I'm going to summarize why I don't think this is viable based on considerable effort at this point:

In short, if this issue is to be fixed at all, it would probably be better served by rebasing on VICE 3.6 versus applying individual patches, but even then it will not be completely 'fixed' and come at a high cost of device compatibility. Given the wide audience BMC64 serves and the very few compatibility issues that remain (and are unlikely to be fully solved), I for one would classify this as 'wontfix' and leave it as-is, but being open source, anyone is welcome to pursue it independently. =)

GaborC64 commented 2 years ago

Many thanks for putting so much effort into this issue, it is very much appreciated. BMC64 continues to shine, it is a unique solution as it is.

Until a fix materialises somehow, the good news is that there are workarounds: the Super Snapshot glitch is gone in Vice 3.5 x64 as part of RetroPie 4.7.1 running both on RPi-3B and RPi-400. As we know (among other differences) this setup boots slower than BMC64 and needs to be shut down properly, but that's the price to pay for not having the glitch. I also found other cartridge images which can read the REU in BMC64 and don't have this glitch (e.g. Superfluid).

Thanks again rhester72 and others.