Closed jdgleaver closed 3 years ago
Also happens with linux standalone version.
FYI still trying to figure this one out. I see it, but it's rather difficult to reproduce reliably. I can only do postmortem analysis, which hasn't led to insights pointing to the real problem. Currently this is the showstopper for 1.97 :-/
Thank you for the update - I really appreciate you taking the time to look into this.
Damn shame that it's proving so elusive - this kind of thing is the very worst sort of bug... I wish I knew how to help... :(
I think it's a synchronisation issue. Both 68K CPUs wait for something in the Sega CD gate array registers. Only thing is, AFAIR I haven't changed the 68K poll detection mechanism, and I also haven't changed anything in FAME. Concerning the gate array, I have only changed some CD related things, which aren't polled by the CPUs.
Is this really a new issue, or has it occurred with 1.93 as well, possibly less frequently?
I've got a notion what's possibly happening. Here's a kludge that might help. Could you try this patch and report if this is better:
Apologies for not replying sooner (I've had one heck of a day...)
I went back to notaz's upstream repo, and the exact same issue happens there as well. I've attempted to bisect it, and as far as I can tell, this is the first commit where the hang started occurring: https://github.com/notaz/picodrive/commit/12f23dac6f91eb707f985ef00a5d48e9e5ef8838
Anyway, I'll test your patch right away...
Okay, this is great - with your patch applied, I can no longer reproduce any hangs in Silpheed. I played through the first level (and into the second) 5 times with zero issues - without the patch it was hanging consistently within the first 30 seconds or so.
ok, that sort of verifies my findings. Thank you very much for your help, especially with the bisecting. I believe I have now understood what the root cause is, indeed a synchronisation problem between the CPUs. I have to dig a bit deeper into the pico/cd code to learn how to fix this in the right way.
Ah, this is excellent news! Thank you so much for your continued efforts. Your expertise is simply invaluable.
I do hope the 'correct' fix doesn't turn out to be too onerous and time consuming. Good luck!
Just to keep you in the loop, I've tracked this one down to an unfortunate combination of a polling loop interrupted by a vertical interrupt, leaving the other cpu enough cycles to execute a change of the polled value at the wrong time. My kludge still holds, and by my findings I verified that 16 main cpu cycles (or 24 sub cpu cycles, since it's clocked about 50% faster) is apparently the minimum needed. I'm however not exactly satisfied - it's really a mysterious coincidence that the VINT hits exactly in that time window. I'm currently communicating with @notaz about this issue. If nothing new comes up there, I'm going to commit a kludge to get v1.97 out of the door. It has been sitting there far too long.
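For reference, the 16 vs. 24 cycle figures above follow from the clock ratio quoted in that comment. A minimal sketch of the arithmetic, assuming the "about 50% faster" approximation from the thread (the ratio constant and function names are illustrative only, not taken from the Picodrive source):

```python
# Assumption from the thread: the sub CPU is clocked about 50% faster
# than the main CPU, i.e. roughly 1.5 sub cycles per main cycle.
SUB_PER_MAIN = 1.5


def main_to_sub_cycles(main_cycles, ratio=SUB_PER_MAIN):
    """Convert a main-CPU cycle count to the equivalent sub-CPU cycle count."""
    return round(main_cycles * ratio)


def sub_to_main_cycles(sub_cycles, ratio=SUB_PER_MAIN):
    """Convert a sub-CPU cycle count back to main-CPU cycles."""
    return round(sub_cycles / ratio)


if __name__ == "__main__":
    # Minimum window reported in the thread: 16 main CPU cycles.
    print(main_to_sub_cycles(16))
```

With the assumed ratio, 16 main CPU cycles correspond to 24 sub CPU cycles, matching the numbers given above.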
Thank you for the update! That does indeed sound like a fishy coincidence...
But even if nothing else develops, your kludge is thoroughly researched and tested. You may be unsatisfied, but I have great faith in you - I know you will do whatever is best for the codebase :)
A small, related question: I know the most important thing is the v1.97 release, and that must take priority - so please don't think about us until that's all wrapped up. But did you have any further ideas on how we should go about updating the libretro repo with your fixes, including this one? Anything you want to do is fine with us, and I will help in any way I can. But again - this is a question for after the v1.97 release :)
Well, I'll stop further investigation here. I still believe this coincidence is strange to say the least, but I've been unable to find a reason for it. I'll say this is fixed for Silpheed. If you have other games where this was happening I would appreciate a check. I was having a conversation with notaz about this, and he said wolfteam games had this code sequence as well. If you find anything please reopen.
Regarding libretro, yes, I have... let me roll out this 1.97 over the weekend, then we should talk elsewhere. How about having a chat over this on discord? You can find me under the same name. BTW there's one last small thing before 1.97 which popped up in my conversation with notaz. I believe I can finish with that this weekend, so, 1.97 will be there RSN.
Excellent! Once again, I cannot thank you enough for your work on this. I truly appreciate all the time you have spent investigating this issue, and your fixes are perfect as far as I can tell :)
The only other game that I managed to hang more than once was Championship Soccer '94 (and that was a very rare event). I will test this game as much as I can, and if I do randomly encounter the problem again I will certainly let you know.
And yes, we can continue the libretro discussion on discord.
When running Sega CD content, Picodrive will hang randomly. The emulator does not crash - the menu can still be opened, and behaves normally - but while the game is running the display is blanked and the audio becomes either a flat tone or silence. This happens quite rarely, and seems to be associated with FMV playback (I have seen it a number of times during game intros). The issue only occurs reliably when playing Silpheed (USA) (Redump) - Picodrive will almost always hang at some point during the first level.
I have observed this behaviour on two platforms:
OpenDingux (RG350M) with both the standalone version of Picodrive and the RetroArch core.
Linux (OpenSUSE Leap 15.2) using the RetroArch core.
When running the RetroArch core, nothing is output to the log or terminal when the issue occurs, and the hung content can be closed normally.
This is a RetroArch save state taken while a hang is in progress: Silpheed (USA).zip
Please let me know if I can help in any way with debugging this issue.