hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

Hardware Transform can crash PPSSPP on Ouya in some game events #3574

Closed g1t-dlanor closed 10 years ago

g1t-dlanor commented 10 years ago

I've tested "Class of Heroes" on the Ouya, using PPSSPP v0.9.1 for Android in several beta builds up to "ppsspp-v0.9.1-408-gc05a2b6-android.apk". All builds that I've tested have one problem in common: a crash out to the Ouya's main menu at certain unusual graphic events.

The most common occurrences are when the player party arrives at a quest objective in a dungeon, meeting a special foe or other NPC. (special graphic events unlike what's normal in the dungeons)

It can also occur directly after using a torch or light spell. (This modifies the visual range, and thus affects much of the video rendering.)

A workaround for the problem is to deactivate the game setting "Hardware Transform". This makes the emulation slower but seemingly safe from this type of crash.

Best regards: dlanor

xsacha commented 10 years ago

Hi. Can you check whether this still happens on the latest version, v0.9.5-89+?

g1t-dlanor commented 10 years ago

I just saw your request for re-testing this, and will do so ASAP. I'll download the latest buildbot beta you mentioned and test this on my OUYA right away. I'll be back here with my results shortly.

However, these results may not be 100% certain, as the original bug was not 100% consistent in striking... But I'll do my best to repeat the same kind of things that used to trigger the bug.

xsacha commented 10 years ago

I realise the bug was random in nature. That matches up with a bug that was fixed. Hope this is it.

g1t-dlanor commented 10 years ago

Unfortunately this bug (or a newborn close relative) is still present in the emulator, though it struck in slightly different scenes than the ones where I previously saw it strike. (Some scenes are hard to revisit the same way.)

I did manage to make a savestate 'around' which I can persistently trigger the bug, and thus also verify that this bug never strikes with the 'Hardware Transform' setting turned off, just like in my earlier test series.

This reliable bug trigger position is with my characters standing in front of a warp gate leading to the deep temple of one of the larger dungeons (Yamhaus). If I warp through that gate to the deep temple with hardware transform on, then the emulator crashes nearly every time. Somehow I got through once, never repeated, and even then the emulator still crashed as I tried to open the in-game menu after arriving in the temple (I wanted to add another savestate for this unusual result). I've also tested warping through the gate with that transform setting off, which always works fine, but if I then set it on after arriving in the temple, then the emulator still crashes when I exit from the in-game menu back to the game (so instead I drop out to the OUYA main menu, as always for these crashes).

Note that all of the tests made today were done with my character team having activated a magic torch ('Lumigan' spell), which was proven to make triggering of the bug easier in my earlier test series as well, probably because it modifies the 3D rendering of distant objects (possibly increasing the number of objects visibly rendered).

If you wish I can transfer the savestate I made to a PC and upload it somewhere for your own experiments, though you'll need an OUYA to test the bug in the same environment that I have.

g1t-dlanor commented 10 years ago

Oooops! I accidentally closed the issue temporarily, when I meant to just close the ongoing comment. Clicking the 'Comment' button didn't work at all for some reason, so I tried the other one which did post my text, but also closed the issue, which I never intended. I hope this doesn't mess things up for any bug tracking...

xsacha commented 10 years ago

Could I have the save game? I don't have an Ouya, but this bug likely affects all ARM devices.

g1t-dlanor commented 10 years ago

Sorry for the delay, but I've been a bit busy with other things today. I'll extract the savestate from the OUYA and prepare a ZIP for upload somewhere. I do have a small share site provided by my ISP which I can use for this. I'll be back here a bit later with a download link for that ZIP.

g1t-dlanor commented 10 years ago

OK, I've now prepared a shared folder on the site I mentioned in my last post here, and in it I've placed the ZIP containing the savestate I mentioned. Note that it's for savestate slot 2, so you need to select that before loading it in-game.

The following link is to the shared folder I prepared for PPSSPP stuff, so the same link may be reused later if I have any further feedback files for you.

https://secure1.storegate.com/Shares/Home.aspx?ShareID=bea69de4-d62b-4434-bbcf-1ca84c4d0c20

However, I suspect this bug may be more OUYA-specific than you thought, as I've tested this savestate without any crashing on another Android unit I have which also has an ARM CPU (It's a DMTech == PIPO Tablet). Still, even if you can't test this savestate on an OUYA yourself, perhaps you can get some other PPSSPP coder to help you with that, and he may then be able to give you more relevant coding feedback than I can do.

It could be a GPU problem, rather than a CPU problem, which makes sense when it's triggered by actions that cause differing graphics rendering. (Both the hardware transform setting, and the in-game lighting of torches etc.) The OUYA only has a Tegra3 (a much criticized design point), and I don't know for sure what my tablet has. None of the SysInfo apps I have display any GPU info for it.

thedax commented 10 years ago

Has this improved at all? Is it still crashing during these events?

g1t-dlanor commented 10 years ago

It still crashes this way in all versions I've tested, the last such version being v0.9.5-1021. I know there have been some additional builds since then, but I've not yet tested those.

However, I may need to generate new testing savestates, as the one I've used is now showing another symptom too. Directly on loading that savestate I get a weird beeping sound which is not normal for the game itself.

unknownbrackets commented 10 years ago

I don't have this game, but some questions that might be interesting if answered:

-[Unknown]

g1t-dlanor commented 10 years ago

1: It never crashes on PC.
2: The same savestate does not crash on other platforms I've used (PC and my Android Tablet)
3: On the OUYA this game crashes in similar scenes even if no savestates were ever used
4: It does not crash when loaded on my Pipo Tablet (Android 4.1.1)
5: Disabling the vertex cache DOES help !!!

I tried several times now on the OUYA, and it only crashed when vertex cache was enabled.

Btw: The crash with this savestate does not occur directly when loading the savestate. But when the RPG character group is moved one step forward to enter a teleport gate, and the player confirms the gating, then the picture shifts briefly to an image of the local scenery, after which the program crashes (to OUYA menu).

unknownbrackets commented 10 years ago

There were a few bugs fixed, that maybe could have affected this - some games were specifying texture coordinates in a way we didn't support. Usually this meant old memory was used, but maybe the Ouya driver was less happy about our mistakes.

Does this still happen with the latest git (and vertex cache on)?

-[Unknown]

g1t-dlanor commented 10 years ago

I've now downloaded and tested the latest build v0.9.6-691, and you may be right as I was able to use the old savestate and successfully teleport to the deeper dungeon of Yamhaus without crashing, even with vertex cache on. But an old savestate like that can't be fully relied on for validating a bug cure, especially as the game did crash in a different way shortly after the teleport.

I restarted and reloaded a proper gamesave, and entered the top level of Yamhaus, intending to generate a new savestate for testing this problem, but then I discovered that this is impossible due to a new bug that crashes the game before I am able to get far enough into the dungeon.

This new bug strikes even if both 'Vertex Cache' and 'Hardware Transform' are off, and when it strikes the display and all other aspects of the emulator just freeze. The only way out of this freeze is to double-click the OUYA button and force terminate the emulator with the 'Exit' command in the OUYA's popup menu.

The trigger for that bug appears to be the exiting from the main menu after using any of its submenus while inside a dungeon. Above ground the menus seem to work normally, but in a dungeon I just have to open the main menu, open a submenu (like 'Items') then close that submenu (here the main menu still works) and then the main menu, and then everything freezes.

This means that I am unable to get deep enough into the dungeon to test the older bug properly, since I'd have to use the menu system many times on the way there.

Now what ? Should I open a new issue for this new bug, or just leave the info here, since both these bugs appear to be specific to this one game, and may have causes in common ???


Additional tests made on a Win7 PC show that this new bug is not OUYA-specific. I get exactly the same results on the PC, and force terminating the emulator is still the only way out of the freeze, though in Windows I can do it just by pressing Alt-F4 to close PPSSPP.

In any case, since this new bug is not platform-specific anyone with access to the 'Class of Heroes' game should be able to test this bug. Like I said above, triggering the bug is very simple:

1: Send a team to explore a dungeon
2: Open the main action menu
3: Open a submenu, like 'Item'
4: Close the submenu
5: Close the main menu
6: Observe the freeze

KingPepper commented 10 years ago

@g1t-dlanor, could you create a log on the PC version? With that, the developers may be able to see what's causing the freezes.

g1t-dlanor commented 10 years ago

I have added two log files to the same shared folder that I linked to in an earlier post, to provide others with the savestate I used for testing the old bug.

To simplify things I'm repeating that URL here: https://secure1.storegate.com/Shares/Home.aspx?ShareID=bea69de4-d62b-4434-bbcf-1ca84c4d0c20

The new files are named:
Frozen_DebugLog_ppsspplog.zip
Frozen_InfoLog_ppsspplog.zip

These logs were created in two sessions by launching DebugLog.bat and InfoLog.bat respectively, and then proceeding in identical manner for each session.

1: In PPSSPP main screen I launched 'Class of Heroes' from the recent games list
2: In the game I pressed start a few times to reach the point where I could load a proper gamesave, which I did.
3: I then used the game's 'Labyrinth' menu to let my team enter the Yamhaus dungeon
4: On arrival in the dungeon I opened the main menu and its submenu for 'Items' and then immediately closed each of the two menus, at which the game and emulator froze, becoming completely unresponsive.
5: I used Alt-F4 to terminate PPSSPP, and renamed the current ppsspplog.txt file according to which kind of log it contained, and then packed it into a ZIP for upload.

Hopefully someone else can find something useful in these logs, because I certainly can't... :(

unknownbrackets commented 10 years ago

I wonder if it's from the ffmpeg update, actually. A comparative log from a version of ppsspp for Windows that works would probably help.

If you could narrow down to the first version that works and last that doesn't, it would help immensely. Unfortunately, I don't have the game - but with it, you can run let's say v0.9.6-2 and v0.9.6-500. If let's say -2 works and -500 doesn't, try -251. If that works also, then you only have to test 251-500, so next is 376. In this way, you can test 1000 changes in about 10 tries (2000 in 11, etc.)
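The interval halving described above can be sketched in a few lines of Python. This is only an illustration: `first_bad_build` and `freezes` are hypothetical names, the "first bad build" of 300 is made up, and the sketch assumes that every build after the first bad one is also bad.

```python
import math
from typing import Callable, Tuple


def first_bad_build(good: int, bad: int, is_bad: Callable[[int], bool]) -> Tuple[int, int]:
    """Return (first bad build, number of builds that had to be tested).

    Assumes `good` is known to work, `bad` is known to fail, and every
    build after the first failing one fails as well.
    """
    tries = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tries += 1
        if is_bad(mid):
            bad = mid    # the regression is at or before mid
        else:
            good = mid   # the regression is after mid
    return bad, tries


def freezes(build: int) -> bool:
    """Hypothetical stand-in for 'install this build and try to trigger the freeze'."""
    return build >= 300  # made-up answer, for illustration only


if __name__ == "__main__":
    build, tries = first_bad_build(2, 500, freezes)
    print(f"first bad build: {build}, builds tested: {tries}")
    # Upper bound on tests for a range of n changes: ceil(log2(n))
    print(math.ceil(math.log2(1000)), math.ceil(math.log2(2000)))
```

With a known-good build 2 and a known-bad build 500, the first build tested is 251, as in the example above. The `ceil(log2(n))` bound is where the "1000 changes in about 10 tries (2000 in 11)" figure comes from.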

Anyway, from the debug log I see:

22:01:703 user_main    D[KERNEL]: HLE\sceKernelSemaphore.cpp:398 0=sceKernelWaitSema(285, 1, 0)
22:01:703 TaskThread   D[KERNEL]: HLE\sceKernelThread.cpp:3293 Context switch: user_main -> TaskThread (276->308, pc: 088be04c->088bdf0c, sema waited) +12us

This is the last thing user_main ever does. Time of death, XX:22:01.703.

It seems to be TaskThread's job to wake user_main, and it doesn't do its job there. It's still alive, but it goes to wait on a mutex first:

22:01:704 FMOD SAS upd D[KERNEL]: HLE\sceKernelThread.cpp:3293 Context switch: TaskThread -> FMOD SAS update/mix thread (308->296, pc: 08913354->0896b208, mutex waited) +6816us

That mutex is FMOD stream thread's job (I like the clear separation of roles in this code, good developer.) That thread stops doing its job right here:

22:01:599 FMOD stream  D[ME]: HLE\sceAtrac.cpp:900 sceAtracGetRemainFrame(1, 0bf9f45c)
22:01:599 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF: GHA Phase shifting
22:01:599 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF:  is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
22:01:599 FMOD stream  E[ME]: HLE\sceAtrac.cpp:608 avcodec_decode_audio4: Error decoding audio -1163346256
22:01:599 FMOD stream  D[ME]: HLE\sceAtrac.cpp:683 80630024=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)

It's probably trying to decode garbage, it's unlikely that "GHA Phase shifting" is the real problem here.
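The "who stopped when" reading of the log above can be roughly automated. The sketch below finds the last logged line for each thread. The line layout is an assumption inferred only from the excerpts quoted here, and `parse_line` and `last_activity` are hypothetical helper names, not part of PPSSPP.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed layout, inferred from the quoted excerpts only:
#   MM:SS:mmm <thread name> <level>[<category>]: <file>:<line> <message>
LINE_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{3}) (.+?)\s+([A-Z])\[(\w+)\]: (\S+?):(\d+) (.*)$"
)


def parse_line(line: str) -> Optional[Tuple[int, str, str]]:
    """Return (milliseconds, thread, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    minutes, seconds, millis = int(m.group(1)), int(m.group(2)), int(m.group(3))
    stamp = (minutes * 60 + seconds) * 1000 + millis
    return stamp, m.group(4), m.group(9)


def last_activity(log_text: str) -> Dict[str, Tuple[int, str]]:
    """Map each thread name to the (timestamp, message) of its last logged line."""
    last: Dict[str, Tuple[int, str]] = {}
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is not None:
            stamp, thread, message = parsed
            last[thread] = (stamp, message)
    return last


# Lines copied from the excerpts in this thread.
SAMPLE = r"""
22:01:599 FMOD stream  D[ME]: HLE\sceAtrac.cpp:683 80630024=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)
22:01:703 user_main    D[KERNEL]: HLE\sceKernelSemaphore.cpp:398 0=sceKernelWaitSema(285, 1, 0)
22:01:703 TaskThread   D[KERNEL]: HLE\sceKernelThread.cpp:3293 Context switch: user_main -> TaskThread (276->308, pc: 088be04c->088bdf0c, sema waited) +12us
22:01:704 FMOD SAS upd D[KERNEL]: HLE\sceKernelThread.cpp:3293 Context switch: TaskThread -> FMOD SAS update/mix thread (308->296, pc: 08913354->0896b208, mutex waited) +6816us
"""

if __name__ == "__main__":
    ordered = sorted(last_activity(SAMPLE).items(), key=lambda kv: kv[1][0])
    for thread, (stamp, message) in ordered:
        print(f"{stamp:>8} ms  {thread:<12}  {message[:60]}")
```

Sorting by last timestamp puts the thread that went quiet first at the top. This is only a heuristic: the timestamps carry no hour, so a log that crosses an hour boundary would need extra handling, and a thread that is silent is not necessarily stuck.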

-[Unknown]

g1t-dlanor commented 10 years ago

----- re: new logs with non-freezing PPSSPP version
I will do that a bit later tonight and post the new logs in the same place as before. Unfortunately it will not be possible to end the logs in a moment corresponding exactly to the freezing point in the old logs. The closest I can get is to perform the test exactly like before, and at the time when the freezing would occur in build 691 I have to make one more action (like opening the RPG's menu again) so as to verify that the game still responds before I terminate the run with Alt-F4 as in the last test series.

----- re: Pinpoint by interval halving
I'm well familiar with the method, though the original purpose I was trained to use it for had nothing to do with computers. (Fine-tuning weapon sights in the Swedish Navy, some 40 years ago...)

I'll use that method in an attempt to pinpoint the precise version where this freezing began.

----- re: the log details for the freezing thread
If I understand you correctly here, the main suspect for causing the freeze would be part of the new Atrac decoder, though it remains uncertain whether it was really fed garbage by the game or if it somehow failed to deal correctly with proper data generated by the game. But even if fed garbage data it should be modified so as to terminate the processing of it without hanging.

Like I said above, I'll start the new testing a bit later tonight (in a few hours), and when I have some useful results (probably an hour or so after starting) I'll post here again, after uploading the new logs to my sharing site.

unknownbrackets commented 10 years ago

Thanks. Garbage becomes a complicated point: if Sony's sceAtrac library returns an error code for garbage, and we don't, it could break games. Games may depend on receiving this error to properly fill music data.

If there is garbage data causing this problem, most likely it is due to flaws in our emulation of sceAtrac. It's unlikely that this specific game actually sends garbage when run on a PSP, especially since it doesn't seem to handle the error well.

But yes, it could definitely be a bug in the decoder misinterpreting the atrac stream.

-[Unknown]

g1t-dlanor commented 10 years ago

I have now uploaded four new log files to my sharing site (same as linked above). https://secure1.storegate.com/Shares/Home.aspx?ShareID=bea69de4-d62b-4434-bbcf-1ca84c4d0c20

For the latest PC build that has NOT got the freezing bug:
-- non-freezing_build-594_DebugLog.zip
-- non-freezing_build-594_InfoLog.zip

For the oldest PC build that DOES have the freezing bug:
-- new-freezing_build-600_DebugLog.zip
-- new-freezing_build-600_InfoLog.zip

These logs and all other tests in this pinpointing series were made using a similar procedure for each case, but with slight variations in how the bug triggering was done. For some cases I found that just entering and exiting the item menu was not enough, so I had to try some additional menu operations, or even cast a magic spell to make sure to trigger the bug in some cases. Rest assured however, that I did all the same things and far worse in those PPSSPP builds that I found myself unable to trigger the bug for, so those really are certain to be free of it.

You will find that these logs are shorter than the old ones, as the new test procedure used a savestate made inside the Yamhaus dungeon, so these logs don't include the steps needed to get there.

I do not have access to any PC builds between 594 and 600, but I did test the Android builds 598 (no freezing) and 600 (same freezing as on PC). And judging by the comments on the buildbot site I conclude that the most likely cause of the problem is the update of ffmpeg libs listed for build 599 (unfortunately without any binary to test), though it could of course be some other change not included in the comments.

But unless those changes included some to the Atrac handling, I think our earlier guesses were misdirected...

I'm beginning to think that it's an ffmpeg bug that triggers the Atrac lockup somehow.

unknownbrackets commented 10 years ago

I wonder if this is related to #4250.

It seems like the exact same problem is happening in another game: #5286. Since it seems to be the same, let's track this problem there.

I guess we can't fully rule out this bug (the hardware transform crash) until that is fixed, but at least there's a pretty good chance it's gone.

-[Unknown]

g1t-dlanor commented 10 years ago

----- re: relation to #4250
It seems like a similar problem, but obviously not with identical causes, since that problem was reported as fixed, while my freezing continues in all recent versions.

----- re: similar problem in #5286
It does seem to have related causes and effects, but it can not be identical, since the fix mentioned at the current end of that thread apparently changed that problem from a hangup to a mere stutter, which does not happen for my test cases (tested with build 706). I still get the same full freezing as before, not just of the game but of all emulator responses, requiring Alt-F4 to force-terminate the application.

I'm reluctant to move the discussion of a still fatal bug to another issue regarded as partially fixed. Perhaps it is time to open a new separate issue for this bug.

----- re: OUYA-specific hardware transform crash
That crash does seem to be fixed by recent changes, as tested by using the old savestate with the new build 706 and teleporting to the inner temple of the Yamhaus labyrinth. Before some recent changes this would crash out to OUYA's main menu, but now it works fine allowing the party to arrive in the temple area. However, as soon as an enemy attacks or just after some menu operations in the dungeon, the new freezing bug strikes, making the game unplayable.

unknownbrackets commented 10 years ago

Okay. With the patch, it should change from this:

22:01:599 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF: GHA Phase shifting
22:01:599 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF:  is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
22:01:599 FMOD stream  E[ME]: HLE\sceAtrac.cpp:608 avcodec_decode_audio4: Error decoding audio -1163346256
22:01:599 FMOD stream  D[ME]: HLE\sceAtrac.cpp:683 80630024=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)

And not return 80630024. At least for the other game, this 80630024 is why it was locking up (from the log, and proven by the change.) From the logs, it seemed quite likely that Class of Heroes would be locking up for the same reason.
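In these debug logs the value printed before `=` is what the call returned, so checking whether a build still returns 80630024 can be done mechanically. The following is a minimal sketch, assuming the `<hex value>=<function>(...)` shape seen in the excerpts; `hle_results` is a hypothetical helper, not a PPSSPP tool, and the second sample line is the one reported later in this thread for build 713.

```python
import re
from typing import List, Tuple

# Assumed shape of a logged HLE call, taken from the excerpts in this thread:
#   <1-8 hex digits>=<function name>(<arguments>)
CALL_RE = re.compile(r"\b([0-9a-fA-F]{1,8})=(sce\w+)\(")


def hle_results(log_text: str, function: str) -> List[Tuple[int, bool]]:
    """Return (return value, returned_zero) for each logged call to `function`."""
    results = []
    for match in CALL_RE.finditer(log_text):
        if match.group(2) == function:
            value = int(match.group(1), 16)
            results.append((value, value == 0))
    return results


# One line from before the patch and one from after it, as quoted in this thread.
SAMPLE_CALLS = r"""
22:01:599 FMOD stream  D[ME]: HLE\sceAtrac.cpp:683 80630024=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)
23:03:183 FMOD stream  D[ME]: HLE\sceAtrac.cpp:686 00000000=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)
"""

if __name__ == "__main__":
    for value, ok in hle_results(SAMPLE_CALLS, "sceAtracDecodeData"):
        print(f"0x{value:08x}", "zero" if ok else "non-zero")
```

A zero result only shows that the call no longer reports an error. It does not show that the audio was decoded correctly.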

If it's still hanging with that patch (which is in the latest git build now, btw), what does the log look like now?

-[Unknown]

g1t-dlanor commented 10 years ago

I have just tested this bug again with the latest build available from the buildbot, currently build 713. With that build I get exactly the same freezing bug response as before and with similar log strings. I did not see that '80630024' constant though. Nor do I get exactly the same error messages you quoted. A similar debuglog section near the end of the file looks like this here:

23:03:183 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF: GHA Phase shifting
23:03:183 FMOD stream  I[ME]: HW\MediaEngine.cpp:84 FF:  is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
23:03:183 FMOD stream  E[ME]: HLE\sceAtrac.cpp:608 Unsupported feature in ATRAC audio.
23:03:183 FMOD stream  D[ME]: HLE\sceAtrac.cpp:686 00000000=sceAtracDecodeData(1, 09740780, 0bf9f454, 0bf9f458, 0bf9f45c)
23:03:183 idle0        D[KERNEL]: HLE\sceKernelThread.cpp:3207 Context switch: FMOD stream thread -> idle0 (305->272, pc: 089140f0->08000000, atrac decode data) +1us

You can check the new log files yourself at the same download site as before.

My current conclusion is that the unimplemented FFmpeg feature causes the ATRAC decoder to go nuts, in slightly different ways depending on what the game is doing.

Btw: I'm not really sure if build 713 contains the patch you spoke of. Unfortunately I am unable to compile PPSSPP sources downloaded from github myself, since the PPSSPP dev environment requirements are in conflict with stuff already installed on my main PC. So if you want me to test some new patch then it must be available on the buildbot site or be provided to me by you.

unknownbrackets commented 10 years ago

"Unsupported feature in ATRAC audio." is from my patch, so that means you have it. And indeed, it's returning 0 now (success.)

It seems like the problem here is that the audio contains GHA phase shifting either in a lot of places, or at the end. One way or another, skipping that frame doesn't work - there are either only more bad frames, or no more frames. That or maybe I skipped incorrectly in some case I'm not seeing...

Anyway, this is likely to be improved with that feature implemented in ffmpeg. Sounds like @maximumspatium is looking into it (per the issue about the problem.)

Hmm, I didn't realize our build requirements were a problem for anyone. Anything specific that causes a problem?

-[Unknown]

g1t-dlanor commented 10 years ago

Since this type of freezing appeared with an update of ffmpeg I agree that it should be fixable there. Just let me know if/when you want me to test this problem with some future PPSSPP build.

As to the build requirements, I don't recall exactly what error messages I got, but I was unable to install/update the Visual Studio compilers to the versions required for PPSSPP. And with wrong compiler versions I won't even try it, as the results can't be relied upon then. (Works fine to compile PCSX2 though...)

unknownbrackets commented 10 years ago

Right now, you need to have Visual Studio 2010 installed (Express is fine.) I think SP1 is required. Pretty sure that's about it for 32-bit, as we ship a mini directx dependency. You can also upgrade the project in newer Visual Studio versions (such as 2012 and 2013) and then compile there without 2010 installed at all.

For 64-bit, I think you need to install either a Windows SDK (platform SDK they used to call it) or DirectX SDK, I don't remember. I have both installed.

For best results, have git on your path or in the standard place Git for Windows installs it. This isn't required, but allows the build id to be automatically updated.

-[Unknown]

g1t-dlanor commented 10 years ago

I do have Visual Studio Express installed, both the 2008 and the 2010 versions, that's what I've used for PCSX2. (Though I can't build all subprojects with this, as some stuff REQUIRES a pro version. But I can recompile the main program, when I have prebuilt binaries for all the libs.) I also have both the platform SDK and the DirectX SDK on my main PC, which is running Win7pro_x64.

I tried to follow the precise instructions for the PPSSPP dev environment, as given on the home site, but this attempt broke down at some step where the compiler components had to be updated. That update just errored out as they were in conflict with already installed components. And uninstalling my current dev environments to start over from scratch is out of the question.

I've given up on that approach, though I plan to try again some time, using a 'clean' VMware machine to do it with. That way there can be no conflicts with existing setups.

unknownbrackets commented 10 years ago

If you install Visual Studio 2013 Express for Windows Desktop, you should pretty much be all set with the latest version of PPSSPP. Everything else is included in submodules, so it should be easy to compile.

Anyway, it sounds like this Ouya bug is fixed. If there's a new bug we should probably create a new issue. This game is using GHA phase shifting, per the forum, so the other problems definitely seem related to #5286.

-[Unknown]