hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

(Libretro) Core crashing in RetroArch when using Fast-Forward with Vulkan video driver #13578

Open Ryunam opened 4 years ago

Ryunam commented 4 years ago

What happens?

I have recently updated the PPSSPP libretro core to the latest version, which includes some fixes that have made it possible to use the Vulkan video driver in RA.

It works well. However, when using Vulkan and holding the button assigned to the fast-forward option in RA, this core tends to crash. It is difficult to pinpoint exactly how long you need to fast-forward before encountering this crash, but I tried it with two games (Tactics Ogre and 3rd Birthday), and with both games the core crashed after I held fast-forward continuously for 5-6 seconds at the menu screen.

This does not happen with any other core I've tested with Vulkan so far, so it seems to be specific to the libretro implementation of PPSSPP.

What should happen?

The PPSSPP libretro core should not crash when using Fast-forward with the Vulkan video driver.

What hardware, operating system, and PPSSPP version? On desktop, GPU matters for graphical issues.

OS: Windows 10 x64 (latest build)
CPU: i7 4790K
GPU: GTX 2070 Super
Using latest RA compiled from master.

Ryunam commented 4 years ago

Closing this for now. Upon doing a more thorough analysis this seems to be related to RetroArch itself - and not the core. I was probably inadvertently hitting the Screenshot button while fast-forwarding and this led to discovering the problem at hand.

hrydgard commented 4 years ago

No, I do believe this is related to our kinda hacky Vulkan integration with Retroarch. aliaspider I think was gonna take a look.

Ryunam commented 4 years ago

The fact is that this is actually happening with all cores, not just PPSSPP. I opened an issue here describing exactly how to reproduce it and, as confirmed also by other users on Discord, this occurs with all cores that support Vulkan: https://github.com/libretro/RetroArch/issues/11483

I don’t know though if any modification was recently done in RA to make PPSSPP work with Vulkan that would lead to this global problem, but I doubt that is the case. I tried a version of RA from March and this was already happening back then...

Ryunam commented 3 years ago

I just wanted to share a quick update regarding this. I stand corrected: the issue that I had talked about (taking a screenshot while fast-forwarding with Vulkan) is indeed happening with all libretro cores on RA. However - unrelated to that - there is definitely a problem with some sudden crashes occurring when simply using the Vulkan video driver in general with the libretro version of PPSSPP.

Some games specifically ("Tactics Ogre Let Us Cling Together" and "The 3rd Birthday", to name a few that I've tested) will trigger a crash after simply using Fast-forward for a little while on the intro movie.

bslenul commented 3 years ago

Reposting from my duplicate:

While making the 2 videos for https://github.com/hrydgard/ppsspp/issues/13115#issuecomment-890869162 , I noticed that RetroArch crashes a lot while fast-forwarding with Vulkan (I had to switch to GLcore). It can happen after a second, sometimes after 30 seconds; it's a bit random. It seems to happen only with Vulkan: I couldn't trigger the crash with GLcore or D3D11.

This is what I get in the logs:

[libretro ERROR] [SYSTEM] (../Common/GPU/Vulkan/VulkanRenderManager.cpp:VulkanRenderManager::Submit:1258) Critical: [false] Lost the Vulkan device in vkQueueSubmit! If this happens again, switch Graphics Backend away from Vulkan

And the crash log:

retroarch_drmingw.exe caused a Breakpoint at location 00007FFDFB2F666D in module ppsspp_libretro.dll.

AddrPC           Params
00007FFDFB2F666D 00000222ED598E18 00000222E4B9DCB0 0000022200000400  ppsspp_libretro.dll!retro_load_game_special
00007FFDFB2F5E80 00000000000003E0 0000000000000301 0000000000000002  ppsspp_libretro.dll!retro_load_game_special
00007FFDFB2F6820 00000222DC4E9300 0000000000000000 0000000000000000  ppsspp_libretro.dll!retro_load_game_special
00007FFDFB2F0E3F 00000222E3CE0150 0000000000000000 0000000000000000  ppsspp_libretro.dll!retro_load_game_special
00007FFDFBA5D4F4 0000000000000000 0000000000000000 0000000000000000  ppsspp_libretro.dll!retro_set_video_refresh
00007FFE3DDD7034 0000000000000000 0000000000000000 0000000000000000  KERNEL32.DLL!BaseThreadInitThunk
00007FFE3E4C2651 0000000000000000 0000000000000000 0000000000000000  ntdll.dll!RtlUserThreadStart

bslenul commented 3 years ago

After some advice from @fr500 I built the core with DEBUG=1 and re-generated a crash log:

retroarch_drmingw.exe caused a Breakpoint at location 00007FFDB81C90E7 in module ppsspp_libretro.dll.

AddrPC           Params
00007FFDB81C90E7 0000014FB3E1B4F8 0000000000000001 0000014FC7E0BE01  ppsspp_libretro.dll!VulkanRenderManager::Submit  [G:\msys64\home\B-S\ppsspp\Common\GPU\Vulkan\VulkanRenderManager.cpp @ 1258]
  1256: res = vkQueueSubmit(vulkan_->GetGraphicsQueue(), 1, &submit_info, triggerFrameFence ? frameData.fence : frameData.readbackFence);
  1257: if (res == VK_ERROR_DEVICE_LOST) {
> 1258: _assert_msg_(false, "Lost the Vulkan device in vkQueueSubmit! If this happens again, switch Graphics Backend away from Vulkan");
  1259: } else {
  1260: _assert_msg_(res == VK_SUCCESS, "vkQueueSubmit failed (main, split=%d)! result=%s", (int)splitSubmit_, VulkanResultToString(res));
00007FFDB81C8B25 0000014FB3E1B4F8 0000014F00000001 0000014FB3E1B888  ppsspp_libretro.dll!VulkanRenderManager::EndSubmitFrame  [G:\msys64\home\B-S\ppsspp\Common\GPU\Vulkan\VulkanRenderManager.cpp @ 1278]
  1276: frameData.hasBegun = false;
  1277: 
> 1278: Submit(frame, true);
  1279: 
  1280: if (!frameData.skipSwap) {
00007FFDB81C4E91 0000014FB3E1B4F8 0000004700000001 00007FFDB8C79008  ppsspp_libretro.dll!VulkanRenderManager::Run  [G:\msys64\home\B-S\ppsspp\Common\GPU\Vulkan\VulkanRenderManager.cpp @ 1322]
  1320: switch (frameData.type) {
  1321: case VKRRunType::END:
> 1322: EndSubmitFrame(frame);
  1323: break;
  1324: 
00007FFDB81C44A0 0000014FB3E1B4F8 0000000000000000 0000000000000000  ppsspp_libretro.dll!VulkanRenderManager::ThreadFunc  [G:\msys64\home\B-S\ppsspp\Common\GPU\Vulkan\VulkanRenderManager.cpp @ 430]
   428: firstFrame = false;
   429: }
>  430: Run(threadFrame);
   431: VLOG("PULL: Finished frame %d", threadFrame);
   432: }
00007FFDB81CA77D 0000014FB418D9C8 0000014FB418D9C0 0000000000000000  ppsspp_libretro.dll!std::invoke<void (__cdecl VulkanRenderManager::*)(void),VulkanRenderManager *>  [G:\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include\type_traits @ 1601]
  1599:         return (_Arg1.get().*_Obj)(static_cast<_Types2&&>(_Args2)...);
  1600:     } else if constexpr (_Invoker1<_Callable, _Ty1>::_Strategy == _Invoker_strategy::_Pmf_pointer) {
> 1601:         return ((*static_cast<_Ty1&&>(_Arg1)).*_Obj)(static_cast<_Types2&&>(_Args2)...);
  1602:     } else if constexpr (_Invoker1<_Callable, _Ty1>::_Strategy == _Invoker_strategy::_Pmd_object) {
  1603:         return static_cast<_Ty1&&>(_Arg1).*_Obj;
00007FFDB81CA1F0 0000014FB418D9C0 0000000000000000 0000000000000000  ppsspp_libretro.dll!std::thread::_Invoke<std::tuple<void (__cdecl VulkanRenderManager::*)(void),VulkanRenderManager *>,0,1>  [G:\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include\thread @ 55]
    53:         const unique_ptr<_Tuple> _FnVals(static_cast<_Tuple*>(_RawVals));
    54:         _Tuple& _Tup = *_FnVals;
>   55:         _STD invoke(_STD move(_STD get<_Indices>(_Tup))...);
    56:         _Cnd_do_broadcast_at_thread_exit(); // TRANSITION, ABI
    57:         return 0;
00007FFDFE164BFC 0000014FC654F490 0000000000000000 0000000000000000  ucrtbased.dll!_register_onexit_function
00007FFE3DDD7034 0000000000000000 0000000000000000 0000000000000000  KERNEL32.DLL!BaseThreadInitThunk
00007FFE3E4C2651 0000000000000000 0000000000000000 0000000000000000  ntdll.dll!RtlUserThreadStart

The core didn't want to build at first with DEBUG=1; the linker gave a warning about default libs:

LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library

So after reading this: https://docs.microsoft.com/en-us/cpp/error-messages/tool-errors/linker-tools-warning-lnk4098?view=msvc-160, I've added /NODEFAULTLIB:libcmt.lib /NODEFAULTLIB:msvcrt.lib /NODEFAULTLIB:libcmtd.lib here: https://github.com/hrydgard/ppsspp/blob/87723abdebd69e5482c051662b201a3c5d76e16d/libretro/Makefile#L624 and it worked.

Hopefully I did this properly...

unknownbrackets commented 3 years ago

That's interesting, but unfortunately Vulkan makes it a bit difficult to trace those issues backward (a tradeoff for speed.)

It could mean that there was an issue with those specific commands, or even that the device ran out of memory. It can also mean a usage issue. Basically, it's a catch-all error from the Vulkan API that means "Vulkan is broken now."

It probably means that fast forwarding isn't dealing with frames correctly in some way.

-[Unknown]
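
For readers following the trace, the check around VulkanRenderManager.cpp line 1258 can be illustrated with a minimal, self-contained sketch. It does not use the real Vulkan headers: `SubmitResult`, `SubmitOutcome` and `ClassifySubmit` are hypothetical stand-ins, and only the numeric values are taken from the corresponding `VkResult` codes. It is meant to show how a device-lost result differs from the more specific failures mentioned above, not how PPSSPP actually handles them.

```cpp
// Stand-in for a few VkResult codes so the sketch builds without the Vulkan SDK.
// The numeric values match the Vulkan headers; the type itself is hypothetical.
enum class SubmitResult : int {
    Success = 0,             // VK_SUCCESS
    OutOfHostMemory = -1,    // VK_ERROR_OUT_OF_HOST_MEMORY
    OutOfDeviceMemory = -2,  // VK_ERROR_OUT_OF_DEVICE_MEMORY
    DeviceLost = -4,         // VK_ERROR_DEVICE_LOST
};

// Coarse buckets a caller might care about (hypothetical).
enum class SubmitOutcome { Ok, OutOfMemory, DeviceLost, OtherFailure };

// Hypothetical helper: sort a queue-submit result into the cases discussed above.
SubmitOutcome ClassifySubmit(SubmitResult res) {
    switch (res) {
    case SubmitResult::Success:
        return SubmitOutcome::Ok;
    case SubmitResult::OutOfHostMemory:
    case SubmitResult::OutOfDeviceMemory:
        return SubmitOutcome::OutOfMemory;
    case SubmitResult::DeviceLost:
        // The device cannot be used any more; the cause is not reported.
        return SubmitOutcome::DeviceLost;
    }
    // Any code not listed above.
    return SubmitOutcome::OtherFailure;
}
```

In the trace, the device-lost branch is the one that fires the assert, which is why the log only says the device was lost and gives no further cause.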

gouchi commented 3 years ago

It seems I can't reproduce this issue on Linux.

unknownbrackets commented 3 years ago

Is it possible #14674 fixed this?

-[Unknown]

bslenul commented 3 years ago

Just double checked after updating the core, it still crashes randomly on Windows.

Sanaki commented 3 years ago

I can verify this on Linux as well. The issue is not consistently triggered, but it usually occurs fairly quickly.

stuken commented 3 years ago

Got a few findings regarding this crash.

  1. I've found a potential fix in the RetroArch Vulkan driver. It's unfortunately a shotgun blast when a scalpel is needed. The ffwd toggle eventually filters down to here https://github.com/libretro/RetroArch/blob/db3f0a8468a0df3f516b39c03545f95b26cd9987/gfx/drivers/vulkan.c#L1363 Whatever side effect is triggered from that call eventually leads to the Vulkan device lost errors. I can't trace any further as there's a whole heap of opaque double pointers past that and I don't have enough knowledge of the driver code to make much further progress. Commenting it out does fix the device loss errors though.
  2. It also appears to be vendor related. I can't reproduce this crash on my AMD APU or Intel Xe device, while it's easily triggered on my RTX 3090 by spamming ffwd as soon as a game starts. Does the NVIDIA driver not like swap intervals being changed in flight?

unknownbrackets commented 3 years ago

Most likely that goes here from some quick searching: https://github.com/libretro/RetroArch/blob/9e84c5c2c8a8d4334bd48380c76c0bffa572e36c/gfx/drivers_context/w_vk_ctx.c#L90

That's at least on Windows; there are similar callbacks on other platforms. Presumably, this causes it to recreate the swapchain.

My guess is this happens on a different thread from rendering, or something, and eventually the recreation happens while another thread is trying to do something with the swapchain and things go bad. In PPSSPP proper, we handle resizes at set points in the emulation loop, to avoid this.

-[Unknown]
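
The idea of handling such changes only at set points in the loop can be sketched roughly as follows. This is an illustration under stated assumptions, not code from PPSSPP or RetroArch: `SwapIntervalRequest`, `RenderLoop` and their members are hypothetical names, and the swapchain recreation is reduced to a counter. Another thread only records the requested interval; the render loop picks it up between frames, so the swapchain is never touched while a frame is being submitted.

```cpp
#include <atomic>

// Hypothetical sketch: none of these names are RetroArch or PPSSPP APIs.
class SwapIntervalRequest {
public:
    // May be called from any thread, e.g. when fast-forward is toggled.
    void Request(int interval) {
        pending_.store(interval, std::memory_order_release);
    }

    // Called only by the render thread, at a fixed point between frames.
    // Returns true and writes the new interval if a change was pending.
    bool Consume(int *interval) {
        int value = pending_.exchange(kNone, std::memory_order_acq_rel);
        if (value == kNone)
            return false;
        *interval = value;
        return true;
    }

private:
    static constexpr int kNone = -1;  // sentinel: no change requested
    std::atomic<int> pending_{kNone};
};

struct RenderLoop {
    SwapIntervalRequest request;
    int currentInterval = 1;
    int swapchainRecreations = 0;  // stand-in for real swapchain recreation

    void RunFrame() {
        int wanted = currentInterval;
        // Safe point: no frame is in flight on this thread right now.
        if (request.Consume(&wanted) && wanted != currentInterval) {
            currentInterval = wanted;
            ++swapchainRecreations;
        }
        // ... record and submit the frame here ...
    }
};
```

With this shape, several toggles arriving during one frame collapse into a single change applied before the next frame. Whether that matches what the RetroArch driver needs is an open question in this thread.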

DReaper commented 1 year ago

I am getting this same issue with the Beetle HW core in Steam for PS1 games. It crashes within seconds.

No crash in OpenGL or Software mode.

unknownbrackets commented 1 year ago

This GitHub issue is for PPSSPP, not about any PS1 emulator. It's probably at least somewhat unrelated to your issue, but maybe it's just the same libretro issue. For PPSSPP at least, if you want fewer crashes you're probably better off using PPSSPP proper.

-[Unknown]