libretro / RetroArch

Cross-platform, sophisticated frontend for the libretro API. Licensed GPLv3.
http://www.libretro.com
GNU General Public License v3.0

[Vulkan] On the second piece of content that is loaded in a current session, RA will crash when selecting "close content" from the quick menu #8062

Open felixlang opened 5 years ago

felixlang commented 5 years ago

Description

RetroArch crashes about 90% of the time when "Close Content" is selected from the quick menu for the second piece of content loaded in the current session. The first piece of content loaded closes successfully and returns to the main menu.

This is the crash data provided by Windows:

Problem signature:
  Problem Event Name: APPCRASH
  Application Name: retroarch.exe
  Application Version: 0.0.0.0
  Application Timestamp: 00000000
  Fault Module Name: nvfatbinaryLoader.dll
  Fault Module Version: 25.21.14.1735
  Fault Module Timestamp: 5c0f54c3
  Exception Code: c0000005
  Exception Offset: 0000000000042705
  OS Version: 6.1.7601.2.1.0.256.48
  Locale ID: 1033
  Additional Information 1: ecb5
  Additional Information 2: ecb5bcbbb13d343c98267132fb0fdef4
  Additional Information 3: b876
  Additional Information 4: b876943fdcfefeb51d7e7b5b2080bf0a
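
For readers unfamiliar with Windows crash reports, the following Python sketch (not part of the original report) splits the signature into fields and names the exception code. The helper names `parse_signature` and `describe_exception` and the small `NTSTATUS_NAMES` table are illustrative assumptions; the field values are copied from the report above. Code c0000005 is the NTSTATUS value for an access violation.

```python
# Small lookup of well-known NTSTATUS codes (illustrative, not exhaustive).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
}

# Fields copied from the problem signature in this report.
REPORT = """\
Problem Event Name: APPCRASH
Application Name: retroarch.exe
Fault Module Name: nvfatbinaryLoader.dll
Fault Module Version: 25.21.14.1735
Exception Code: c0000005
Exception Offset: 0000000000042705
"""


def parse_signature(text):
    """Turn 'Key: value' lines into a dict, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def describe_exception(fields):
    """Map the hexadecimal exception code to a symbolic name if it is known."""
    code = int(fields["Exception Code"], 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


sig = parse_signature(REPORT)
print(sig["Fault Module Name"], describe_exception(sig))
```

This only labels the fault. It does not show whether the bad access originates in the driver itself or in how RetroArch uses it.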

This is the crash log from retroarch_debug.exe:

crash-190122-083725.log

This crash occurs on every core. Specific cores tested:

This crash also happens regardless of how the content is loaded within RA - from the history playlist, from the system playlist, or directly from the hard drive using the "Load Content" menu.

In approximately 10% of RA sessions, this crash will not occur no matter how much content is loaded and closed. I have no idea why that happens, but I know that nothing is being materially altered by me between a crashed session and a successful session:

In other words, if 10 sessions of RA are started and stopped in succession without any interruption, the results will be crash, crash, crash, no crash, crash, crash, and so on.

Expected behavior

RetroArch closes the content and returns to the main menu.

Actual behavior

RetroArch crashes.

Steps to reproduce the bug

  1. Start RetroArch
  2. Load any content with any core
  3. Close content from the quick menu
  4. Load any other content with any core
  5. Close content from the quick menu
  6. Crash (RA does not successfully close the content)

Bisect Results

I do not remember if this happened in all previous versions of RA that I have tried, but I'm mostly certain that it was also a problem in 1.7.4. It also occurs on the latest nightly build (1-21-2019).

The log file is attached from the most recent session, using the 1-21-2019 nightly build. The final line in the log ("[INFO] [Vulkan]: VSync => on") is always the same with every crash.

retroarch-log.txt

Version/Commit

Environment information

inactive123 commented 5 years ago

I cannot reproduce this with the current master.

I followed your steps, except that I loaded the games from the history list. Try this on your end to see if it makes any difference.

felixlang commented 5 years ago

> I cannot reproduce this with the current master.
>
> I followed your steps, except that I loaded the games from the history list. Try this on your end to see if it makes any difference.

Tried this and no change. RetroArch still crashes when trying to close the second piece of content, even when both items are loaded from the history list instead of from each system's playlist.

The crash also still happens even if both items are loaded directly from the hard drive using the "Load Content" menu.

inactive123 commented 5 years ago

I'm using Windows 10 for the record.

ghost commented 5 years ago

Can you paste a backtrace using the debug build? Either using Dr. Mingw or gdb.

felixlang commented 5 years ago

> Can you paste a backtrace using the debug build? Either using Dr. Mingw or gdb.

Sure. Sorry I didn't think to do that earlier. The log is attached from the most recent crash (used Dr.MinGW).

crash-190122-083725.log

orbea commented 5 years ago

Here is a similar issue that was fixed. I would have thought they were the same?

https://github.com/libretro/RetroArch/issues/4107

orbea commented 5 years ago

Does this only happen with Vulkan, and not crash if you use OpenGL?

felixlang commented 5 years ago

> Does this only happen with Vulkan, and not crash if you use OpenGL?

That is correct. I switched to the GL video driver just now on the 1-21-2019 build and ran through the procedure half a dozen times just to make sure, and did not get a single crash. So it is only a Vulkan issue for me.
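
A rough plausibility check on the OpenGL result, as a Python sketch: assuming each session crashes independently at the roughly 90% rate given in the description, and reading "half a dozen" as six sessions (both are assumptions, not measurements), six clean runs in a row would be very unlikely to occur by chance.

```python
# Assumed per-session crash probability, taken from the ~90% figure in the report.
P_CRASH = 0.9
# "Half a dozen" runs, read here as six independent sessions (an assumption).
RUNS = 6


def prob_all_clean(p_crash, runs):
    """Probability that `runs` independent sessions all finish without a crash."""
    return (1.0 - p_crash) ** runs


p = prob_all_clean(P_CRASH, RUNS)
print(f"chance of {RUNS} clean sessions by luck: {p:.1e}")
```

Under these assumptions the chance is on the order of one in a million, which is consistent with the crash being specific to the Vulkan driver path rather than a lucky streak.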

> Here is a similar issue that was fixed. I would have thought they were the same?
>
> #4107

That appears to be a different problem, and also one that has been fixed. That bug involved loading a second core while a different core was already loaded and running. When I went through the steps provided in that thread, I did not experience a crash, only a brief error message from RA.

orbea commented 5 years ago

I would guess that since other people cannot reproduce this and it's Vulkan-only, it may be a video driver issue.

Are you using Vulkan with Nvidia or Intel? Does it stop happening when using the other one?

felixlang commented 5 years ago

> I would guess that since other people cannot reproduce this and it's Vulkan-only, it may be a video driver issue.
>
> Are you using Vulkan with Nvidia or Intel? Does it stop happening when using the other one?

I've only ever used my 980 Ti for graphics on my PC. I've never even installed the driver for the integrated Intel graphics.

I assume though that the answer would be yes - there would be no problems with the Intel graphics, based on the crash report that indicates the problem with nvfatbinaryLoader.dll. I don't know what the purpose of that dll file is, but Google searching at least makes it clear that it is directly related to Nvidia's GPU drivers.

ghost commented 5 years ago

Are you using the threaded video setting? If so, try without it. Are your Nvidia drivers up to date? This could be a driver bug.

felixlang commented 5 years ago

> Are you using the threaded video setting? If so, try without it. Are your Nvidia drivers up to date? This could be a driver bug.

Threaded video is disabled. I have no trouble achieving full speed with the cores I use so there's never been any reason to enable it.

My current GPU driver is 417.35, which was released on December 12, 2018. It's not the latest version, but the chance of that being the cause of this crash is effectively zero, because it's been a problem for several months (going back to at least October 2018 when RA 1.7.5 was released).

orbea commented 5 years ago

Could you try a newer nvidia driver in case they fixed it?

felixlang commented 5 years ago

> Could you try a newer nvidia driver in case they fixed it?

I updated to the latest GPU driver today (417.71), but the crashing problem in RA is still present.

AaronBPaden commented 5 years ago

I'm also getting a crash on Linux when selecting "close content". In this case, it appears to only happen when I select a shader. @felixlang are you using shaders? Does it crash if you aren't?

Here is a stack trace.

           PID: 21465 (retroarch)
           UID: 1000 (aaron)
           GID: 1000 (aaron)
        Signal: 6 (ABRT)
     Timestamp: Thu 2019-03-14 15:13:45 CDT (12s ago)
  Command Line: retroarch
    Executable: /usr/bin/retroarch
 Control Group: /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
          Unit: user@1000.service
     User Unit: gnome-terminal-server.service
         Slice: user-1000.slice
     Owner UID: 1000 (aaron)
       Boot ID: cf1c894c776e47f78a3ae27b1337a5ad
    Machine ID: 7de708c9f6e4468680e120c34daef981
      Hostname: localhost
       Storage: /var/lib/systemd/coredump/core.retroarch.1000.cf1c894c776e47f78a3ae27b1337a5ad.21465.1552594425000000.lz4
       Message: Process 21465 (retroarch) of user 1000 dumped core.

                Stack trace of thread 21465:
                #0  0x00007f69576eed7f raise (libc.so.6)
                #1  0x00007f69576d9672 abort (libc.so.6)
                #2  0x00007f69576d9548 __assert_fail_base.cold.0 (libc.so.6)
                #3  0x00007f69576e7396 __assert_fail (libc.so.6)
                #4  0x00005645783df469 _ZN7glslang10InitThreadEv (retroarch)
                #5  0x00005645784627a3 _ZN7glslang7TShader10preprocessEPK16TBuiltInResourcei8EProfilebb11EShMessagesPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS0_8IncluderE (retroarch)
                #6  0x0000564578389942 _ZN7glslang13compile_spirvERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_5StageEPSt6vectorIjSaIjEE (retroarch)
                #7  0x0000564578385a1b _Z22glslang_compile_shaderPKcP14glslang_output (retroarch)
                #8  0x000056457836120c vulkan_filter_chain_create_from_preset (retroarch)
                #9  0x000056457834ead8 vulkan_init_filter_chain_preset (retroarch)
                #10 0x000056457834f950 vulkan_init (retroarch)
                #11 0x00005645781370ad video_driver_init (retroarch)
                #12 0x00005645780db4da drivers_init (retroarch)
                #13 0x00005645780de536 retroarch_main_init (retroarch)
                #14 0x00005645780f8338 content_load (retroarch)
                #15 0x00005645780f8ec8 task_push_start_dummy_core (retroarch)
                #16 0x00005645780e9644 command_event (retroarch)
                #17 0x00005645782d55db action_ok_close_content (retroarch)
                #18 0x00005645782cf2fe menu_entry_action (retroarch)
                #19 0x0000564578312152 generic_menu_iterate (retroarch)
                #20 0x00005645782b500f menu_driver_iterate (retroarch)
                #21 0x00005645780e0fd0 runloop_check_state.constprop.9 (retroarch)
                #22 0x00005645780e1c80 runloop_iterate (retroarch)
                #23 0x00005645781da2c5 _ZL21ui_application_qt_runPv (retroarch)
                #24 0x00005645780d91d2 rarch_main (retroarch)
                #25 0x00007f69576db223 __libc_start_main (libc.so.6)
                #26 0x00005645780d5bce _start (retroarch)

                Stack trace of thread 21577:
                #0  0x00007f695a4f6afc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007f694f2b07f4 n/a (libvulkan_radeon.so)
                #2  0x00007f694f2b0518 n/a (libvulkan_radeon.so)
                #3  0x00007f695a4f0a9d start_thread (libpthread.so.0)
                #4  0x00007f69577b2b23 __clone (libc.so.6)

                Stack trace of thread 21578:
                #0  0x00007f695a4f6ef6 pthread_cond_timedwait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007f694f2a573b n/a (libvulkan_radeon.so)
                #2  0x00007f695a4f0a9d start_thread (libpthread.so.0)
                #3  0x00007f69577b2b23 __clone (libc.so.6)

                Stack trace of thread 21576:
                #0  0x00007f695a4f6afc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00005645780f5b84 threaded_worker (retroarch)
                #2  0x000056457831e8df thread_wrap (retroarch)
                #3  0x00007f695a4f0a9d start_thread (libpthread.so.0)
                #4  0x00007f69577b2b23 __clone (libc.so.6)
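
As a reading aid for the trace above (not part of the original comment), here is a minimal Python sketch that pulls the frame lines apart and demangles the simplest C++ symbols. The helper names `parse_frames` and `demangle_simple` are hypothetical, and the demangler handles only plain nested names with no arguments (`_ZN...Ev`); a real tool such as `c++filt` covers the full Itanium ABI scheme. The sample frames are copied from thread 21465.

```python
import re

# One frame line looks like: "#4  0x00005645783df469 symbol (module)".
FRAME_RE = re.compile(r"#(\d+)\s+0x([0-9a-f]+)\s+(\S+)\s+\(([^)]+)\)")

# A few frames copied from the crashing thread in the report.
TRACE = """\
#0  0x00007f69576eed7f raise (libc.so.6)
#1  0x00007f69576d9672 abort (libc.so.6)
#3  0x00007f69576e7396 __assert_fail (libc.so.6)
#4  0x00005645783df469 _ZN7glslang10InitThreadEv (retroarch)
#8  0x000056457836120c vulkan_filter_chain_create_from_preset (retroarch)
#17 0x00005645782d55db action_ok_close_content (retroarch)
"""


def parse_frames(text):
    """Return (index, address, symbol, module) for every frame line found."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(1)), int(m.group(2), 16),
                           m.group(3), m.group(4)))
    return frames


def demangle_simple(symbol):
    """Demangle only '_ZN<len><name>...Ev'; return anything else unchanged."""
    if not (symbol.startswith("_ZN") and symbol.endswith("Ev")):
        return symbol
    body, parts, i = symbol[3:-2], [], 0
    while i < len(body):
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            return symbol  # not a plain length-prefixed name
        n = int(body[i:j])
        if j + n > len(body):
            return symbol  # length prefix runs past the end
        parts.append(body[j:j + n])
        i = j + n
    return "::".join(parts) + "()"


frames = parse_frames(TRACE)
# Keep only frames inside the retroarch binary, innermost first.
own = [demangle_simple(sym) for _, _, sym, mod in frames if mod == "retroarch"]
print(own)
```

Read from the bottom up, the retroarch frames show the close-content menu action leading into Vulkan shader preset setup, where an assertion inside `glslang::InitThread()` aborts the process.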

AaronBPaden commented 5 years ago

Oh is this related to #4664? I thought I had seen this issue before...

felixlang commented 5 years ago

@AaronBPaden

I'm not using any shaders, so that is unrelated to my problem.

It might be related to #4664, but I doubt it. In my situation, I can always close the first game loaded into RA from the quick menu without a crash. The crashes only happen on the second loaded game.