felixdoerre / primus_vk

Vulkan GPU-offloading layer
BSD 2-Clause "Simplified" License

Newer loader work breaks with steamoverlay (e.g. in ASTRONEER or Shadow of War) #53

Closed grazzolini closed 4 years ago

grazzolini commented 4 years ago

I updated Arch's package to version 1.2 and found that it breaks one specific game for me, ASTRONEER. I bisected the history to find the commit responsible, and 134535c4d60cc1d3271d825ade08470540af539b appears to trigger it.

I don't know if it breaks any other games, but I've tested a few others, using steam/proton/dxvk, and the ones I've tested worked fine.

What would you need in terms of debugging to help find out why this newer loader breaks only this game?

felixdoerre commented 4 years ago

The first question would be: what exactly does "breaks" mean? A segfault and crash? A segfault during startup? The game refusing to start with an error message? Something else?

If there is a segfault, a stack trace would be best: build primus_vk with debug symbols, then use gdb to generate the trace. The game's stdout could also hint at how far initialization got and what went wrong.

More concrete steps for debugging the Proton game: you can configure Steam to run the game with PROTON_DUMP_DEBUG_COMMANDS=1 to get a "runner" in "/tmp/proton_$USER/run" that starts the game manually and lets you see stdout/stderr (for documentation, see https://github.com/ValveSoftware/Proton#runtime-config-options). To run the game (including Proton) under gdb, I usually patch this runner by changing the last line from .../dist/bin//wine" "${@:-${DEF_CMD[@]}}" to .../dist/bin//wine" gdb --args "${@:-${DEF_CMD[@]}}".
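The runner patch described above could also be scripted. The following is a minimal sketch, assuming the runner's last non-empty line ends exactly as quoted in the comment; the helper names `insert_gdb` and `patch_runner` are hypothetical and not part of Proton or primus_vk.

```python
from pathlib import Path

# Tail of the runner's last line, and its replacement, as quoted above.
OLD_TAIL = 'wine" "${@:-${DEF_CMD[@]}}"'
NEW_TAIL = 'wine" gdb --args "${@:-${DEF_CMD[@]}}"'


def insert_gdb(line: str) -> str:
    """Return the runner line with `gdb --args` inserted after the wine binary."""
    stripped = line.rstrip()
    if not stripped.endswith(OLD_TAIL):
        # Assumption violated: the script has a different layout.
        raise ValueError("unexpected runner line; patch it by hand instead")
    return stripped[: -len(OLD_TAIL)] + NEW_TAIL


def patch_runner(path: Path) -> None:
    """Rewrite the last non-empty line of the runner script in place."""
    lines = path.read_text().splitlines()
    last = max(i for i, text in enumerate(lines) if text.strip())
    lines[last] = insert_gdb(lines[last])
    path.write_text("\n".join(lines) + "\n")
```

Calling `patch_runner(Path("/tmp/proton_<user>/run"))` with the real user name filled in would apply the change; the function raises instead of guessing if the last line does not match.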

grazzolini commented 4 years ago

I can get you logs of the failure, for sure. It's not a segfault from what I've seen, but I'll recompile primus_vk with -g and see what pops up. Regarding Proton's debug commands, here's where things get interesting: if I use the runner that's created in /tmp, the game loads, but then my machine freezes completely. Not even the magic SysRq keys work, so I can't get any output from that.

I'll try later to generate the logs and rebuild with debug.

grazzolini commented 4 years ago

OK, here is the log of the game when launched through Steam. You can get most of my relevant system information from here. By the way, this is the game running with the debug build of primus_vk and the following launch option in Steam:

PROTON_DUMP_DEBUG_COMMANDS=1 PROTON_LOG=1 pvkrun %command%

steam-361420.log

When I use the unmodified ./run script that's created in /tmp, the game now loads and my machine no longer freezes, which is puzzling. It works even with the script modified as you suggested. So the game fails to load only when launched from Steam; it works when launched through the run scripts.

Edited to add:

The scripts created by PROTON_DUMP_DEBUG_COMMANDS aren't very useful as they stand, because they don't include pvkrun. So it seems the game is running, but on the Intel iGPU, not the NVIDIA card. When I change the script to actually call pvkrun, the symptom is the same, but it doesn't segfault, as you can see in the log above, so gdb is not very useful here.

felixdoerre commented 4 years ago

Yes, the Proton debug scripts only cover the "run with Steam and Proton" part and leave out any additional configuration. Sorry that I failed to explain that. I generally find them useful, especially for games that refuse to start unless launched from Steam. The run scripts set all the environment variables so that the game believes it was started from Steam, which allows you to run and debug the game directly. Why your machine freezes when you run it with gdb is unclear to me.

So it's pretty clear that it crashes in the Steam overlay:

0x00007f20050eb7c2 vkCreateSwapchainKHR+0xffffffffffffffff() in steamoverlayvulkanlayer.so (0x0000000000000001)
0x00007f20050eb7c2 vkCreateSwapchainKHR+0xffffffffffffffff in steamoverlayvulkanlayer.so: movq  0x0000000000000018(%rax),%rax

Have you tried disabling the steam overlay with DISABLE_VK_LAYER_VALVE_steam_overlay_1=1 as an additional environment variable?

I also observed something similar in another game. It crashes as well, but in a different function of the Steam overlay (vkQueueSubmit).
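When starting the dumped runner by hand, the pvkrun wrapper and the overlay-disabling variable mentioned above both have to be supplied explicitly, since the runner contains neither. A minimal sketch of assembling that launch, assuming `pvkrun` is on the PATH; `build_launch` is a hypothetical helper name:

```python
import os


def build_launch(runner, base_env=None, use_pvkrun=True, disable_overlay=True):
    """Return (argv, env) for starting a dumped Proton runner by hand."""
    # Copy the environment so the caller's (or the process's) is not modified.
    env = dict(os.environ if base_env is None else base_env)
    if disable_overlay:
        # Variable name taken from the comment above.
        env["DISABLE_VK_LAYER_VALVE_steam_overlay_1"] = "1"
    # The dumped runner does not include pvkrun, so prepend it here.
    argv = (["pvkrun"] if use_pvkrun else []) + [runner]
    return argv, env
```

The result can be passed to `subprocess.run(argv, env=env)`. Toggling `disable_overlay` gives a quick way to compare runs with and without the overlay layer.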

grazzolini commented 4 years ago

It works when running with DISABLE_VK_LAYER_VALVE_steam_overlay_1=1. Do you think this might be a bug in the Steam overlay for Vulkan? I can report it upstream if it is.

felixdoerre commented 4 years ago

I think it is, but I don't have the source code of the overlay to verify and check what's happening. When you report this issue upstream could you share a reference to the report so I can keep an eye on it? (maybe I can help, if this turns out to be a primus_vk issue after all)

grazzolini commented 4 years ago

This also seems to affect non-Proton games. Dota 2, Dota Underlords and KSP all fail to work with primus_vk 1.2 here, even when passing DISABLE_VK_LAYER_VALVE_steam_overlay_1=1.

felixdoerre commented 4 years ago

I believe I found the issue (and it is a primus_vk issue). Could you try 9b8c449 and see if the problem persists?

grazzolini commented 4 years ago

That one fixes ASTRONEER for sure. The funny thing is, I don't think I had tried Dota 2 or KSP with Vulkan before, but neither works with primus_vk (I'm not sure if it's related). Dota Underlords, which is pure Vulkan, doesn't work with primus_vk either. I have backported that commit and pushed a new package on Arch.

Having said that, I think there's still some work to do on native games.

felixdoerre commented 4 years ago

I've just tried KSP, and I can't find an option for the game to use Vulkan; the "normal" primusrun variant for OpenGL seems to work well. I've also just installed Dota 2 with the "Vulkan DLC" and launched it with -vulkan, and that has no problems on my system either. Dota Underlords also works well with primus_vk here. I can't launch it myself from the command line, because I don't seem to be able to initialize the "steam runtime" correctly, but launching it from Steam works flawlessly.

So all in all this looks more like a setup issue than an actual bug. (And I think you can mark the comment for ASTRONEER on Proton regarding the overlay as resolved.)

grazzolini commented 4 years ago

For KSP you have to run it using: pvkrun %command% -force-vulkan, here are the options.

For Dota 2, I have the Vulkan DLC and I can run it with Vulkan on my Intel iGPU, but if I run pvkrun %command% with Vulkan, it doesn't work. The OpenGL version runs fine with pvkrun (because it falls back to primus).

Same for Underlords: it runs fine on the iGPU, but not with pvkrun. And since it's pure Vulkan, I can't get it running on my NVIDIA card. If it helps, I can attach my system info here.

felixdoerre commented 4 years ago

Hmm... the vulkan mode of my KSP seems broken, even when I launch without primus_vk.

 ./KSP.x86_64 -force-vulkan
Set current directory to /home/user/Steam/SteamLinuxLibrary/steamapps/common/Kerbal Space Program
Found path: /home/user/Steam/SteamLinuxLibrary/steamapps/common/Kerbal Space Program/KSP.x86_64
Mono path[0] = '/home/user/Steam/SteamLinuxLibrary/steamapps/common/Kerbal Space Program/KSP_Data/Managed'
Mono config path = '/home/user/Steam/SteamLinuxLibrary/steamapps/common/Kerbal Space Program/KSP_Data/Mono/etc'
Preloaded 'ScreenSelector.so'
Preloaded 'libkeyboard.so'
Preloaded 'liblingoona.grammar.kerbal.so'
Preloaded 'libsteam_api.so'
Unable to preload the following plugins:
    ScreenSelector.so
    libkeyboard.so
    liblingoona.grammar.kerbal.so
    libsteam_api.so
Player data archive not found at `/home/user/Steam/SteamLinuxLibrary/steamapps/common/Kerbal Space Program/KSP_Data/data.unity3d`, using local filesystem
Logging to /home/user/.config/unity3d/Squad/Kerbal Space Program/Player.log
Xlib:  extension "NV-GLX" missing on display ":0".

So I cannot test what primus_vk does wrong here. As for the other games: Do they crash? Do they hang? Can you provide a backtrace if they crash?

grazzolini commented 4 years ago

They crash:

Dota 2

Process 83219 (dota2) of user 1000 dumped core.

                                                           Stack trace of thread 83219:
                                                           #0  0x00007f24a6b99c90 vkGetPhysicalDeviceProperties (libvulkan.so.1)
                                                           #1  0x00007f24a71b91b6 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/librendersystemvulkan.so)
                                                           #2  0x00007f24b091c033 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #3  0x00007f24b091cfb5 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #4  0x00007f24b06ba2db n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #5  0x00007f24b06bcf06 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #6  0x00007f24b05fcc3b n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #7  0x00007f24b05fd123 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/libengine2.so)
                                                           #8  0x0000562f6148beda n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/dota2)
                                                           #9  0x00007f24b1004ee3 __libc_start_main (libc.so.6)
                                                           #10 0x0000562f6148bf89 n/a (/home/lock/.local/share/Steam/steamapps/common/dota 2 beta/game/bin/linuxsteamrt64/dota2)

Underlords:

Process 83880 (underlords) of user 1000 dumped core.

                                                           Stack trace of thread 83880:
                                                           #0  0x00007fa8f4e64c90 vkGetPhysicalDeviceProperties (libvulkan.so.1)
                                                           #1  0x00007fa8f5484206 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/librendersystemvulkan.so)
                                                           #2  0x00007fa8febe84a3 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #3  0x00007fa8febe9425 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #4  0x00007fa8fe98685b n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #5  0x00007fa8fe989486 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #6  0x00007fa8fe8c91db n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #7  0x00007fa8fe8c96c3 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/libengine2.so)
                                                           #8  0x000055aee6e3eeda n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/underlords)
                                                           #9  0x00007fa8ff2d1ee3 __libc_start_main (libc.so.6)
                                                           #10 0x000055aee6e3ef89 n/a (/home/lock/.local/share/Steam/steamapps/common/Underlords/game/bin/linuxsteamrt64/underlords)

And KSP is giving me the same output as yours.
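The two traces above can be compared mechanically. The sketch below parses frame lines in the format shown and summarises frame #0; the helper names are hypothetical, and the page-offset comparison assumes 4 KiB pages, so a matching offset is only consistent with both crashes hitting the same instruction and does not prove it.

```python
import re

# One frame line as printed in the traces above, for example:
#   "#0  0x00007f24a6b99c90 vkGetPhysicalDeviceProperties (libvulkan.so.1)"
FRAME_RE = re.compile(r"#(\d+)\s+0x([0-9a-fA-F]+)\s+(\S+)\s+\((.+)\)\s*$")


def parse_frames(trace: str):
    """Return (index, address, symbol, module) tuples for each frame line."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            index, addr, symbol, module = match.groups()
            frames.append((int(index), int(addr, 16), symbol, module))
    return frames


def crash_site(trace: str):
    """Summarise the first frame as (symbol, module, offset within a 4 KiB page)."""
    _, addr, symbol, module = parse_frames(trace)[0]
    # Libraries are mapped page-aligned, so the low 12 bits survive ASLR.
    return symbol, module, addr & 0xFFF
```

Applied to the Dota 2 and Underlords traces, `crash_site` returns the same symbol, module and page offset for both, which fits the observation that the two games fail at the same place in libvulkan.so.1.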

felixdoerre commented 4 years ago

The location looks very similar to the stack trace from #51. However, I still don't know what is wrong there.

grazzolini commented 4 years ago

So, today I got some time to dig a little deeper into this issue. I have vulkan-intel and primus_vk installed, as I need to in order to run games using bumblebee + primus. If I run pvkrun vulkaninfo, it detects both cards, my Intel as the first and my NVIDIA as the second, and it lists the layers. At some point, vulkaninfo segfaults. I had noticed this before and talked with the vulkan-tools maintainer for Arch, who told me that was normal. Since primus_vk worked for the games I was playing at the time, I didn't consider it important.

So, today I tried running things with nvidia-xrun. From within that Xorg server I got the same result when running vulkaninfo: it lists both cards, with the NVIDIA first and the Intel second (expected, because of nvidia-xrun), but it segfaults as well. This means, of course, that my icd.d directory has the three JSON files: Intel's, primus_vk's and NVIDIA's.

So I removed both vulkan-intel and primus_vk. I then ran vulkaninfo: it detects only my NVIDIA card and doesn't segfault. Then I tried to run Dota 2 and, to my surprise, it runs flawlessly. Same with Dota Underlords.

I think there's a bigger issue at play here than primus_vk alone. And, by the way, it doesn't work if I force the order of the ICDs; I need to completely remove both vulkan-intel and primus_vk for it to work under nvidia-xrun.
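To see which ICD manifests the Vulkan loader can pick up in a setup like the one described, the icd.d directories can be listed along with the driver library each manifest points to. This is a minimal sketch: the two directory paths are an assumption about common install locations, the helper names are hypothetical, and only the standard `ICD.library_path` field of the manifest format is read.

```python
import json
from pathlib import Path

# Common manifest locations; an assumption, distributions may use others.
DEFAULT_DIRS = ("/usr/share/vulkan/icd.d", "/etc/vulkan/icd.d")


def icd_library(manifest_text: str):
    """Return the library_path of an ICD manifest, or None if it has none."""
    try:
        data = json.loads(manifest_text)
    except ValueError:
        return None  # not valid JSON
    icd = data.get("ICD") if isinstance(data, dict) else None
    return icd.get("library_path") if isinstance(icd, dict) else None


def list_icds(dirs=DEFAULT_DIRS):
    """Return (manifest path, library_path) pairs for every *.json found."""
    found = []
    for directory in map(Path, dirs):
        if not directory.is_dir():
            continue  # skip locations that do not exist on this system
        for manifest in sorted(directory.glob("*.json")):
            found.append((str(manifest), icd_library(manifest.read_text())))
    return found
```

Printing `list_icds()` before and after removing a package shows exactly which manifests changed. The loader also honours the VK_ICD_FILENAMES environment variable to restrict which manifests it loads; whether that would help in this case is untested here.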

felixdoerre commented 4 years ago

> At some point, vulkaninfo segfaults

I think that's not good. I just debugged my vulkaninfo (it turned out to segfault as well when run with primus_vk) and found the cause of the segfault. Here is a fix: bfed91d. Now vulkaninfo terminates successfully both with and without primus_vk on my system.

Could you try the fix and see whether it improves running Dota on your system?

bno1 commented 4 years ago

About the vulkaninfo crash, I get the crash when running ENABLE_PRIMUS_LAYER=1 primusrun vulkaninfo but not when running ENABLE_PRIMUS_LAYER=1 optirun -b primus vulkaninfo.

EDIT: this happened before felixdoerre's fix. I think it's strange how optirun and primusrun behave differently.

grazzolini commented 4 years ago

I have applied that patch, and indeed I can run vulkaninfo with pvkrun without any segfault. But Dota 2 and Dota Underlords won't run regardless.

I have switched my configuration to use the nvidia card only, without vulkan-intel + bumblebee + primus + primus_vk, and both games play just fine using the nvidia card. But, that's not optimal, because I'm not getting any of the benefits of running the dgpu only when needed.

Edited to add: @bno1, optirun doesn't work with primus_vk, as far as I know. So when you run ENABLE_PRIMUS_LAYER=1 optirun -b primus vulkaninfo, it's as if you were running vulkaninfo directly.

felixdoerre commented 4 years ago

From my point of view ENABLE_PRIMUS_LAYER=1 optirun -b none, ENABLE_PRIMUS_LAYER=1 optirun -b primus and ENABLE_PRIMUS_LAYER=1 primusrun should all work. I usually start applications with optirun and it works fine.

@grazzolini Could you please run dota 2 / dota underlords with gdb and validate the contents of the physicalDevice parameter to vkGetPhysicalDeviceProperties?

felixdoerre commented 4 years ago

I am closing this issue, as the original problem (mentioned in the issue's title) is now resolved and the segfault seems to be similar to #51.

grazzolini commented 4 years ago

Yes, this is ok to close. Also, primus_vk on Arch backports the patches.

felixdoerre commented 4 years ago

I think I will release those changes in the next few days as v1.3. I had hoped to solve the other segfault, but I was not able to reproduce it on a clean Arch live system. So for now I'm still cleaning up other things and will then release v1.3.