felixdoerre / primus_vk

Vulkan GPU-offloading layer
BSD 2-Clause "Simplified" License

pvkrun unable to launch Terraria 1.4 from steam (pvkrun %command%) #67

Closed x-qq closed 4 years ago

x-qq commented 4 years ago

I am trying to launch Terraria 1.4 using pvkrun from inside steam using a custom launch command but the process immediately terminates.

In ~/.steam/error.log I see this:

primus: fatal: failed to load any of the libraries: /usr/lib/x86_64-linux-gnu/nvidia/libGL.so.1:/usr/lib/i386-linux-gnu/nvidia/libGL.so.1:/usr/lib/nvidia/libGL.so.1
/usr/lib/x86_64-linux-gnu/nvidia/libGL.so.1: cannot open shared object file: No such file or directory
/usr/lib/i386-linux-gnu/nvidia/libGL.so.1: cannot open shared object file: No such file or directory
/usr/lib/nvidia/libGL.so.1: cannot open shared object file: No such file or directory
Game removed: AppID 105600 "", ProcID 52291 
Uploaded AppInterfaceStats to Steam
Exiting app 105600

Launching the game directly as pvkrun ./Terraria.bin.x86_64 works. Launching steam as pvkrun steam and then launching the game from it (without a custom launch command) works too.

So it seems like there is some vulkan-related conflict.

primus-vk-nvidia 1.4-1 on Debian sid amd64

felixdoerre commented 4 years ago

Can you give a bit more information about what nvidia driver is installed? Maybe with the output of these commands:

$ dpkg --get-selections | grep nvidia
$ ls -als /usr/lib/*/libGLX_nvidia.*

The error itself does not contain any Vulkan-related information. To me this looks like a plain primus/glvnd-driver problem. I currently don't see any indication that this is pvkrun-related.

Additionally, are you sure that the game uses vulkan and does not just use OpenGL?

x-qq commented 4 years ago

I am using nvidia-driver 440.82-2 on kernel 5.6.14-1. The GPU is GTX 1050Ti Mobile (10de:1c8c).

% dpkg --get-selections | grep nvidia
glx-alternative-nvidia              install
libegl-nvidia0:amd64                install
libegl-nvidia0:i386             install
libgl1-nvidia-glvnd-glx:amd64           install
libgl1-nvidia-glvnd-glx:i386            install
libgles-nvidia1:amd64               install
libgles-nvidia1:i386                install
libgles-nvidia2:amd64               install
libgles-nvidia2:i386                install
libglx-nvidia0:amd64                install
libglx-nvidia0:i386             install
libnvidia-cfg1:amd64                install
libnvidia-eglcore:amd64             install
libnvidia-eglcore:i386              install
libnvidia-glcore:amd64              install
libnvidia-glcore:i386               install
libnvidia-glvkspirv:amd64           install
libnvidia-glvkspirv:i386            install
libnvidia-ml1:amd64             install
nvidia-alternative              install
nvidia-driver                   install
nvidia-driver-bin               install
nvidia-driver-libs:amd64            install
nvidia-driver-libs:i386             install
nvidia-driver-libs-i386:i386            install
nvidia-egl-common               install
nvidia-egl-icd:amd64                install
nvidia-egl-icd:i386             install
nvidia-installer-cleanup            install
nvidia-kernel-4.2.0-1-amd64         deinstall
nvidia-kernel-4.3.0-1-amd64         deinstall
nvidia-kernel-common                install
nvidia-kernel-dkms              install
nvidia-kernel-support               install
nvidia-legacy-check             install
nvidia-modprobe                 install
nvidia-persistenced             install
nvidia-settings                 install
nvidia-support                  install
nvidia-vdpau-driver:amd64           install
primus-vk-nvidia                install
primus-vk-nvidia-i386:i386          install
xserver-xorg-video-nvidia           install
% 
% ls -als /usr/lib/*/libGLX_nvidia.*
0 lrwxrwxrwx 1 root root 59 May 28 18:52 /usr/lib/i386-linux-gnu/libGLX_nvidia.so.0 -> /etc/alternatives/nvidia--libGLX_nvidia.so.0-i386-linux-gnu
0 lrwxrwxrwx 1 root root 61 May 28 18:52 /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0 -> /etc/alternatives/nvidia--libGLX_nvidia.so.0-x86_64-linux-gnu
%

I don't know for sure if the error is vulkan-related.

felixdoerre commented 4 years ago

OK, it definitely looks like the glvnd driver is correctly installed. I presume the problem is that the primus/bumblebee combination does not detect the glvnd driver correctly, and the hint that pvkrun passes in from outside does not get through to the game. I expect that the next version of the primus and bumblebee packages, which is currently being worked on, can fix this problem.

Are you using pvkrun %command% as the custom launch command? The error shown would totally make sense for primusrun %command% or optirun %command%. It looks like the environment variable PRIMUS_libGLa, which pvkrun 1.4-1 on Debian sets as a workaround for the glvnd driver, is missing.

Another test that might bring some insight would be to use the launcher that Steam itself uses (i.e. pvkrun ./Terraria). Does that work as well, or does it show the problem from the Steam output?

x-qq commented 4 years ago

Yes, I am using pvkrun %command% in steam.

Launching Terraria directly like pvkrun ./Terraria.bin.x86_64 works (I mentioned this in the first post).

felixdoerre commented 4 years ago

I suggested that you try the Terraria shell script, which is a wrapper around the binaries. I tested that locally and it seems to work for me (although the error message I get when running from Steam is different).

My theory is that this problem is caused by exactly this Terraria shell-script wrapper. You can test this by setting the launch command in Steam to: pvkrun ./Terraria.bin.x86_64 #%command% (the # turns the expanded %command% into a shell comment, so the wrapper script is bypassed). When I try this, it works as a workaround. Can you verify that as well?

x-qq commented 4 years ago

Yes, I can verify that launch command pvkrun ./Terraria.bin.x86_64 #%command% works.

Additionally, using the shellscript directly from outside steam as pvkrun ./Terraria works too.

felixdoerre commented 4 years ago

Hi, I am now pretty sure that I understand what goes wrong here. I am currently working out with some Debian developers how best to address it, as this is a Debian-specific problem. I am also fairly certain that this is not a primus-vk problem but rather a primus problem; either way, I am still trying to find a solution and get this resolved.

If you are interested in the details of what happens, please leave a note and I can write a complete explanation. Otherwise I will update this issue when I have news on when/how this will be fixed in debian.

x-qq commented 4 years ago

Hi, I would be interested in a description of what causes the problem, I am using Debian after all.

felixdoerre commented 4 years ago

OK. What is involved in the problem:

1) gameoverlayrenderer.so is loaded via LD_PRELOAD. Steam always does this for the Steam overlay. This is missing when the game is launched from outside of Steam.
2) In order to run a game with primus, we need to set LD_LIBRARY_PATH so that the libGL.so.1 from primus is loaded instead of the normal libGL.so.1.
3) The game is launched by a /bin/bash wrapper.

What happens:

1) bash is started.
2) gameoverlayrenderer.so is preloaded.
3) As gameoverlayrenderer.so directly depends on libGL.so.1, the primus libGL.so.1 is loaded as a dependency.
4) The primus libGL.so.1 sets the environment variable __GLVND_DISALLOW_PATCHING by calling putenv (https://salsa.debian.org/nvidia-team/primus/-/blob/23a9620ef33d5a623d55731c6c8ca588fdb65a28/debian/patches/glvnd.patch#L10). This is required to prevent glvnd from courageously binary-patching OpenGL API functions in memory when the OpenGL context is changed (https://github.com/NVIDIA/libglvnd/blob/bed48a107cc0d3a810ff642f5d013101ae3f4c63/src/GLdispatch/GLdispatch.c#L340). That patching is not good for primus, as it wants to access two different contexts simultaneously.
5) However, putenv is hijacked by bash: https://salsa.debian.org/debian/bash/-/blob/6a0146056618a32be35ba89e1b092eda2f2fav.c#L101 This is a mechanism (or hack?) by which bash quickly exports its own environment to library functions. Bash's own environment is not initialized yet, as bash's main() function has not been called yet; we are still initializing libraries at this point. So putenv does not really modify the environment, but only stores the variable in bash's shell_variables structure.
6) Primus calls XOpenDisplay(NULL) (which is forwarded by gameoverlayrenderer.so to libX11.so), which calls getenv("DISPLAY"). getenv is also hijacked by bash, and the external variables have still not been imported. Bash has a mechanism to forward such calls to the "real" environment (https://salsa.debian.org/debian/bash/-/blob/6a0146056618a32be35ba89e1b092eda2f2fa749/lib/sh/getenv.c#L74), but it only activates when the shell_variables structure is absent. The previous call to putenv has already initialized this structure, so we don't find a DISPLAY environment variable, and libX11.so returns an error.
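The ordering problem in the last two steps can be reproduced with a small toy model. This is only a sketch of the lookup order described above, not bash's actual code: toy_putenv, toy_getenv and shell_table are hypothetical names, and the table is a fixed-size array for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_VARS 16

/* Hypothetical stand-in for bash's shell_variables: NULL until created. */
static const char **shell_table = NULL;
static const char *shell_slots[TOY_MAX_VARS];
static size_t shell_count = 0;

/* Models the hijacked putenv: the "NAME=value" string is stored in the
 * shell table only (created on first use). The real process environment
 * is never touched. As with putenv, the string must outlive the entry. */
int toy_putenv(const char *assignment)
{
    assert(assignment != NULL && strchr(assignment, '=') != NULL);
    if (shell_table == NULL)
        shell_table = shell_slots;
    if (shell_count >= TOY_MAX_VARS)
        return -1;
    shell_table[shell_count++] = assignment;
    return 0;
}

/* Models the hijacked getenv: the real environment is consulted only
 * while the shell table does not exist yet. */
const char *toy_getenv(const char *name)
{
    size_t len = strlen(name);

    if (shell_table == NULL)
        return getenv(name);

    for (size_t i = 0; i < shell_count; i++) {
        const char *entry = shell_table[i];
        if (strncmp(entry, name, len) == 0 && entry[len] == '=')
            return entry + len + 1;
    }
    return NULL; /* no fallback once the table exists */
}
```

Before any toy_putenv call, toy_getenv("DISPLAY") returns whatever the real environment contains. After toy_putenv("__GLVND_DISALLOW_PATCHING=1"), the same lookup returns NULL, which corresponds to the failing XOpenDisplay(NULL) call.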

What is wrong here? What could be done to resolve this problem?

There are several alternatives:

1) The mechanism in bash is simply wrong, as it breaks the environment when it is used before bash's main() has started (which would have happened only after all the steps I described above). Interestingly, it is only broken if one calls putenv (or setenv) before calling getenv, so not many applications fall into this trap. As a resolution, bash's putenv (and setenv) should be changed to forward directly to libc's putenv (or setenv) if the bash variables have not been initialized yet.
2) Do not set any environment variable from within primus. We would remove the Debian patch and set __GLVND_DISALLOW_PATCHING outside, from primusrun or optirun.
3) Do not use putenv/setenv, but instead directly modify libc's global environ variable. This would really be a hack.
4) Use dlopen/dlsym to get the location of the real setenv function and call that from primus instead.
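Alternative 4 could look roughly like the following sketch. It is not the change that was made in primus; libc_setenv is a hypothetical helper name, and the soname libc.so.6 assumes glibc on Linux. On glibc older than 2.34 the program additionally has to be linked with -ldl.

```c
#define _GNU_SOURCE /* for RTLD_NOLOAD */
#include <assert.h>
#include <dlfcn.h>
#include <stdlib.h>

typedef int (*setenv_fn)(const char *, const char *, int);

/* Hypothetical helper: calls libc's own setenv, bypassing any setenv
 * symbol that the main executable (e.g. bash) defines itself.
 * Returns -1 if libc or the symbol cannot be found. */
int libc_setenv(const char *name, const char *value, int overwrite)
{
    static setenv_fn real_setenv = NULL;

    assert(name != NULL && value != NULL);

    if (real_setenv == NULL) {
        /* RTLD_NOLOAD: only return a handle if libc is already mapped,
         * which is the case in any dynamically linked glibc process. */
        void *libc = dlopen("libc.so.6", RTLD_LAZY | RTLD_NOLOAD);
        if (libc == NULL)
            return -1;
        /* Looking up through the handle searches libc itself, not the
         * global scope where the executable's symbol would win. */
        real_setenv = (setenv_fn)dlsym(libc, "setenv");
        if (real_setenv == NULL)
            return -1;
    }
    return real_setenv(name, value, overwrite);
}
```

A caller would use libc_setenv("__GLVND_DISALLOW_PATCHING", "1", 1) in place of the putenv call. The handle-based lookup is the point of the design: a plain call to setenv would resolve to the executable's definition first.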

Note: Interestingly, this might also affect primus-vk to some extent, as it also sets an environment variable here: https://github.com/felixdoerre/primus_vk/blob/master/nv_vulkan_wrapper.cpp#L88 However, this is not a problem (yet), as gameoverlayrenderer.so does not load libvulkan and thereby libnv_vulkan_wrapper (at least not from within the early initializers).

felixdoerre commented 4 years ago

Hi, a primus version that should solve or work around this issue has just been uploaded to Debian sid: 0~20150328-11. You should receive it within the next few hours via normal system updates. Could you re-test and report back once you have received the update?

Just for reference: the bug against Debian's bash package is open here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=962566 Hopefully the maintainer will forward this bug upstream and the issue gets solved; we will see.

felixdoerre commented 4 years ago

No, that's correct: primus-libs-ia32 is an outdated mechanism that never contained any real files but was just used to recommend the 32-bit version of primus-libs. Now this recommendation is included directly in the primus-libs package, which recommends its 32-bit version: https://salsa.debian.org/nvidia-team/primus/-/blob/412b48fa44080669e38d9e09f9601c48d9f421da/debian/control#L46

So summing it up: the removal of primus-libs-ia32 is a cleanup that is expected.

x-qq commented 4 years ago

Thanks, the fix works. With updated primus-libs package I am able to use launch command pvkrun %command% now.

felixdoerre commented 4 years ago

Great, so I assume this issue is resolved now and will close it. If you have any other problems with primus_vk, feel free to open a new issue, or re-open this one if appropriate.