ValveSoftware / steam-runtime

A runtime environment for Steam applications

libGL error: MESA-LOADER: failed to open radeonsi with Proton 5.13-1 #309

Open AwesamLinux opened 4 years ago

AwesamLinux commented 4 years ago

System Information

Symptoms

Most games I have tested no longer launch with Proton 5.13-1. These are games that I tried and that still work for me with Proton 5.0-9.

I think I have the same issue as, or one related to, ValveSoftware/Proton#4278 and ValveSoftware/Proton#4269. However, those reports say that nothing runs at all, which is not the case for me, so I'm posting this as a separate issue just in case.

I included a bunch of logs trying to launch games with Proton 5.13-1. All these games do still work for me with Proton 5.0-9 but don't launch with 5.13-1 (nothing happens). I also included some logs of games that did launch for me with 5.13-1 for comparison.

FAILED TO LAUNCH:
steam-1349960 The Ghost Train - FAIL
steam-501300 What Remains of Edith Finch - FAIL
steam-1057750 The Suicide of Rachel Foster - FAIL
steam-343710 Kholat - FAIL
steam-1122720 Sayonara Wild Hearts - FAIL
steam-752590 A Plague Tale: Innocence - FAIL
steam-826940 Maid of Sker - FAIL

SUCCESSFULLY LAUNCHED:
steam-368430 Through the Woods - SUCCESS
steam-217920 Alien Rage - SUCCESS
steam-321270 UNLOVED - SUCCESS

What I have tried:

Logs: Logs of launching all these games with Proton 5.13-1

logs.zip system_info.txt

Edit: System information after waiting for runtime info to be filled in: system_info.txt

Update: I tried now also launching a few games with the Proton 5.13 debug branch: logs_debug.zip

I have also tried verifying the "Steam Linux Runtime - Soldier" files. Curiously, it is listed for me as a game, not a tool (that does not seem right?).

jaubin commented 4 years ago

On NVIDIA blob 450.66 :

A Plague Tale : Innocence - OK

I could not try the other games.

guglovich commented 4 years ago

SUCCESSFULLY LAUNCHED: Witcher 3, Bayonetta, Borderlands GOTY Enhanced, Endless space, S.T.A.L.K.E.R.: Shadow of Chernobyl, The Walking Dead

FAILED TO LAUNCH: Remember Me (Also doesn't work with 5.0.9) Other game - Lineage 2

aeikum commented 4 years ago

Looks like all of your failing logs contain this:

libGL error: MESA-LOADER: failed to open radeonsi (search paths /overrides/lib/x86_64-linux-gnu/dri:/overrides/lib/i386-linux-gnu/dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open radeonsi (search paths /overrides/lib/x86_64-linux-gnu/dri:/overrides/lib/i386-linux-gnu/dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open swrast (search paths /overrides/lib/x86_64-linux-gnu/dri:/overrides/lib/i386-linux-gnu/dri)
libGL error: failed to load driver: swrast
X Error of failed request:  GLXBadContext
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  6 (X_GLXIsDirect)
  Serial number of failed request:  281
  Current serial number in output stream:  280

So something is going wrong loading your GL driver from within the runtime container.
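To check whether another log shows the same failure, it is enough to search for the loader message quoted above. This is only a minimal sketch: the function name is hypothetical, and it assumes the logs are plain-text files such as the steam-<appid>.log files attached to this issue.

```shell
# Hypothetical helper: print the name of every log file that contains
# the MESA-LOADER failure quoted above.
# Usage: mesa_loader_failures ~/steam-*.log
mesa_loader_failures() {
    # -l prints only the names of matching files;
    # -F treats the pattern as a fixed string, not a regular expression.
    grep -lF 'MESA-LOADER: failed to open' -- "$@"
}
```

A log that is not listed may still have failed for a different reason, so this only separates "this bug" from "something else".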

TTimo commented 4 years ago

@AwesamLinux please include a complete 'System Information' report - e.g. please wait for the runtime information tool section to fill in.

AwesamLinux commented 4 years ago

@TTimo here is the full system info: system_info.txt

TTimo commented 4 years ago

On your machine, 64 bit GL drivers are not working in either the scout or soldier runtime containers (32 bit seems fine though, and LD_* runtime is generally fine also).

The Steam installation is distro-altered. I would recommend using official Valve packages instead (http://repo.steampowered.com/steam/), although that's not likely related to the driver problem.

kisak-valve commented 4 years ago

undefined symbol: drmGetDevices2 and undefined symbol: amdgpu_cs_query_reset_state2 look interesting in the system information. Somehow the driver is picking up a variant of libdrm that's too old for that build of mesa.
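One way to test that theory is to ask a given copy of libdrm whether it exports the symbol the driver is looking for. The sketch below assumes nm from GNU binutils is installed; the function name is hypothetical, and the /usr path in the usage comment is an assumed typical Ubuntu location rather than something taken from the report.

```shell
# Hypothetical helper: succeed if shared library $1 exports symbol $2.
# Usage (example paths, adjust for your system):
#   exports_symbol /opt/amdgpu/lib/x86_64-linux-gnu/libdrm.so.2.4.0 drmGetDevices2
#   exports_symbol /usr/lib/x86_64-linux-gnu/libdrm.so.2 drmGetDevices2
exports_symbol() {
    # nm -D lists the dynamic symbol table; --defined-only skips imports.
    # The symbol name is the last field and may carry an @VERSION suffix.
    nm -D --defined-only "$1" |
        awk -v sym="$2" '{ sub(/@.*/, "", $NF); if ($NF == sym) found = 1 }
                         END { exit !found }'
}
```

If the copy that ends up being loaded fails this check while another copy on the system passes it, the library is too old for the driver that needs the symbol.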

smcv commented 4 years ago

overrides/lib/x86_64-linux-gnu/libdrm.so.2 -> /run/host/opt/amdgpu/lib/x86_64-linux-gnu/libdrm.so.2.4.0
overrides/lib/i386-linux-gnu/libdrm.so.2 -> /run/host/usr/lib/i386-linux-gnu/libdrm.so.2.4.0

pressure-vessel doesn't currently support graphics drivers or their dependencies installed outside /usr. I expect this will be the problem for @AwesamLinux.

Is this something you are deliberately still using, or something that is left over from an old installation? I notice libGLX_mesa.so.0 is still in /usr.

We cannot know whether other users who have commented have the same issue, a related issue, or something completely unrelated.

64 bit GL drivers are not working in either the scout or soldier runtime containers (32 bit seems fine though)

That'd be because the ones in /opt are 64-bit-only, so they can only break 64-bit. For 32-bit, the drivers in /usr seem to be fine.

overrides/lib/x86_64-linux-gnu/libOpenCL.so.1 -> /run/host/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so.1

I didn't know amdgpu-pro was even still a thing.
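To see whether a system has libraries on the global search path that live outside /usr and /lib*, the output of ldconfig -p can be filtered. This is a rough sketch, not what pressure-vessel itself does: the function name is hypothetical, and it only looks at the ld.so cache, so anything added through LD_LIBRARY_PATH will not show up.

```shell
# Hypothetical helper: read `ldconfig -p` output on stdin and print the
# entries whose resolved path is outside /usr and /lib*.
# Usage: ldconfig -p | libs_outside_usr
libs_outside_usr() {
    # Cache entries look like: "libdrm.so.2 (libc6,x86-64) => /path/to/file".
    # $1 is the library name and $NF is the resolved path.
    awk '/=>/ && $NF !~ "^/(usr|lib[^/]*)/" { print $1, "=>", $NF }'
}
```

On a system in the state described above, a libdrm.so.2 entry pointing into /opt/amdgpu would be expected to appear in the output.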

smcv commented 4 years ago

Somehow the driver is picking up a variant of libdrm that's too old for that build of mesa

I think what's happening here is:

AwesamLinux commented 4 years ago

@smcv I just added a PPA with recent Mesa drivers and let it install them wherever it does by default. I tried purging ppa:ernstp/mesaaco and using ppa:kisak/kisak-mesa instead, but it made no difference.

However, I have manually installed OpenCL from the amd-gpu-pro drivers in headless mode, so maybe that causes issues somehow (most people using professional software like Blender do that, because the OpenCL provided with Mesa does not support everything).

AwesamLinux commented 4 years ago

Proton 5.13-1 is working for me now, thanks for the support and ideas :tada:

I uninstalled the AMD-GPU-PRO OpenCL driver that I had previously installed in headless mode. After uninstalling it and rebooting, games are launching fine with Proton 5.13-1.

So I guess the OpenCL driver somehow interferes with Proton 5.13-1 :man_shrugging:. I guess this could cause issues for some other AMD users too, as I think it is fairly common to install just the AMD GPU PRO OpenCL driver for some professional apps.

Just for reference including a log, and system info of my now working setup: steam-826940.log.zip system_info.txt

kisak-valve commented 4 years ago

Let's keep this issue open for a while, at least for visibility, as it's not uncommon for users to tinker with amdgpu-pro.

smcv commented 4 years ago

So I guess the OpenCL driver somehow interferes with Proton 5.13-1

I think the conditions for getting this issue are:

That will confuse the container runtime system that we use to give Proton 5.13 its required runtime environment (it's called pressure-vessel and is in the SteamLinuxRuntime_soldier/pressure-vessel/bin/ directory), because at the moment it assumes all libraries came from /usr or /lib*.

Installing a version of libdrm.so.2 in /opt that is outside the OS's dependency management system and putting it on the global search path is always going to be a potentially problematic thing to do, because if it happens to be older than the one your OS installs in /usr, it will break dependencies in your OS.

Packages from PPAs behave more like replacements for part of the OS - there's still a risk of breaking things, but their structure tends to be a drop-in replacement rather than overlaying extra stuff over the top, and upgrading to a newer base OS will often be able to replace the PPA packages cleanly with the versions from the new base OS.

Lepidos commented 4 years ago

Yes I have this issue. And using amdgpu-pro... :-( Need a step by step guide to solve this on debian.

I tried this LD_LIBRARY_PATH=/opt/amdgpu/lib/i386-linux-gnu:$LD_LIBRARY_PATH steam but no go

coolacid commented 4 years ago

I'm also affected, using amdgpu-pro. Here are some notes. amdgpu-pro does the following on install:

  1. Places its DRI libs in /opt/amdgpu/lib/*/dri
  2. Adds 20-amdgpu.conf to /etc/ld.so.conf.d, which adds the above directories to the ld.so cache.
  3. Moves the /usr/lib/*/dri/*_dri.so files to *dri.so~
  4. Adds a symlink from each original file name to the corresponding /opt/amdgpu/lib/*/dri/ file

Terrible workaround: delete the symlinks and ~ files from the /usr/lib dirs, copy the files over from /opt/amdgpu/lib, move the ld.so.conf file out, and run ldconfig to rebuild the cache.

Alas, I still have problems running Among Us, but I think I'm past this problem, and this gives you all some data points.
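Before deleting anything, it may be worth checking whether a system is actually in the state described in the numbered list above. The following is a read-only sketch: the function name is hypothetical, and it assumes the symlinks use absolute targets under /opt, which the notes above suggest but do not state outright.

```shell
# Hypothetical helper: list the symlinks in directory $1 whose target
# is under /opt. Nothing is modified.
# Usage: dri_links_into_opt /usr/lib/x86_64-linux-gnu/dri
dri_links_into_opt() {
    for f in "$1"/*; do
        # Skip regular files and anything else that is not a symlink.
        [ -L "$f" ] || continue
        # Read the link target as stored, without resolving it further.
        target=$(readlink "$f")
        case $target in
            /opt/*) printf '%s -> %s\n' "$f" "$target" ;;
        esac
    done
}
```

An empty result suggests the DRI directory has not been redirected. It says nothing about /etc/ld.so.conf.d, which needs to be checked separately.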

clintar commented 4 years ago

Holy cow this took a while to find. Hard to figure out amdgpu-pro opencl causes this. I used it for mining, but wow that was a pain

clintar commented 4 years ago

In my case, I installed amdgpu-pro OpenCL by extracting amdgpu-pro-20.10-1048554-ubuntu-18.04.tar.xz into its own folder, /home/clintar/Downloads/amdgpu-pro-20.10-1048554-ubuntu-18.04. I went in there and ran ./amdgpu-install --uninstall to get rid of it, in case anyone did the same.

HolyBlackCat commented 4 years ago

I had this error when I installed amdgpu (NOT amdgpu-pro, which didn't work for me at all) from the AMD site, not knowing that Ubuntu installs amdgpu by default. Uninstalling the one I installed seemed to make it fall back to the preinstalled amdgpu, and solved the problem.

Some of my confusion was caused by glxinfo reporting "Mesa" as OpenGL vendor instead of "AMD", which I interpreted as a sign of the driver not working. Turns out this is wrong, and the correct way of checking what driver is being used is looking at the output of lspci -v | sed '1,/VGA/d;/^$/,$d'.

smcv commented 4 years ago

[A graphics driver in /opt] will confuse the container runtime system that we use to give Proton 5.13 its required runtime environment (it's called pressure-vessel and is in the SteamLinuxRuntime_soldier/pressure-vessel/bin/ directory), because at the moment it assumes all libraries came from /usr or /lib*.

The next release of pressure-vessel will hopefully fix this bug, as a result of https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/merge_requests/173 having been merged.

Installing a version of libdrm.so.2 in /opt that is outside the OS's dependency management system and putting it on the global search path is always going to be a potentially problematic thing to do

This will still be a potentially problematic thing to do, and I wouldn't recommend it; but in future it shouldn't be any worse for pressure-vessel than it already is for your OS.

I had this error when I installed amdgpu (NOT amdgpu-pro, which didn't work for me at all) from the AMD site

Does that also install in /opt? If it does, it is not surprising for it to behave the same as amdgpu-pro.

In general, unlike Windows, it is most reliable to get Linux graphics drivers from the OS vendor (Ubuntu, Fedora, openSUSE, whatever you're using).

If the graphics drivers provided by your OS vendor are too old, using official backports or a PPA-style overlay/addon repository that has been prepared specifically for your OS is likely to be better than something OS-independent. (But only use third-party driver overlay/addon repositories if you completely trust their maintainer: they get unlimited control over your system.)

Some of my confusion was caused by glxinfo reporting "Mesa" as OpenGL vendor instead of "AMD"

This is a consequence of the architecture/design of the open-source driver stack, and is not under the Steam Runtime's control. Mesa is one of the shared frameworks used by all the major open-source graphics drivers. The only non-Mesa-based graphics stack that is likely to work with Steam is the NVIDIA proprietary driver.

TTimo commented 4 years ago

The fix mentioned above is available in soldier 0.20201124.0. See https://steamcommunity.com/app/221410/discussions/2/2962768718547168164/ for details.

TheAquabat commented 3 years ago

I'm running the Steam Linux Runtime soldier beta and still hitting this bug, removing amdgpu pro headless libs fixes the non-working Proton 5.13-2

smcv commented 3 years ago

still hitting this bug, removing amdgpu pro headless libs fixes [it]

@TheAquabat, or anyone else who is still having this problem: Please could you describe the system you are running on, and exactly what "amdgpu pro headless libs" means - where you downloaded them, and how you installed them?

A complete 'System Information' report would be useful.

If you can return the system to the state where launching Proton doesn't work, it would also be really useful to get a log from launching the container. You can do this without the extra complexity of involving Proton like this:

cd /path/to/SteamLinuxRuntime_soldier
PRESSURE_VESSEL_VERBOSE=1 ./run -- steam-runtime-system-info --verbose 2>&1 | tee container.log

and then send container.log as a gist. You can edit/censor the log if there's anything in it that you consider private, as long as it's obvious where it has been edited, for instance replacing your username with REDACTED.

The SteamLinuxRuntime_soldier directory will be in one of your Steam libraries. The most likely place is ~/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier if you haven't reconfigured the installation path.
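The redaction step mentioned above can be scripted. This is a minimal sketch with a hypothetical function name. It assumes the only private detail is a user name made of ordinary characters, because the name is used as a sed pattern and is not escaped.

```shell
# Hypothetical helper: print file $1 with every occurrence of a name
# replaced by REDACTED. The name defaults to the current user name.
# Usage: redact_username container.log > container.redacted.log
redact_username() {
    name=${2:-$(id -un)}
    # "|" is used as the sed delimiter so that "/" in the text is harmless.
    sed "s|$name|REDACTED|g" "$1"
}
```

Read the result before posting it: host names, library paths and other details are left untouched.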

TheAquabat commented 3 years ago

I'm not sure if it is the same bug: Proton 5.13-2 doesn't work, but Proton 5.21 GE (which also uses pressure-vessel) does. I will investigate further and report back.

smcv commented 3 years ago

I'm not sure if it is the same bug

If you're not sure, report it separately. Having the full information is useful, and we can close a separate report as a duplicate more easily than we can disentangle two bugs being mixed up in the same issue number.

TheAquabat commented 3 years ago

Yes, I can confirm that removing amdgpu pro libdrm libs fixes the issue, but only for Proton 5.21 GE and not for standard Proton 5.13-2. When I mean amdgpu pro libdrm libs I mean the driver that you can download from amd.com. I use it in headless mode, installing the driver like this:

sudo ./amdgpu-pro-install --opencl=legacy,pal --headless

to use only the OpenCL driver

here's a container https://gist.github.com/TheAquabat/68ad6d803d39170afdc18bb2e75b3f4f

here's steam system info https://gist.github.com/TheAquabat/9ab48978c5666dd5e17688efcc5f78c6

this is my system


NAME="KDE neon"
VERSION="5.20"
ID=ubuntu
ID_LIKE="ubuntu debian"
PRETTY_NAME="KDE neon User Edition 5.20"
VARIANT="User Edition"
VERSION_ID="20.04"
HOME_URL="https://neon.kde.org/"
SUPPORT_URL="https://neon.kde.org/"
BUG_REPORT_URL="https://bugs.kde.org/"
LOGO=start-here-kde-neon
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

TheAquabat commented 3 years ago

and I also have the amdvlk deb package installed https://github.com/GPUOpen-Drivers/AMDVLK

TheAquabat commented 3 years ago

I think that the problem might be the combination of both amdvlk and the amdgpu pro headless libs... not sure if it is the same bug. Should I open a new bug report?

smcv commented 3 years ago

here's a container [log] https://gist.github.com/TheAquabat/68ad6d803d39170afdc18bb2e75b3f4f

Thanks. This looks better than we had before the last attempt to fix this: we're picking up at least some of the necessary libraries from /opt, which we weren't before.

i386-linux-gnu-check-vulkan: symbol lookup error: /overrides/lib/i386-linux-gnu/vulkan/5/libvulkan_radeon.so: undefined symbol: amdgpu_cs_syncobj_transfer

That symbol is meant to be in libdrm.so.2 or maybe libdrm_amdgpu.so.1, so this looks like we're ending up with a wrong version of libdrm. Apparently the version we're loading now is from /opt/amdgpu-pro/lib/x86_64-linux-gnu.

/overrides/lib/i386-linux-gnu/dri/radeonsi_dri.so: undefined symbol: amdgpu_cs_query_reset_state2

That looks similar.

We also see a similar story for x86_64.

smcv commented 3 years ago

OK, this is weird. In https://gist.github.com/TheAquabat/9ab48978c5666dd5e17688efcc5f78c6 - both when we run directly on the host and when we run in the soldier container - we pick up the complete graphics stack from /usr, which seems to be the good one with newer versions of libraries. In particular, we get both libdrm_amdgpu.so.1 and libOpenCL.so.1 from /usr.

But then when you launch from outside Steam, we're somehow deciding that libdrm_amdgpu.so.1 and libOpenCL.so.1 from /opt/amdgpu-pro are the ones we should use: this means the library search path is being searched in a different order?

Did you get the container log and the system info at the same time; or did you do one, then add or remove workarounds, then do the other?

When I mean amdgpu pro libdrm libs I mean the driver that you can download from amd.com

Please be much more specific than this: I need to see version numbers and URLs if I'm going to have any chance of understanding what is happening here.

smcv commented 3 years ago

Looking at, for example, https://www.amd.com/en/support/graphics/amd-radeon-5700-series/amd-radeon-rx-5700-series/amd-radeon-rx-5700 -> Ubuntu x86 64-Bit -> https://drivers.amd.com/drivers/linux/amdgpu-pro-20.45-1164792-ubuntu-20.04.tar.xz as a random example of something that someone might download:

This is looking a lot like a bug in the amdgpu-pro driver. The driver package claims to have been prepared specifically for Ubuntu 20.04, but it contains libdrm_amdgpu.so.1 version 2.4.100, which is older than the libdrm_amdgpu.so.1 version 2.4.102 in Ubuntu. Driver vendors can't just downgrade libraries and expect the resulting system to continue to work.

Until this is resolved, I would recommend getting AMD drivers from the OS vendor (in your case Ubuntu) instead of directly from AMD.
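The version comparison above (2.4.100 in the AMD package against 2.4.102 in Ubuntu) can be reproduced in a script. This sketch has a hypothetical function name and assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeed if version $1 is strictly older than $2.
# Usage: version_older 2.4.100 2.4.102 && echo "bundled libdrm is older"
version_older() {
    # Equal versions are not "older"; otherwise the older one sorts first.
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

A plain string comparison would get cases such as 2.4.99 and 2.4.100 wrong, which is why version sort is used here.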

smcv commented 3 years ago

https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/merge_requests/192 might help this, depending what is going on on the affected systems.