ValveSoftware / steam-runtime

A runtime environment for Steam applications

Graphics drivers not loaded in Slackware -current: libncurses.so.6: cannot open shared object file: No such file or directory #668

Closed gbschenkel closed 5 months ago

gbschenkel commented 5 months ago

Hi, I know the report is a bit awkward, but I have tried a few things and couldn't find anything. I played Manor Lords on 26/04, then during the week I upgraded my system, tried to play on 29/04, and saw this error window. [image] I tried reverting mesa and kernel-firmware, but I still have this issue.

Most games using Proton show the window above, or the one below. [image] If I try to play a native game, it works, for example The Talos Principle. [image] I tried reverting packages and removing all additional launch commands, such as MangoHud or vkBasalt, but none of them appear to be related. I have tried Proton 9.0-beta (during the week), Proton 9.0 (stable, today), Proton 8.0 and Proton Experimental.

I am attaching the log from running steam --debug (steam.log) and my system information (steam-runtime-system-information.txt).

I will try reverting the other packages, like Python scripts, although they don't appear to be related, and I will keep this post updated if I find anything.

kisak-valve commented 5 months ago

Hello @gbschenkel, looking at the extended diagnostics information, it looks like 64-bit mesa (both Vulkan and OpenGL) is broken inside the Steam Linux Runtime - Sniper container, which is used by Proton. I've gone ahead and transferred this issue report to the steam-runtime issue tracker so that it's easier for a runtime dev to ponder this with you.

kisak-valve commented 5 months ago

Please share the information requested at https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md#essential-information to make it easier for a runtime dev to ponder what's happening on your system.

smcv commented 5 months ago

I see that you are using "Slackware 15.0 x86_64 (post 15.0 -current)". Do I understand correctly that this is an unstable development rolling release, analogous to Debian testing/unstable, Fedora rawhide, Arch Linux and other development distributions - as opposed to stable releases like Slackware 15.0, Debian 12 or Fedora 40?

This seems to be a compatibility problem involving libncurses.so.6 and/or libncursesw.so.6. We find your GL driver successfully, but it can't be loaded:

            "MESA-LOADER: failed to open /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/dri/radeonsi_dri.so: libncurses.so.6: cannot open shared object file: No such file or directory",

and the same is probably happening for Vulkan.

This seems unexpected to me:

        "overrides/lib/x86_64-linux-gnu/libncurses.so.6 -> /run/host/lib64/libncursesw.so.6.5",

... because normally, libncurses.so.6 (ncurses with only "narrow" character support) and libncursesw.so.6 (ncurses with "wide" character support) are two incompatible ABIs, which get installed side-by-side and cannot validly be mixed.

For comparison, on my Debian system, I have libncurses.so.6 -> libncurses.so.6.4 (the "narrow" version), and, separately, libncursesw.so.6 -> libncursesw.so.6.4 (the "wide" version).
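As a rough illustration of the mismatch described above, the following sketch compares the file name of a symbolic link with the file name it points to. The helper name `check_link` and its output format are made up for this example, and comparing file names is only a heuristic: the SONAME recorded inside the library remains the authoritative check.

```shell
# check_link is a hypothetical helper, not part of pressure-vessel.
# It takes one "link -> target" string and reports whether the target's
# file name begins with the link's file name
# (for example libfoo.so.6 -> libfoo.so.6.4).
check_link() (
    line=$1
    link=${line%% -> *}        # text before " -> "
    target=${line##* -> }      # text after " -> "
    link_name=${link##*/}      # strip leading directories
    target_name=${target##*/}
    case "$target_name" in
        "$link_name" | "$link_name".*)
            printf 'consistent: %s -> %s\n' "$link_name" "$target_name" ;;
        *)
            printf 'mismatch: %s -> %s\n' "$link_name" "$target_name" ;;
    esac
)

# The mapping from the diagnostic log, then the Debian layout for comparison.
check_link 'overrides/lib/x86_64-linux-gnu/libncurses.so.6 -> /run/host/lib64/libncursesw.so.6.5'
check_link 'libncurses.so.6 -> libncurses.so.6.4'
```

The first call reports a mismatch, because a link named libncurses.so.6 resolves to a file from the libncursesw family; the second reports a consistent pair.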

And, if I run:

objdump -T -x /lib/x86_64-linux-gnu/libncurses.so.6.* | grep SONAME
objdump -T -x /lib/x86_64-linux-gnu/libncursesw.so.6.* | grep SONAME
objdump -T -x /lib/i386-linux-gnu/libncurses.so.6.* | grep SONAME
objdump -T -x /lib/i386-linux-gnu/libncursesw.so.6.* | grep SONAME

then I find that the internal machine-readable name (SONAME) of libncurses.so.6.* is libncurses.so.6, and similarly the internal name of libncursesw.so.6.* is libncursesw.so.6. On your system, the equivalents would be /lib64/libncurses.so.6.*, /lib64/libncursesw.so.6.*, /lib/libncurses.so.6.* and /lib/libncursesw.so.6.*.

Similarly, if I look at the output of objdump without the grep, I see that libncurses.so.* exports symbol versions with names like NCURSES6_TINFO_6.2.20200212, whereas libncursesw.so.* exports symbol versions with names like NCURSESW6_6.2.20200212 (notice the extra W).
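For the Slackware paths, the same check could be sketched as below. This is a minimal sketch, assuming objdump (binutils) and readlink are installed; `soname_from_dump` and `describe_lib` are hypothetical helper names, and `objdump -p` is used as a narrower alternative to the `-T -x` options shown above, since it also prints the dynamic section.

```shell
# Print the SONAME found in `objdump -p` (or `objdump -x`) output on stdin.
soname_from_dump() {
    awk '$1 == "SONAME" { print $2 }'
}

# Hypothetical helper: show where a library path resolves to and which
# SONAME the resolved file declares.
describe_lib() {
    if [ -e "$1" ]; then
        printf '%s -> %s (SONAME %s)\n' "$1" "$(readlink -f "$1")" \
            "$(objdump -p "$1" 2>/dev/null | soname_from_dump)"
    else
        printf '%s: not present\n' "$1"
    fi
}

# 64-bit and 32-bit locations named in the comment above.
for lib in /lib64/libncurses.so.6 /lib64/libncursesw.so.6 \
           /lib/libncurses.so.6 /lib/libncursesw.so.6; do
    describe_lib "$lib"
done
```

If a path named libncurses.so.6 reports the SONAME libncursesw.so.6, the two names have been merged in the way discussed here.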

From the diagnostic information you've attached, it looks as though these two libraries have somehow been mixed up on your system. Have you been installing your own ncurses libraries from source code? If yes, then I think you may have inadvertently made them incompatible with what they should be; or if no, then your OS distributor (Slackware) might have done the same.

In the diagnostic log, I'm also surprised to see that you seem to have 32-bit ncurses version 6.4, 64-bit ncurses version 6.5, 32-bit libtinfo version 6.5, and 64-bit libtinfo version 6.5. I would have expected all of these libraries to be at the same version, and I would especially have expected the 32-bit ncurses and libtinfo to match up with each other.

gbschenkel commented 5 months ago

Sorry about the delay, I live in Porto Alegre-RS/Brazil and we are having a major flood because of the torrential rains happening here.

steam-1016800.log steam-runtime-system-info-1714749882.txt slr-app1016800-t20240503T084208.log(Hades) slr-app1145360-t20240503T083711.log(Chernobylite)

gbschenkel commented 5 months ago

In the diagnostic log, I'm also surprised to see that you seem to have 32-bit ncurses version 6.4, 64-bit ncurses version 6.5, 32-bit libtinfo version 6.5, and 64-bit libtinfo version 6.5. I would have expected all of these libraries to be at the same version, and I would especially have expected the 32-bit ncurses and libtinfo to match up with each other.

From what I remember, I need the multilib (32-bit) libraries installed to use Steam, and I only use them because of Steam. Since multilib isn't an official part of Slackware, we use packages from AlienBob, one of Slackware's developers, and since he doesn't keep them fully updated with mainstream -current, I had to upgrade these packages myself.

I just checked the changelog and saw that libncurses got upgraded.

Mon Apr 29 21:32:37 UTC 2024
a/aaa_libraries-15.1-x86_64-31.txz:  Rebuilt.
  Added: libncurses++w.so.6.5, libtic.so.6.5.
  Upgraded: libformw.so.6.5, libmenuw.so.6.5, libncursesw.so.6.5,
  libpanelw.so.6.5, libtinfo.so.6.5.
  Removed (with compat symlinks made): libform.so.6.4, libmenu.so.6.4,
  libncurses.so.6.4, libpanel.so.6.4.
l/ncurses-6.5-x86_64-1.txz:  Upgraded.
  This seemed like a good opportunity to go over my notes and try to make this
  SlackBuild at least defensible, if not correct. :-) The non-wide libraries
  have all been purged and replaced with compatibility symlinks pointing to the
  wide versions. Anything trying to use -lncurses (etc) will be redirected to
  -lncursesw (etc) at compile time. Looks like nearly 50 packages are linked to
  the non-wide libraries, but everything works this way.
  Thanks to GazL who provided most of the suggestions used.

I will unpack these packages and check how they are linked inside; as soon as I have checked, I will give feedback.

smcv commented 5 months ago

we are having a major flood

Of course please prioritize whatever you need to do to deal with your local disasters, making Steam run on Slackware can wait :-)

The non-wide libraries have all been purged and replaced with compatibility symlinks pointing to the wide versions

This seems suspicious. As far as I'm aware, the reason they have different names is that their ABIs are different and incompatible - each application and each dependent library must be compiled to use either "narrow" or "wide" ncurses, and whichever one it's using, it would be a bug for that same binary to load the other one.

Has there been a similar transition in "real Slackware" for 64-bit libraries?

The way this is handled in other distribution families like Debian and Red Hat is that ncurses is compiled twice: once with the "narrow" ABI and SONAME libncurses.so.6, and once with the "wide" ABI and SONAME libncursesw.so.6. Ideally everything would be built against libncursesw.so.6, at which point the distribution could consider removing libncurses.so.6 altogether. But, if my information about the ABI is correct, having libncurses.so.6 and giving it the "wide" ABI is just wrong - that will make its ABI in Slackware incompatible with every other distribution.

I would suggest that Slackware users should talk to the distribution developers about this, and query whether they are intentionally making Slackware incompatible with the rest of the Linux ecosystem (hopefully they are not).

The error message failed to open .../radeonsi_dri.so: libncurses.so.6: cannot open shared object file: No such file or directory is because when the Steam Linux Runtime container framework imports the dependencies of your graphics drivers into its container, it normally only creates a symbolic link for their "official" SONAME, which in this case is probably libncursesw.so.6. We have some code to handle libraries that need to be available under more than one name (we call that "aliases") but it's a lot less reliable than using the correct SONAME, so we only use it in narrowly targeted situations where it's known not to be a compatibility break.

If Slackware is intentionally doing a transition from "some libraries use the narrow ABI, others use the wide ABI" to "everything is using the wide ABI", then this will hopefully resolve itself automatically the next time Mesa is recompiled against the new ABI: after that, objdump -T -x on radeonsi_dri.so will hopefully say NEEDED libncursesw.so.6 instead of the current NEEDED libncurses.so.6.
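A minimal sketch of that NEEDED check follows. The helper names, the `DRIVER` variable and the default driver path are assumptions for illustration (the path in particular may differ on a given system); only `objdump -p` and awk are relied on.

```shell
# Print every NEEDED (DT_NEEDED) entry from `objdump -p` output on stdin.
needed_from_dump() {
    awk '$1 == "NEEDED" { print $2 }'
}

# Report which ncurses name, if any, is listed as a direct dependency.
ncurses_dependency() {
    needed_from_dump | awk '
        $0 == "libncurses.so.6"  { narrow = 1 }
        $0 == "libncursesw.so.6" { wide = 1 }
        END {
            if (narrow)    print "old name: libncurses.so.6"
            else if (wide) print "new name: libncursesw.so.6"
            else           print "no direct ncurses dependency"
        }'
}

# Assumed location of the driver; set DRIVER to override it.
driver=${DRIVER:-/usr/lib64/dri/radeonsi_dri.so}
if [ -e "$driver" ]; then
    objdump -p "$driver" | ncurses_dependency
fi
```

A result of "no direct ncurses dependency" does not rule the problem out, because the dependency can be indirect: later comments in this thread report that rebuilding libedit was what helped, so the same check may need to be run on the driver's own dependencies.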

From what I remember I need Multilib(32bits) library be installed to use Steam

Yes, you do. Parts of Steam itself are 32-bit, and so are many older games.

smcv commented 5 months ago

@kisak-valve, please could you retitle this to something that indicates the scope of this issue? Perhaps something like

Graphics drivers not loaded in Slackware -current: libncurses.so.6: cannot open shared object file: No such file or directory

pprkut commented 5 months ago

(representing Slackware)

@gbschenkel There's an update for the "official" multilib package for ncurses on the way. I think a good first step would be to update to that one once it's available. Probably good to re-install the 64bit ncurses package after too, just to make sure your custom multilib package didn't accidentally alter some files of that one.

smcv commented 5 months ago

To put this in distro-maintainer-suitable terms: whenever we encounter this situation:

that's an inconsistency that is going to cause a problem for the Steam Linux Runtime. The solution is for the distro to recompile the dependent package against its new dependency, which should (hopefully automatically) result in it getting a DT_NEEDED on the new "official" name of its dependency, libnewname.so.1.

In this specific situation it looks like liboldname.so.0 is libncurses.so.6, and libnewname.so.1 is libncursesw.so.6.

Situations like this do sometimes exist for a brief transitional period in rolling-release distributions, but ideally for as short a time as possible, and ideally for as few libraries as possible.

The other situations that we would really like distros to avoid, which it seems might also be happening here if I'm understanding correctly, are:

and:

My motto for this is "names are ABIs, and ABIs are names": it's fine for distros and libraries to evolve over time and move from one interface to another, but if the new interface is not backward-compatible, then it should have a new name so that everyone can distinguish it from the old interface.

We can (and do!) work around most incompatibilities of this type by using containers, and that works most of the time, but the one place it doesn't work is if the library is part of the dependency stack of the graphics driver (from Mesa/Nvidia/whatever all the way down to glibc). That's because one of our design assumptions is that we have to use the host system's graphics drivers, on the basis that those are the only thing that is guaranteed to work for this user's particular GPU.

gbschenkel commented 5 months ago

Hi, another Slackware user posted on LinuxQuestions that he was able to play after libedit was rebuilt. I did the same and am now able to launch Hades, but not Manor Lords or Chernobylite; I am getting this message now: [image] [image]

First I only rebuilt the 64-bit package, then I saw another user saying he rebuilt the 32-bit one and converted it to -compat32 for use in multilib. I did the same, but the screen above still shows up.

I wasn't able to rebuild Vulkan because of an issue while building Vulkan Caps Viewer.

PS: I reinstalled all Slackware packages, removed my self-built multilib packages, and installed the newest ones from AlienBob, just to have a "standard" Slackware64 15.0 -current system. The only customization now is the libedit package.

garpu commented 5 months ago

Another slackware-current user. :D

runtime diagnostics: https://gist.github.com/garpu/5d5481e54cf50f4f215e9ba646aa1fd7
a log of GW2 crashing: https://gist.github.com/garpu/9e723152c7e1426221ac36d9374b5986

I've got libedit and libedit-32 compiled against the April 29 version of ncurses, and I'm all up to date on updates as of this morning.

I can run Diablo IV, but it throws an error that my GPU is unsupported. Here's that log: https://gist.github.com/garpu/2bede1ec28ae53994858d9961c88f9c3

Palworld (another data point) will load, but then I get the UI loading and a black screen. https://gist.github.com/garpu/92bed895d6688492b1fd198ed95bd4ac

gbschenkel commented 5 months ago

Hi, I think the problem is now resolved, but I will wait until tomorrow to close this issue. The Slackware update of Sat May 4 17:37:11 UTC 2024 and the multilib update of Sun May 5 08:41:30 UTC 2024 appear to have resolved all the issues.

I was able to launch Manor Lords and load my save game, but didn't spend time trying to play. Chernobylite is crashing after the launcher, but I think that is related to Proton 9; I need to try Proton 8 or Experimental, but I can't right now.

garpu commented 5 months ago

Guild Wars 2 (crashing yesterday) is working today. No Man's Sky is still crashing. Diablo IV still reports that my video card is unsupported, but it loads and shows RADV/NAVI33 under graphics. Palworld is still crashing. I don't own Chernobylite, or I'd test that.

I'm not sure if my issues are the ncurses issue here or something else, since I just got (and installed) this video card on May 3.

For No Man's Sky, I've launched it with "AMD_VULKAN_ICD=RADV %command%". Here's a gist of its log: https://gist.github.com/garpu/20aff38b067124a9ed5b51d0537e874c Runtime log for NMS: https://gist.github.com/garpu/d5c0f1b72af73a5b62393752a1956d62

I've also deleted and recreated the prefix, and tried Proton 8.0-5.

kisak-valve commented 5 months ago

Hello @garpu, your runtime diagnostics information shows that you do not have the same summarily broken mesa install within the Steam Linux Runtime - Sniper container environment, and that makes your issue unrelated to this issue report.

It's more likely your issue is https://gitlab.freedesktop.org/drm/amd/-/issues/3343.

garpu commented 5 months ago

Yep, that's my issue. @gbschenkel is also on current, so should also be on 6.6.30... New card works really well now. :P

gbschenkel commented 5 months ago

Okay, I have tested, and Chernobylite needs Proton downgraded to start; I downgraded to 6. Since this is a game-specific problem, I am not going to pursue it further.

@garpu I tried and was able to play NMS, but I needed to restrict the available CPU cores: WINE_CPU_TOPOLOGY=14:0,1,2,3,4,5,6,7,8,9,10,11,12,13 %command%

gbschenkel commented 5 months ago

Well, the problem was with neither WINE_CPU_TOPOLOGY nor the Proton version. @garpu mentioned on LinuxQuestions a patch in kernel 6.6.30 which causes issues when Above 4G Decoding is not working, which appears to be my case, even though it was already turned on in the BIOS. I rolled back the kernel to 6.6.29 and Chernobylite started to work with Proton 9. That also corrected some glitches in No Man's Sky, which didn't show up in the first run I did, only from the second run onwards.