Closed tgurr closed 1 month ago
Hello @tgurr, this reads more like a Pressure Vessel issue than an issue with the Steam client, so I've transferred this issue report to the steam-runtime issue tracker. Please give https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md#essential-information a read and share the requested information.
Following the steps from the linked document, I couldn't get anything useful out of STEAM_LINUX_RUNTIME_VERBOSE=1 steam 2>&1 | tee ~/slr.log
The resulting log contains next to nothing: slr.log
Launching PRESSURE_VESSEL_VERBOSE=1 STEAM_LINUX_RUNTIME_VERBOSE=1 steam
command line output: https://gist.github.com/tgurr/e6daaf6f0b4a2dd3130189f2aefacf05
Additional stuff: 01_pinned_libs.txt 02_print-steam-runtime-library-paths.txt 03_library-abi.log 04_srsi.log
this reads more like a Pressure Vessel issue than an issue with the Steam client
I'm not convinced, actually - the original report sounded more like a steamwebhelper issue to me.
No, on closer inspection, this is pressure-vessel-related.
STEAM_LINUX_RUNTIME_VERBOSE=1 steam 2>&1 | tee ~/slr.log
contains next to nothing
This is because anything logged by the steamwebhelper (which is known to be rather verbose, and sometimes misleading) gets redirected to a separate log file.
When investigating any steamwebhelper crash, please check the log files in ~/.steam/steam/logs/ (or ~/.var/app/com.steampowered.Steam/.steam/steam/logs if you're using Steam via the unofficial Flatpak app, or ~/snap/steam/common/.steam/steam/logs if you're using the unofficial Snap app).
In the current public beta version of Steam, the output of SLR while running steamwebhelper appears in webhelper-linux.txt, and there are potentially also relevant messages in cef_log.txt and webhelper.txt. Older versions might have logged to steamwebhelper.log, but I think that file is unused now. You might want to exit from Steam completely and move your ~/.steam/steam/logs/ out of the way, so that you know that everything in that directory is new.
Running Steam as STEAM_LINUX_RUNTIME_VERBOSE=1 steam is a correct debugging step: that should result in pressure-vessel debug-level output appearing in webhelper-linux.txt.
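A minimal sketch of the "move the logs out of the way" step, covering the three log locations mentioned above. The function names are hypothetical, not something Steam provides, and recreating an empty directory after the move is an assumption made so that the path still exists the next time Steam starts.

```shell
# Print each candidate Steam log directory (native, Flatpak, Snap) that exists.
steam_log_dirs() {
    for d in \
        "$HOME/.steam/steam/logs" \
        "$HOME/.var/app/com.steampowered.Steam/.steam/steam/logs" \
        "$HOME/snap/steam/common/.steam/steam/logs"
    do
        [ -d "$d" ] && printf '%s\n' "$d"
    done
    return 0
}

# Move one log directory aside and leave an empty one in its place.
# Prints the path of the backup. Run this only after exiting Steam completely.
rotate_steam_logs() {
    logs_dir="${1%/}"
    [ -d "$logs_dir" ] || { echo "not a directory: $logs_dir" >&2; return 1; }
    backup="$logs_dir.old-$(date +%Y%m%d-%H%M%S)"
    mv -- "$logs_dir" "$backup" || return 1
    mkdir -p -- "$logs_dir" || return 1
    printf '%s\n' "$backup"
}

# Example (not run here):
#   steam_log_dirs | while read -r d; do rotate_steam_logs "$d"; done
#   STEAM_LINUX_RUNTIME_VERBOSE=1 steam
```

After this, any file that appears in the logs directory was written by the new run.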
There are three new files in the dri directory and the "old" ones now seem to be just symlinks. Maybe that causes issues for the container stuff and how it's handled by pressure vessel (for Exherbo)?
It should be able to dereference the symlinks, but we'd have to see the detailed log to know for sure whether that's working as intended.
src/steamUI/steamuisharedjscontroller.cpp (619) : Failed creating offscreen shared JS context
Unfortunately, I think this might just mean "steamwebhelper is broken". We'd need to see the messages logged in webhelper-linux.txt to know whether this is a problem with SLR or with steamwebhelper.
Some of the messages logged by the steamwebhelper are known to be misleading: it seems to be normal to get ANGLE and EGL initialization errors, even on an otherwise working system.
Aha! The original issue report has a webhelper-linux.txt, which doesn't have verbose SLR output, but does have relevant error messages.
There are lots of warnings like this:
x86_64-linux-gnu-capsule-capture-libs: warning: Dependencies of libGLX_mesa.so.0 not found, ignoring: Missing dependencies: Could not find "libgallium.so" in LD_LIBRARY_PATH "/home/tgurr/.local/share/Steam/ubuntu12_32:/home/tgurr/.local/share/Steam/ubuntu12_32/panorama:/usr/x86_64-pc-linux-gnu/lib:/usr/local/lib:/usr/x86_64-pc-linux-gnu/lib/nss:/usr/x86_64-pc-linux-gnu/lib/qt5:/usr/x86_64-pc-linux-gnu/lib/qt6:/usr/i686-pc-linux-gnu/lib:/usr/local/lib", ld.so.cache, DT_RUNPATH or fallback /lib:/usr/lib
and then the steamwebhelper doesn't start either:
./steamwebhelper: error while loading shared libraries: libgallium.so: cannot open shared object file: No such file or directory
which is probably the root cause for what you're seeing.
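To pull just the missing library names out of a long webhelper-linux.txt, something like the following sketch might help. It only matches the two message shapes quoted above (the capsule-capture-libs warning and the dynamic linker error); the function name is hypothetical, and other tools may word their messages differently.

```shell
# Read a log on stdin and print each library name reported as missing, once.
missing_libs_from_log() {
    sed -n \
        -e 's/.*Could not find "\([^"]*\)" in LD_LIBRARY_PATH.*/\1/p' \
        -e 's/.*error while loading shared libraries: \([^:]*\): cannot open shared object file.*/\1/p' \
    | sort -u
}

# Example (not run here):
#   missing_libs_from_log < ~/.steam/steam/logs/webhelper-linux.txt
```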
We'll need to figure out how your libGLX_mesa.so.0 is loading libgallium.so successfully on your host system, but not when the SLR container infrastructure tries to find it. @tgurr, could you please inspect one of the affected libraries like libGLX_mesa.so.0 with a command like objdump -T -x /usr/x86_64-pc-linux-gnu/lib/libGLX_mesa.so.0, and copy/paste the section of the output headed "Dynamic Section:" here?
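Since the objdump output is long, here is a small sketch for cutting it down to the part that matters. It assumes the "Dynamic Section:" block is followed by a blank line, which is how objdump output usually looks; the function name is hypothetical.

```shell
# Read objdump -x output on stdin and print only the Dynamic Section block.
dynamic_section() {
    awk '
        /^Dynamic Section:/ { in_section = 1; print; next }
        in_section && /^[[:space:]]*$/ { exit }   # a blank line ends the block
        in_section { print }
    '
}

# Example (not run here):
#   objdump -T -x /usr/x86_64-pc-linux-gnu/lib/libGLX_mesa.so.0 | dynamic_section
```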
@tgurr or @kisak-valve, it would maybe be helpful to retitle this issue to mention ./steamwebhelper: error while loading shared libraries: libgallium.so.
It would also be interesting if you could try running:
LD_DEBUG=libs,files glxgears 2>&1 | tee glxgears.log
and provide glxgears.log as an attachment or Gist, so that we can see how a simple OpenGL program like glxgears manages to find its libraries on this particular system.
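The LD_DEBUG log tends to be large, so a filter like this sketch can narrow it to the search for a single library. It assumes the line shapes that glibc normally prints for LD_DEBUG=libs ("find library=NAME [0]; searching", "search path=", "search cache=", "trying file="); if your glibc words them differently, the patterns need adjusting. The function name is hypothetical.

```shell
# Read LD_DEBUG=libs output on stdin; print the search steps for one library.
# $1 is the library name as it appears after "find library=".
ld_debug_search_for() {
    awk -v lib="$1" '
        # Each "find library=" line starts the search for a new library.
        /find library=/ { show = (index($0, "find library=" lib " ") > 0) }
        show && /(find library=|search path=|search cache=|trying file=)/ { print }
    '
}

# Example (not run here):
#   ld_debug_search_for libgallium.so < glxgears.log
```

The "search path=" lines are the interesting ones: glibc annotates each with where the path came from, which is how an RPATH or RUNPATH origin shows up.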
As usual, huge thanks! I'll provide the requested information later, once I'm back from work. From what I understood, also from the conversation in the Mesa bug tracker, we could maybe also install an environment file to extend the LDPATH for the non-default search path (${libdir}/dri): /etc/env.d/99mesa containing LDPATH=/usr/@TARGET@/lib/dri, installed with our mesa package at the distribution level? I can't judge whether that would be a proper solution or rather a workaround, though, assuming it would actually work in the first place.
There is no such thing as LDPATH (that I'm aware of), do you mean LD_LIBRARY_PATH?
My current understanding of the situation (which could be wrong, I'll need to see debug information) is that if everything is working correctly, you should not need to set LD_LIBRARY_PATH, because the Mesa-related libraries should be able to find libgallium.so as referenced by their DT_RUNPATH ELF headers; and adding /usr/@TARGET@/lib/dri to the LD_LIBRARY_PATH might even be harmful, by making components load mismatched versions of their dependencies. So I would prefer it if Exherbo doesn't need to do that.
A build log from the way Exherbo builds Mesa would be useful information, if you can easily obtain it.
objdump -T -x /usr/x86_64-pc-linux-gnu/lib/libGLX_mesa.so.0
https://gist.github.com/tgurr/c4d9877af69ccbe3c46889ec5bab7165
LD_DEBUG=libs,files glxgears 2>&1 | tee glxgears.log
https://gist.github.com/tgurr/7c610adfbfbbb6ae8031f6965ce69321
A build log from the way Exherbo builds Mesa would be useful information, if you can easily obtain it.
x86_64: https://gist.githubusercontent.com/tgurr/73a69c6414dc424517d3eb0693ffffe1/raw/8046bf3b420e6dd455783f93d3af5f9f1adf0d33/gistfile1.txt x86: https://gist.githubusercontent.com/tgurr/c07653ad03c2be9c253876b6da90bb68/raw/0bc242ce0c7b4875aa1236bb494a80df8ef2bb7d/gistfile1.txt
Please let me know if I missed something and/or if you need further details I'm able to provide.
objdump -T -x /usr/x86_64-pc-linux-gnu/lib/libGLX_mesa.so.0
https://gist.github.com/tgurr/c4d9877af69ccbe3c46889ec5bab7165
Oh no. I can see why this is not working:
Dynamic Section:
...
NEEDED libgallium.so
...
RPATH /usr/x86_64-pc-linux-gnu/lib/dri
That's the legacy DT_RPATH, not the more modern DT_RUNPATH (see ld.so(8) for what the difference is).
libcapsule (and therefore pressure-vessel, and therefore SLR) doesn't support DT_RPATH, only DT_RUNPATH. This limitation is because the semantics of DT_RPATH are really annoying to implement (it has an "action at a distance" behaviour that affects the entire dependency tree).
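For anyone checking their own libraries for the same problem, this sketch classifies a Dynamic Section listing by which tag it carries. It assumes the two-column "RPATH <path>" / "RUNPATH <path>" layout shown in the objdump excerpt above; the function name is hypothetical.

```shell
# Read an objdump Dynamic Section listing on stdin.
# Prints "DT_RPATH <path>" or "DT_RUNPATH <path>" for each such entry;
# exits non-zero if the object carries neither tag.
rpath_kind() {
    awk '
        $1 == "RPATH"   { print "DT_RPATH " $2;   found = 1 }
        $1 == "RUNPATH" { print "DT_RUNPATH " $2; found = 1 }
        END { exit !found }
    '
}

# Example (not run here):
#   objdump -x /usr/x86_64-pc-linux-gnu/lib/libGLX_mesa.so.0 | rpath_kind
```

A library that prints DT_RPATH here is one that libcapsule will not be able to resolve dependencies for in the way described above.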
Does Exherbo do something in its toolchain to avoid DT_RUNPATH and go back to the older DT_RPATH?
In most distros (e.g. Debian), linking with Meson install_rpath results in linking with -Wl,-rpath,... which actually generates a DT_RUNPATH, unless the linker flags also include -Wl,--disable-new-dtags.
I don't see an explicit --disable-new-dtags or --enable-new-dtags in your build log, so presumably you're getting your linker's default behaviour.
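The same check can be repeated on any build log with a sketch like this one. It only looks for the two option names as literal text, so it cannot tell what the linker's built-in default is; the function name is hypothetical.

```shell
# Read a build log on stdin and count explicit new-dtags options, if any.
explicit_dtags_flags() {
    flags=$(grep -o -e '--enable-new-dtags' -e '--disable-new-dtags' | sort | uniq -c) || true
    if [ -n "$flags" ]; then
        printf '%s\n' "$flags"
    else
        echo "none found: the linker default applies"
    fi
}

# Example (not run here), where mesa-build.log is a saved copy of the build log:
#   explicit_dtags_flags < mesa-build.log
```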
Could this perhaps be because Debian configures binutils with ./configure --enable-new-dtags (and so do other distros like Fedora and Arch), but Exherbo does not?
I checked our binutils, and so far we don't pass either --disable-new-dtags or --enable-new-dtags to it, so it uses the defaults of the current version, 2.42. After adding --enable-new-dtags and recompiling binutils and mesa, Steam indeed works again!
I'll have to check whether we can simply add --enable-new-dtags to our binutils, and for now I will try to pass it explicitly to our mesa package.
The possible routes to get a DT_RUNPATH would be either to configure binutils with ./configure --enable-new-dtags, like Debian/Fedora/Arch do, and then use that binutils to recompile Mesa; or to recompile Mesa with -Wl,--enable-new-dtags added to the linker flags (LDFLAGS or Meson c_link_args or equivalent).
It's a bit confusing: there are two options with the same name and basically the same effect, in two different places.
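For the second route, a sketch of adding the flag to LDFLAGS without duplicating it. The helper name is hypothetical, and it does not try to remove a conflicting -Wl,--disable-new-dtags that might already be in the flags.

```shell
# Print the given linker flags with -Wl,--enable-new-dtags appended,
# unless that exact flag is already present.
with_new_dtags() {
    case " $1 " in
        *" -Wl,--enable-new-dtags "*) printf '%s\n' "$1" ;;
        "  ")                         printf '%s\n' "-Wl,--enable-new-dtags" ;;
        *)                            printf '%s\n' "$1 -Wl,--enable-new-dtags" ;;
    esac
}

# Example (not run here): Meson picks up LDFLAGS from the environment
# when the build directory is first configured.
#   LDFLAGS="$(with_new_dtags "$LDFLAGS")" meson setup build
```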
There are a couple of reasons why most other distributions have moved to DT_RUNPATH ("new dtags") and away from DT_RPATH.
DT_RPATH is higher-precedence than $LD_LIBRARY_PATH, so $LD_LIBRARY_PATH is effectively ignored for any library that can be found via DT_RPATH. This often comes as a surprise to users and developers who are trying to use $LD_LIBRARY_PATH to substitute a newer version of a library, and similarly can be harmful to portability frameworks like the (older LD_LIBRARY_PATH-based version of the) Steam Runtime.
DT_RPATH also has the "action at a distance" semantics that I mentioned earlier, where the DT_RPATH on the main executable or on library A can affect the search path that's used for the dependencies of library B, which is not always what's wanted.
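The precedence difference can be summarised with a small sketch. This is a simplified model of the lookup order documented in ld.so(8), not a reimplementation of the dynamic linker: it leaves out the "action at a distance" inheritance of DT_RPATH and special cases such as secure-execution mode. The function name is hypothetical.

```shell
# Print the sources the dynamic linker consults, in order, for an object
# carrying the given tag. $1 is "rpath" or "runpath".
search_order() {
    case "$1" in
        rpath)   printf '%s\n' DT_RPATH LD_LIBRARY_PATH ld.so.cache "default paths" ;;
        runpath) printf '%s\n' LD_LIBRARY_PATH DT_RUNPATH ld.so.cache "default paths" ;;
        *)       echo "usage: search_order rpath|runpath" >&2; return 2 ;;
    esac
}

# Example:
#   search_order rpath     # DT_RPATH comes before LD_LIBRARY_PATH
#   search_order runpath   # LD_LIBRARY_PATH comes before DT_RUNPATH
```

With DT_RUNPATH, an LD_LIBRARY_PATH entry gets the first chance to supply a library, which is what LD_LIBRARY_PATH-based frameworks rely on.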
I've implemented a package-specific workaround for our mesa >= 24.2.0-rc1: https://gitlab.exherbo.org/exherbo/x11/-/merge_requests/858 and proposed changing our binutils defaults to move away from the legacy behaviour (https://gitlab.exherbo.org/exherbo/arbor/-/merge_requests/4119). Since it's not the default in binutils, even in recent versions, probably no one had looked into this yet, and it may just be an oversight that we didn't make this change as well.
Again I can't tell you how grateful I am for your help and willingness to do so, I'm pretty sure I wouldn't have been able to figure this out on my own and if things led to improving the distribution as a whole even better. You're really a person to rely on and you've always been helpful to figure out stuff and get things going in the first place and times like these where things break. Thanks for taking the time and jumping in to provide such a support that can not be taken for granted. I hope the other things you've mentioned in the mesa bugreport which kind of started a discussion will also result in further improvements for everyone.
Again thank you from the deepest of my heart!
In the short term, https://gitlab.exherbo.org/exherbo/x11/-/merge_requests/858 looks like a good solution for Exherbo. If there are other distributions where binutils still defaults to --disable-new-dtags, then an equivalent issue would exist in those distributions if they upgrade their Mesa to 24.2.0-rc1, and a change equivalent to !858 would be an equally good short-term solution for those other distros.
In the medium term, if https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30328 gets merged before the 24.2.0 stable release (preferably before 24.2.0-rc2), I believe it would avoid this issue completely.
As a long-term improvement, https://gitlab.exherbo.org/exherbo/arbor/-/merge_requests/4119 looks like a positive change, and I would encourage non-Exherbo distributions to do similarly if they haven't already.
I know that at least Arch, Debian/Ubuntu, Fedora and Gentoo default to --enable-new-dtags already. RHEL might still default to --disable-new-dtags if I'm reading its specfile correctly (but I didn't look very closely). For other distros (e.g. openSUSE) I don't know the situation, but I suspect that many of them default to --enable-new-dtags.
In the short term, https://gitlab.exherbo.org/exherbo/x11/-/merge_requests/858 looks like a good solution for Exherbo. If there are other distributions where binutils still defaults to --disable-new-dtags, then an equivalent issue would exist in those distributions if they upgrade their Mesa to 24.2.0-rc1, and a change equivalent to !858 would be an equally good short-term solution for those other distros.
In the medium term, if https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30328 gets merged before the 24.2.0 stable release (preferably before 24.2.0-rc2), I believe it would avoid this issue completely.
A small additional note: the mentioned merge request has now landed in Mesa main (as you probably know, given your comment suggesting cherry-picking it to staging/24.2 as well) as https://gitlab.freedesktop.org/mesa/mesa/-/commit/9b7bb6cc9fa410fb783e7a99d9eadcc31668f298. I've now replaced our workaround by applying that commit to our 24.2.0-rc1 package instead: https://gitlab.exherbo.org/exherbo/x11/-/merge_requests/859
I believe updating to Mesa 24.2.0-rc2 should resolve this. If so, I think we can close the issue - I don't think running versions of OS components that are prereleases and also not up to date is a major use-case for Steam.
Seconded. Mesa 24.2.0-rc2 includes the upstream fix, so I consider the issue resolved as well. Thanks to you, it didn't hit any stable Mesa release, and it resulted in an upstream fix that not only fixes the issue on Exherbo but probably has additional benefits for everyone.
I consider the issue resolved as well
Please could you close the issue, then?
(As the issue submitter, you are allowed to close it; and as a moderator, @kisak-valve is allowed to close it; but I can't.)
Your system information
Please describe your issue in as much detail as possible:
The Steam client is not starting anymore and shows the following error message:
From command line output:
Command line output: https://gist.github.com/tgurr/f293c2a6887054ed914414b1da9206a0
Steps for reproducing this issue:
Additional information
mesa bugreport: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11544
There are three new files in the dri directory:
and the "old" ones now seem to be just symlinks. Maybe that causes issues for the container stuff and how it's handled by pressure vessel (for Exherbo)?
mesa-24.1.4:
mesa-24.2.0-rc1 & git main:
Note that for mesa 24.2.0-rc1 there's a spurious error message popping up (MESA-LOADER: failed to open radeonsi: driver not built!), which is fixed in git main with https://gitlab.freedesktop.org/mesa/mesa/-/commit/159a3edd80a988dec263708f851ed35eec881a78. Applying that patch to 24.2.0-rc1, however, didn't change the outcome. I first thought the error message might be confusing Steam in some way, but apparently the issue is not that easy to solve.
Also, I'm not sure whether it's relevant in this case, but a similar change happened for the file in the vdpau directory:
mesa-24.1.4:
mesa-24.2.0-rc1 & git main: