Closed SuperSandro2000 closed 1 month ago
Please could you provide a full log, not just snippets of a log, and not just your interpretation of a log? To make pressure-vessel work on unusual platforms like NixOS, we really need to be able to see the whole picture. If you jump directly to the aspect that you think is wrong, there's a serious risk that we are looking at a symptom and not at the real problem.
On https://github.com/NixOS/nixpkgs/pull/330468 you mentioned "yet another bug which I could fix by verifiying the runtime files", and other evidence of filesystem corruption. Before going any further, please check that the runtime you are trying to use is intact. You can do this with:
~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel/bin/pv-verify
I think pressure-vessel assumes that the symlink it finds inside steam-run is already in its own container while it is in steam-run's container. I found a piece of code that fits my assumption well enough for me to believe it.
Sorry, no, you're misinterpreting that comment. By the time we get to that point, the symlink we are looking at is one that was created by pressure-vessel. If it's intended to point to /lib/foo in NixOS' "FHS env", then the symlink target that we're looking at in that function will actually be more like /run/host/lib/foo.
(But actually we do a realpath() on it, resolving symlinks, so it will probably actually be /nix/something, which you'll notice is not in exclude_prefixes.)
And, yes, this is all very confusing: the pressure-vessel codebase works with absolute paths in two (or occasionally three) different "worlds", and we have to keep very careful track of which "world" we are in at any given time. NixOS' "FHS env" adds one more "world" to that, making it even harder to keep track of. If you attach a full log, we can look into it.
Please could you provide a full log, not just snippets of a log, and not just your interpretation of a log?
It would be nice if the logging did not, by default, leak secrets in the environment such as GITHUB_TOKEN.
./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run --verbose -- ldconfig -p |& gh gist create -
https://gist.github.com/SuperSandro2000/9486300b27d74f17d9df08196460b0b2
"yet another bug which I could fix by verifiying the runtime files", and other evidence of filesystem corruption. Before going any further, please check that the runtime you are trying to use is intact. You can do this with:
I copied my Steam library from flatpak into the normal path. I assume something went wrong there. Now things are as they should be.
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel/bin/pv-verify
pv-verify[74562]: N: Verified "/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240423.85483/files" against "/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240423.85483/usr-mtree.txt.gz" successfully
pv-verify[74562]: N: Verified "/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper" against "/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/mtree.txt.gz" successfully
If it's intended to point to /lib/foo in NixOS' "FHS env", then the symlink target that we're looking at in that function will actually be more like /run/host/lib/foo.
(But actually we do a realpath() on it, resolving symlinks, so it will probably actually be /nix/something, which you'll notice is not in exclude_prefixes.)
Maybe something is off there? When I applied NixOS/nixpkgs@806b4f8 (#330468) and NixOS/nixpkgs@f059494 (#330468), which basically turned all the symlinks in our FHS into the actual files (and also inflated the size of the FHS to almost 6GB), things started to work.
For any other library like libXext.so.6 it seems to work
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run --verbose -- ldconfig -p |& rg libXext.so.6 -C 1
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libxml2.so.2.9.10 because overrides/lib/x86_64-linux-gnu/libxml2.so.2 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libXext.so.6.4.0 because overrides/lib/x86_64-linux-gnu/libXext.so.6 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libva-drm.so.2 because overrides/lib/x86_64-linux-gnu/libva-drm.so.2 replaces it
--
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1 because overrides/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libXext.so.6 because overrides/lib/x86_64-linux-gnu/libXext.so.6 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0 because overrides/lib/x86_64-linux-gnu/libX11-xcb.so.1 replaces it
--
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libxml2.so.2.9.10 because overrides/lib/i386-linux-gnu/libxml2.so.2 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libXext.so.6.4.0 because overrides/lib/i386-linux-gnu/libXext.so.6 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libva-drm.so.2 because overrides/lib/i386-linux-gnu/libva-drm.so.2 replaces it
--
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libdrm_amdgpu.so.1 because overrides/lib/i386-linux-gnu/libdrm_amdgpu.so.1 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libXext.so.6 because overrides/lib/i386-linux-gnu/libXext.so.6 replaces it
pressure-vessel-wrap[77620]: D: Deleting tmp-*/lib/i386-linux-gnu/libGLX.so.0.0.0 because overrides/lib/i386-linux-gnu/libGLX.so.0 replaces it
--
pressure-vessel-wrap[77620]: D: Will export read-only: /nix/store/xfs2pbni0s2j1nlhfpvvsp6pc2xym6a4-libxcb-1.17.0/lib/libxcb-sync.so.1.0.0
pressure-vessel-wrap[77620]: D: Exporting /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0 because overrides/lib/i386-linux-gnu/libXext.so.6 points to it
pressure-vessel-wrap[77620]: D: Trying to export read-only: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: /nix is not a symlink
--
pressure-vessel-wrap[77620]: D: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib is not a symlink
pressure-vessel-wrap[77620]: D: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0 is not a symlink
pressure-vessel-wrap[77620]: D: Will export read-only: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: Exporting /nix/store/xfs2pbni0s2j1nlhfpvvsp6pc2xym6a4-libxcb-1.17.0/lib/libxcb-shm.so.0.0.0 because overrides/lib/i386-linux-gnu/libxcb-shm.so.0 points to it
--
pressure-vessel-wrap[77620]: D: Will export read-only: /nix/store/vf553z7mi2vqk8ca6kkfd9x5gy3nnz0p-libunistring-1.1/lib/libunistring.so.5.0.0
pressure-vessel-wrap[77620]: D: Exporting /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0 because overrides/lib/x86_64-linux-gnu/libXext.so.6 points to it
pressure-vessel-wrap[77620]: D: Trying to export read-only: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: /nix is not a symlink
--
pressure-vessel-wrap[77620]: D: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib is not a symlink
pressure-vessel-wrap[77620]: D: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0 is not a symlink
pressure-vessel-wrap[77620]: D: Will export read-only: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: Exporting /nix/store/hlmrl1k9smf9xrbsq97cgbkv10176iab-libxcb-1.17.0/lib/libxcb-randr.so.0.1.0 because overrides/lib/x86_64-linux-gnu/libxcb-randr.so.0 points to it
--
pressure-vessel-wrap[77620]: D: "/nix/store/34bb7cb2p5lxqgm9irrrqlp17s23hww2-libva-1.8.3/lib/libva-x11.so.1.4000.0" is meant to be shared (ro or rw) with the container
pressure-vessel-wrap[77620]: D: "/nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0" is meant to be shared (ro or rw) with the container
pressure-vessel-wrap[77620]: D: "/nix/store/3f1v43h6m9k9krhch4qszr85d6ami3l5-graphics-drivers-32bit/lib/libEGL_mesa.so.0" is meant to be a symlink
--
pressure-vessel-wrap[77620]: D: "/nix/store/a1370wggay4293caa9z82dc5dazmyg5a-ncurses-6.4.20221231/lib/libncursesw.so.6.4" is meant to be shared (ro or rw) with the container
pressure-vessel-wrap[77620]: D: "/nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0" is meant to be shared (ro or rw) with the container
pressure-vessel-wrap[77620]: D: "/nix/store/b3sjqm35cp6yi6j5cddb893cigzcpcln-libX11-1.8.9/lib/libX11-xcb.so.1.0.0" is meant to be shared (ro or rw) with the container
--
pressure-vessel-wrap[77620]: D: --ro-bind /nix/store/34bb7cb2p5lxqgm9irrrqlp17s23hww2-libva-1.8.3/lib/libva-x11.so.1.4000.0 /nix/store/34bb7cb2p5lxqgm9irrrqlp17s23hww2-libva-1.8.3/lib/libva-x11.so.1.4000.0
pressure-vessel-wrap[77620]: D: --ro-bind /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0 /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: --bind /nix/store/3f1v43h6m9k9krhch4qszr85d6ami3l5-graphics-drivers-32bit/share /nix/store/3f1v43h6m9k9krhch4qszr85d6ami3l5-graphics-drivers-32bit/share
--
pressure-vessel-wrap[77620]: D: --ro-bind /nix/store/a1370wggay4293caa9z82dc5dazmyg5a-ncurses-6.4.20221231/lib/libncursesw.so.6.4 /nix/store/a1370wggay4293caa9z82dc5dazmyg5a-ncurses-6.4.20221231/lib/libncursesw.so.6.4
pressure-vessel-wrap[77620]: D: --ro-bind /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0 /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: --ro-bind /nix/store/b3sjqm35cp6yi6j5cddb893cigzcpcln-libX11-1.8.9/lib/libX11-xcb.so.1.0.0 /nix/store/b3sjqm35cp6yi6j5cddb893cigzcpcln-libX11-1.8.9/lib/libX11-xcb.so.1.0.0
--
pressure-vessel-wrap[77620]: D: '--ro-bind'
pressure-vessel-wrap[77620]: D: '/nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0'
pressure-vessel-wrap[77620]: D: '/nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0'
pressure-vessel-wrap[77620]: D: '--bind'
--
pressure-vessel-wrap[77620]: D: '--ro-bind'
pressure-vessel-wrap[77620]: D: '/nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0'
pressure-vessel-wrap[77620]: D: '/nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0'
pressure-vessel-wrap[77620]: D: '--ro-bind'
--
pressure-vessel-wrap[77620]: D: --ro-bind
pressure-vessel-wrap[77620]: D: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: /nix/store/34ggaz0f3s1q39nq6jk44cjw0id2vvnx-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: --bind
--
pressure-vessel-wrap[77620]: D: --ro-bind
pressure-vessel-wrap[77620]: D: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: /nix/store/a753y1mlbf39xkhifx01dpxczgcjx01h-libXext-1.3.6/lib/libXext.so.6.4.0
pressure-vessel-wrap[77620]: D: --ro-bind
--
libXfixes.so.3 (libc6) => /usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu/libXfixes.so.3
libXext.so.6 (libc6,x86-64) => /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libXext.so.6
libXext.so.6 (libc6) => /usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu/libXext.so.6
libXdmcp.so.6 (libc6,x86-64) => /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libXdmcp.so.6
--
libXfixes.so.3 (libc6) => /usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu/libXfixes.so.3
libXext.so.6 (libc6,x86-64) => /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libXext.so.6
libXext.so.6 (libc6) => /usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu/libXext.so.6
libXdmcp.so.6 (libc6,x86-64) => /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libXdmcp.so.6
It would be nice if the logging did not, by default, leak secrets in the environment such as GITHUB_TOKEN.
Sorry, we can't know in advance what the purpose of every environment variable is: some of them are things we will need to know to debug an issue, but we don't necessarily know which ones (if we knew that a particular environment variable was important, we'd have at least a partial implementation of handling it, so in some ways it's the environment variables that we don't know about that are the most important).
If we filtered out known secrets like GITHUB_TOKEN with a denylist, that would just provide a false sense of security where it would be more surprising that unknown secrets weren't filtered.
I'd recommend only exporting GITHUB_TOKEN temporarily, during times that you're actively using it.
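One way to follow that recommendation is to scope the variable to a single command instead of exporting it for the whole session. The sketch below is only an illustration: with_token is a hypothetical helper name and the token value is a placeholder.

```shell
#!/bin/sh
set -eu

# Make sure the token is not exported in this shell.
unset GITHUB_TOKEN

# Run one command with the token set, without exporting it to later commands.
# The value is a placeholder; a real setup would read it from a secret store.
with_token() {
    env GITHUB_TOKEN="example-not-a-real-token" "$@"
}

# The wrapped command sees the token; anything started afterwards does not.
inside=$(with_token sh -c 'printf %s "${GITHUB_TOKEN:-unset}"')
outside=$(sh -c 'printf %s "${GITHUB_TOKEN:-unset}"')

echo "inside with_token: $inside"
echo "outside:           $outside"
```

With this arrangement, only the command that actually needs the token is wrapped, and the command that produces the verbose log runs without it in its environment.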
The problem here isn't that the assumption you saw is wrong: the problem is that capsule-capture-libs is incorrectly creating symlinks that are pointing to the wrong place. That has the side-effect that it breaks the assumption you saw, but it will also make the symlinks not work as intended, so they'd be wrong even if we didn't have that assumption.
So you're right to think that "overrides/lib/i386-linux-gnu/libGL.so.1 points to container-side path /lib32/libGL.so.1" indicates a problem. What should have happened is that capsule-capture-libs should have created a symlink more like one of these:
overrides/lib/i386-linux-gnu/libGL.so.1 -> /run/host/lib32/libGL.so.1
or
overrides/lib/i386-linux-gnu/libGL.so.1 -> /nix/store/blahblahblah/lib/libGL.so.1.2.3
And you're right that libXext is an example of a library where this worked correctly: you'll see that instead of "points to container-side path", for libXext you see a message more like
Exporting /nix/store/34…nx-libXext-1.3.6/lib/libXext.so.6.4.0 because overrides/lib/i386-linux-gnu/libXext.so.6 points to it
which means that library is working as intended.
I suspect that this has been triggered by some change in NixOS or its FHS environment affecting how libGL.so.1 (and libraries in the same situation) are handled, which results in us going onto a different code path that isn't 100% right.
➜ ./result/bin/steam-run ls -lah /lib/libGL.so.1
When you do that, does that mean: enter the "FHS environment", and then ls -lah /lib/libGL.so.1 inside?
I wonder whether it's the extra level of indirection, through /nix/store/ca…px-steam-run-usr-target, that is making this regress. Is that a recent change?
When I applied NixOS/nixpkgs@806b4f8 (#330468) and NixOS/nixpkgs@f059494 (#330468), which basically turned all the symlinks in our FHS into the actual files (and also inflated the size of the FHS to almost 6GB), things started to work.
Something that I think would be useful to try: instead of creating symlinks to symlinks, if you resolve the canonical physical path of the library below /nix/store (realpath /lib/libGL.so.1, etc. inside the FHS environment), and then create a symlink in the FHS environment pointing to that canonical physical path, does that work?
In principle pressure-vessel should be able to cope equally well with the extra layer of indirection, but it's the sort of thing that could easily trip us up somewhere.
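A minimal sketch of that experiment, run in a scratch directory rather than a real FHS environment: it builds a three-level symlink chain, resolves it with realpath, and then creates a single symlink straight to the canonical file. The directory names fhs, usr-target, libglvnd and flat are placeholders, not real store paths.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for the filesystem; all names are placeholders.
root=$(mktemp -d)
mkdir -p "$root/libglvnd/lib" "$root/usr-target/lib" "$root/fhs/lib" "$root/flat/lib"

# The real file, plus the usual SONAME symlink next to it.
: > "$root/libglvnd/lib/libGL.so.1.7.0"
ln -s libGL.so.1.7.0 "$root/libglvnd/lib/libGL.so.1"

# Two further levels of indirection (symlinks to symlinks).
ln -s "$root/libglvnd/lib/libGL.so.1" "$root/usr-target/lib/libGL.so.1"
ln -s "$root/usr-target/lib/libGL.so.1" "$root/fhs/lib/libGL.so.1"

# Resolve the whole chain to the canonical physical path...
canonical=$(realpath "$root/fhs/lib/libGL.so.1")

# ...and create one symlink that points directly at it.
ln -s "$canonical" "$root/flat/lib/libGL.so.1"

readlink "$root/fhs/lib/libGL.so.1"    # target is another symlink
readlink "$root/flat/lib/libGL.so.1"   # target is the real file
```

In the real FHS environment the equivalent change would have to be made wherever the /lib and /lib32 symlinks are generated, which this sketch does not attempt to show.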
Please try this [edited to add: in the non-working FHS environment for which you reported this bug]
/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-1234567, and it should contain bin, lib, lib64, usr, etc.
dest=$(mktemp -d)
CAPSULE_DEBUG=tool \
/home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel/libexec/steam-runtime-tools-0/x86_64-linux-gnu-capsule-capture-libs \
--remap-link-prefix /app/=/run/host/app/ \
--remap-link-prefix /usr/=/run/host/usr/ \
--remap-link-prefix /lib=/run/host/lib \
--provider / \
--container /home/sandro/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-1234567 \
--dest "$dest" \
'even-if-older:soname-match:libGL.so.*'
ls -l "$dest"
replacing the path to var/tmp-* and the $dest as appropriate.
This is a cut-down version of the (very long) command that pressure-vessel is running, with extra debug which I hope will help us figure out what is happening.
(Or you could run the whole container command with CAPSULE_DEBUG=tool in addition to the debug options that you already enabled, but that will be incredibly verbose!)
I'd recommend only exporting GITHUB_TOKEN temporarily, during times that you're actively using it.
That's a good idea. I already have many aliases and could do that with ease.
I suspect that this has been triggered by some change in NixOS or its FHS environment affecting how libGL.so.1 (and libraries in the same situation) are handled, which results in us going onto a different code path that isn't 100% right.
The way the FHS is built hasn't changed much in recent years. We switched from chroot to bubblewrap in 2020. Something might have changed in the depending packages like bubblewrap.
➜ ./result/bin/steam-run ./result/bin/steam-run ls -lah /lib/libGL.so.1
lrwxrwxrwx 1 nobody nogroup 79 Jan 1 1970 /lib/libGL.so.1 -> /nix/store/baqnhyii1430mcxfpyhmyxqk8zshbcym-steam-run-usr-target/lib/libGL.so.1
➜ ./result/bin/steam-run ./result/bin/steam-run ls -lah /nix/store/baqnhyii1430mcxfpyhmyxqk8zshbcym-steam-run-usr-target/lib/libGL.so.1
lrwxrwxrwx 1 nobody nogroup 73 Jan 1 1970 /nix/store/baqnhyii1430mcxfpyhmyxqk8zshbcym-steam-run-usr-target/lib/libGL.so.1 -> /nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1
I wonder whether it's the extra level of indirection, through /nix/store/ca…px-steam-run-usr-target, that is making this regress. Is that a recent change?
I've looked through the recent changes and couldn't find anything that would fit. The FHS code history is at https://github.com/NixOS/nixpkgs/commits/master/pkgs/build-support/build-fhsenv-bubblewrap. There were some changes for /etc, and a change to LD_LIBRARY_PATH (https://github.com/NixOS/nixpkgs/commit/a9600ce894573076f4fe9fda4f820c226f50dcd7) which didn't influence the outcome when reverted.
The changes to steam-fhs (https://github.com/NixOS/nixpkgs/commits/master/pkgs/games/steam/fhsenv.nix) are also innocent and aren't related to ldconfig, /lib or LD_LIBRARY_PATH.
Something that I think would be useful to try: instead of creating symlinks to symlinks, if you resolve the canonical physical path of the library below /nix/store (realpath /lib/libGL.so.1, etc. inside the FHS environment), and then create a symlink in the FHS environment pointing to that canonical physical path, does that work?
I just spent 30 minutes trying to create that but couldn't, because of our bind mount/symlink hell.
I managed to create it briefly yesterday while debugging, but I can't reproduce that right now.
Just running ldconfig -p inside the fhs shows libGL.so.1 correctly.
➜ ./result/bin/steam-run ldconfig -p |& rg libGL.so.1
libGL.so.1 (libc6,x86-64) => /lib/libGL.so.1
libGL.so.1 (libc6) => /lib32/libGL.so.1
It seems that libGL.so.1 is correctly placed in /run/host, but pressure-vessel somehow ends up thinking that it should be at /lib/libGL.so.1:
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ls -lah /lib/libGL.so.1
x86_64-linux-gnu-capsule-capture-libs: warning: Dependencies of libnvidia-pkcs11.so.550.100 not found, ignoring: Missing dependencies: Could not find "libcrypto.so.1.1" in LD_LIBRARY_PATH (unset), ld.so.cache, DT_RUNPATH or fallback /lib:/usr/lib
pressure-vessel-wrap[101412]: W: Found more than one possible libdrm data directory from provider
ls: cannot access '/lib/libGL.so.1': No such file or directory
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ls -lah /run/host/lib/libGL.so.1
x86_64-linux-gnu-capsule-capture-libs: warning: Dependencies of libnvidia-pkcs11.so.550.100 not found, ignoring: Missing dependencies: Could not find "libcrypto.so.1.1" in LD_LIBRARY_PATH (unset), ld.so.cache, DT_RUNPATH or fallback /lib:/usr/lib
pressure-vessel-wrap[101601]: W: Found more than one possible libdrm data directory from provider
lrwxrwxrwx 1 nobody nogroup 79 Jan 1 1970 /run/host/lib/libGL.so.1 -> /nix/store/wmyi11npl3b1r3g53difcasz10w7f76v-steam-run-usr-target/lib/libGL.so.1
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ls -lah /nix/store/wmyi11npl3b1r3g53difcasz10w7f76v-steam-run-usr-target/lib/libGL.so.1
x86_64-linux-gnu-capsule-capture-libs: warning: Dependencies of libnvidia-pkcs11.so.550.100 not found, ignoring: Missing dependencies: Could not find "libcrypto.so.1.1" in LD_LIBRARY_PATH (unset), ld.so.cache, DT_RUNPATH or fallback /lib:/usr/lib
pressure-vessel-wrap[101832]: W: Found more than one possible libdrm data directory from provider
lrwxrwxrwx 1 nobody nogroup 73 Jan 1 1970 /nix/store/wmyi11npl3b1r3g53difcasz10w7f76v-steam-run-usr-target/lib/libGL.so.1 -> /nix/store/d6gvgzzifggrb7fh1v0yi8bvrdlwhpqa-libglvnd-1.7.0/lib/libGL.so.1
I hope that helps you better understand things. Thanks for all the help so far.
Something might have changed in the depending packages like bubblewrap
bubblewrap is really quite simple in many ways and mostly just does as it's told, so I think it's more likely that the trigger for this was something in the FHS environment setup, or in the way the affected libraries like libGL are packaged in NixOS. (Or in SLR, but I don't think we have changed anything recently that would be relevant to this?)
What is the most recent date on which you're reasonably confident that this was working as intended?
➜ ./result/bin/steam-run ./result/bin/steam-run ls -lah /lib/libGL.so.1
Is this setting up a FHS environment inside a FHS environment? That seems like an odd thing to do, and I'd suggest avoiding that - each layer of complexity that you introduce might cause new bugs.
CAPSULE_DEBUG=tool …/x86_64-linux-gnu-capsule-capture-libs …
I am a bit confused why that command seems to have worked correctly.
Yes. This looks like it's operating correctly: it's creating symlinks in $dest that point to the fully canonicalized path of the library, such as /nix/store/d6gvgzzifggrb7fh1v0yi8bvrdlwhpqa-libglvnd-1.7.0/lib/libGL.so.1.7.0. As I said in https://github.com/ValveSoftware/steam-runtime/issues/684#issuecomment-2258577417, that's what we want to happen - and if it had done this when it was run as part of SLR, then I think there would be no bug.
Did you run that command inside the FHS environment, or did you run it on the host system? If you ran it on the host system, please re-run it inside the FHS environment. If I'm understanding your system correctly, that might mean prefixing ./result/bin/steam-run env to it, or getting an interactive shell inside the FHS environment and running the original command from there.
Also, still inside the FHS environment, please run ldconfig -p (you can grep for libGL.so if you want to make the output less verbose, but it would be useful for me to see the whole thing). Again, ./result/bin/steam-run ldconfig -p on the host system might be one way to achieve that. [edited to add: never mind, you already did this, and the answer is /lib/libGL.so.1 for 64-bit and /lib32/libGL.so.1 for 32-bit.]
And then please find the line(s) in the ldconfig -p output that describes libGL.so.1, and, still inside the FHS environment, resolve their canonical absolute paths as seen inside the FHS environment. For example ./result/bin/steam-run realpath /lib/libGL.so.1.
If your OS has the namei tool available, that might also be useful: something like ./result/bin/steam-run namei /lib/libGL.so.1 would clarify how the various levels of symlinks are set up, and then we could compare its output with capsule-capture-libs debug output.
I think this might also be the point at which I have to ask you to run the whole container with CAPSULE_DEBUG=tool or even CAPSULE_DEBUG=all, perhaps something like this:
./result/bin/steam-run \
env CAPSULE_DEBUG=tool \
~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run --verbose -- true
The log for that is going to be inconveniently large, but it might be the only way to figure out what is happening.
Running ldconfig -p inside the nixos fhs and then through pressure-vessel doesn't show libGL.so.1.
Yes, we already knew that: the bug we're trying to figure out here is that pressure-vessel (or something that it runs, probably actually capsule-capture-libs) doesn't set up the libGL.so.1 symlink correctly. As a result of that, when we rebuild the ld.so.cache inside the SLR container, there is no libGL.so.1 for us to find (most likely it's a dangling symlink), so libGL.so.1 isn't written into the cache. This means it isn't a surprise to see that it's missing from ldconfig -p output inside the final SLR container.
This is not going to help us to find a solution, because it's too late in the process: the damage has already been done.
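For illustration only, the "dangling symlink" state mentioned above can be reproduced and detected in a scratch directory. This is a sketch, not part of pressure-vessel: is_dangling is a hypothetical helper, and the file names are placeholders.

```shell
#!/bin/sh
set -eu

# Report whether a path is a symlink whose target cannot be resolved.
is_dangling() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

scratch=$(mktemp -d)
: > "$scratch/libGL.so.1.7.0"

# One symlink with a valid target, one whose target does not exist.
ln -s "$scratch/libGL.so.1.7.0" "$scratch/good.so.1"
ln -s /lib/does-not-exist-in-this-sketch/libGL.so.1 "$scratch/dangling.so.1"

for link in "$scratch/good.so.1" "$scratch/dangling.so.1"; do
    if is_dangling "$link"; then
        echo "dangling: $link -> $(readlink "$link")"
    else
        echo "ok:       $link"
    fi
done
```

A library whose override symlink is in the dangling state cannot be opened through that symlink, which is consistent with it not appearing in the rebuilt cache.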
What is the most recent date on which you're reasonably confident that this was working as intended?
I've used the flatpak version for the past year because the NixOS package with all its 32-bit dependencies is quite big.
Is this setting up a FHS environment inside a FHS environment? That seems like an odd thing to do, and I'd suggest avoiding that - each layer of complexity that you introduce might cause new bugs.
That's just me copy-pasting something odd but the output stays the same.
Did you run that command inside the FHS environment, or did you run it on the host system? If you ran it on the host system, please re-run it inside the FHS environment. If I'm understanding your system correctly, that might mean prefixing ./result/bin/steam-run env to it, or getting an interactive shell inside the FHS environment and running the original command from there.
I should have mentioned that. I've run all commands inside steam-run. I can just open bash inside it and run most commands from there. Nothing setuid and I can't really change the files that bubblewrap bind-mounted.
When I run the commands on the host directly without any fhs env then things already break very early on.
➜ ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ls -lah /lib/libGL.so.1
pressure-vessel-wrap[83014]: E: Child process exited with code 1: bwrap: execvp true: No such file or directory
[edited to add: never mind, you already did this, and the answer is /lib/libGL.so.1 for 64-bit and /lib32/libGL.so.1 for 32-bit.]
Yes, that is correct.
And then please find the line(s) in the ldconfig -p output that describes libGL.so.1, and, still inside the FHS environment, resolve their canonical absolute paths as seen inside the FHS environment. For example ./result/bin/steam-run realpath /lib/libGL.so.1.
➜ ./result/bin/steam-run ldconfig -p | rg libGL.so.1
libGL.so.1 (libc6,x86-64) => /lib/libGL.so.1
libGL.so.1 (libc6) => /lib32/libGL.so.1
➜ ./result/bin/steam-run realpath /lib/libGL.so.1
/nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1.7.0
➜ ./result/bin/steam-run ls -lah /lib/libGL.so.1
lrwxrwxrwx 2 nobody nogroup 79 Jan 1 1970 /lib/libGL.so.1 -> /nix/store/gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target/lib/libGL.so.1
➜ ./result/bin/steam-run ls -lah /nix/store/gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target/lib/libGL.so.1
lrwxrwxrwx 8 nobody nogroup 73 Jan 1 1970 /nix/store/gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target/lib/libGL.so.1 -> /nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1
➜ ./result/bin/steam-run ls -lah /nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1
lrwxrwxrwx 17 nobody nogroup 14 Jan 1 1970 /nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1 -> libGL.so.1.7.0
➜ ./result/bin/steam-run realpath /lib32/libGL.so.1
/nix/store/qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0/lib/libGL.so.1.7.0
➜ ./result/bin/steam-run ls -lah /lib32/libGL.so.1
lrwxrwxrwx 2 nobody nogroup 78 Jan 1 1970 /lib32/libGL.so.1 -> /nix/store/4y9441mywsibp9ic2pklz284pf15d4mm-steam-run-usr-multi/lib/libGL.so.1
➜ ./result/bin/steam-run ls -lah /nix/store/4y9441mywsibp9ic2pklz284pf15d4mm-steam-run-usr-multi/lib/libGL.so.1
lrwxrwxrwx 4 nobody nogroup 73 Jan 1 1970 /nix/store/4y9441mywsibp9ic2pklz284pf15d4mm-steam-run-usr-multi/lib/libGL.so.1 -> /nix/store/qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0/lib/libGL.so.1
➜ ./result/bin/steam-run ls -lah /nix/store/qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0/lib/libGL.so.1
lrwxrwxrwx 17 nobody nogroup 14 Jan 1 1970 /nix/store/qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0/lib/libGL.so.1 -> libGL.so.1.7.0
If your OS has the namei tool available, that might also be useful: something like ./result/bin/steam-run namei /lib/libGL.so.1 would clarify how the various levels of symlinks are set up, and then we could compare its output with capsule-capture-libs debug output.
OFC we have util-linux commands :)
➜ ./result/bin/steam-run namei /lib/libGL.so.1
f: /lib/libGL.so.1
d /
l lib -> usr/lib
d usr
l lib -> lib64
d lib64
l libGL.so.1 -> /nix/store/gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target/lib/libGL.so.1
d /
d nix
d store
d gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target
d lib
l libGL.so.1 -> /nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1
d /
d nix
d store
d rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0
d lib
l libGL.so.1 -> libGL.so.1.7.0
- libGL.so.1.7.0
➜ ./result/bin/steam-run namei /lib32/libGL.so.1
f: /lib32/libGL.so.1
d /
l lib32 -> usr/lib32
d usr
d lib32
l libGL.so.1 -> /nix/store/4y9441mywsibp9ic2pklz284pf15d4mm-steam-run-usr-multi/lib/libGL.so.1
d /
d nix
d store
d 4y9441mywsibp9ic2pklz284pf15d4mm-steam-run-usr-multi
d lib
l libGL.so.1 -> /nix/store/qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0/lib/libGL.so.1
d /
d nix
d store
d qv5qnw55sva7i0wvbpi7mskk90sdmq4z-libglvnd-1.7.0
d lib
l libGL.so.1 -> libGL.so.1.7.0
- libGL.so.1.7.0
The log for that is going to be inconveniently large, but it might be the only way to figure out what is happening.
➜ ./result/bin/steam-run env CAPSULE_DEBUG=tool ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run --verbose -- true |& gh gist create -
https://gist.github.com/SuperSandro2000/227a53fecad510ffcea42076dc1d690c
Aha. So, I think the problem is that we're checking for libGL.so.1 twice:
ld.so.cache looking for anything resembling libGL.so.*
Last time you tried, a year ago, we only did (1.); but we added (2.) to address issues similar to #632, which means this would have regressed in late 2023 (beta) or early 2024 (stable).
The problem is that when we do (1.), it creates a symlink to the realpath() of the library, which is what we want:
cache_foreach_cb:libGL.so.1 matches libGL.so.*
capture_one:Explicitly requested libGL.so.1 from / even if older: "/nix/store/d6gvgzzifggrb7fh1v0yi8bvrdlwhpqa-libglvnd-1.7.0/lib/libGL.so.1.7.0"
capture_one:Link target initially: "/nix/store/d6gvgzzifggrb7fh1v0yi8bvrdlwhpqa-libglvnd-1.7.0/lib/libGL.so.1.7.0"
capture_one:Creating symlink /tmp/tmp.LH7H92kQbN/libGL.so.1 -> /nix/store/d6gvgzzifggrb7fh1v0yi8bvrdlwhpqa-libglvnd-1.7.0/lib/libGL.so.1.7.0
... but then when we do (2.), it goes differently:
capture_pattern:soname:libGL.so.1
capture_one:Explicitly requested libGL.so.1 from / even if older: "/lib/libGL.so.1"
capture_one:Link target initially: "/lib/libGL.so.1"
capture_one:Link target pursued to: "/nix/store/gc99smwq20jcbhjcbr8y0w9dsygc1njm-steam-run-usr-target/lib/libGL.so.1"
capture_one:Link target pursued to: "/nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1"
capture_one:Link target pursued to: "/nix/store/rcvwvfgk5dkwc3a8sza5q0351h8axbkh-libglvnd-1.7.0/lib/libGL.so.1.7.0"
capture_one:Creating symlink /proc/self/fd/18/lib/x86_64-linux-gnu/libGL.so.1 -> /lib/libGL.so.1
and because we do (2.) first, (1.) sees that the symlink already exists and doesn't get a chance to re-create it.
I think this probably just means that we need to make the symlink point to the realpath() of the library, and/or put the /run/host prefix on it if necessary (the former is probably better than the latter).
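To make the ordering problem concrete, here is a minimal Python sketch of "the first pass to create the symlink wins". It is not pressure-vessel or capsule-capture-libs code: the capture() helper and the /nix/store/example-... path are hypothetical, and the only behaviour modelled is that an existing symlink is left alone.

```python
import os
import tempfile


def capture(dest_dir, soname, target):
    """Create dest_dir/soname -> target unless a link already exists.

    Hypothetical helper: returns the target that the link ends up with.
    """
    link = os.path.join(dest_dir, soname)
    if os.path.lexists(link):
        # An earlier pass already captured this SONAME; keep its link.
        return os.readlink(link)
    os.symlink(target, link)
    return target


with tempfile.TemporaryDirectory() as dest:
    # Pass (2.) runs first and records the unresolved FHS-env path.
    first = capture(dest, "libGL.so.1", "/lib/libGL.so.1")
    # Pass (1.) would use the resolved path, but the link is already there.
    second = capture(
        dest,
        "libGL.so.1",
        "/nix/store/example-libglvnd-1.7.0/lib/libGL.so.1.7.0",  # made-up path
    )
    final_target = os.readlink(os.path.join(dest, "libGL.so.1"))

print(final_target)
```

In this toy model the second call never gets a chance to replace the /lib/libGL.so.1 target, which mirrors the behaviour described above.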
OFC we have util-linux commands :)
If you were trying to cope with as many weird Linux distributions as SLR does, you wouldn't be saying "of course" as though that was obviously true :-P
No, I don't get it. Those two invocations of capsule-capture-libs both end up in the same code path... but one of them finds libGL.so.1 at /lib/libGL.so.1, the other has already realpath()'d it to a path below /nix/store/, and I don't understand why there's a difference.
I think I do understand the bug here, though. After we do a realpath() on the library's filename, we try to decide whether we genuinely need to use the result (which is somewhat fragile against the host system being upgraded under us, although I think that can't actually happen on NixOS because of the way it works), or whether we can keep using the original path (which is somewhat more robust).
But when we do so, we only consider whether the realpath() needed the /run/host/ prefix prepended to turn it into a path that's resolvable in the container (in NixOS, the answer is: no), and we wrongly don't take into account whether the original filename would have also needed the /run/host/ prefix prepended (in NixOS' FHS environment, the answer is: yes).
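As a rough illustration of that decision, the following Python sketch models it as a toy. This is not the pressure-vessel implementation: REMAPPED_ROOTS, needs_host_prefix() and both link_target_*() functions are hypothetical names, the list of remapped directories is an assumption, and the "fixed" variant is only one plausible reading of the fix suggested earlier in the thread.

```python
HOST_PREFIX = "/run/host"

# Assumption: directories that the container replaces with its own runtime,
# so host files below them are only reachable through HOST_PREFIX.
REMAPPED_ROOTS = ("/usr", "/lib", "/lib64", "/bin", "/sbin", "/etc")


def needs_host_prefix(path):
    """True if a host path must be prefixed to be resolvable in the container."""
    return any(path == root or path.startswith(root + "/") for root in REMAPPED_ROOTS)


def in_container(path):
    """Translate a host path into one that resolves inside the container."""
    return HOST_PREFIX + path if needs_host_prefix(path) else path


def link_target_buggy(original, resolved):
    # Only the resolved path decides whether the prefix is added,
    # yet the original path is what ends up in the symlink.
    prefix = HOST_PREFIX if needs_host_prefix(resolved) else ""
    return prefix + original


def link_target_fixed(original, resolved):
    # If the two paths disagree about needing the prefix, the original
    # cannot safely stand in for the resolved one, so use the resolved path.
    if needs_host_prefix(original) != needs_host_prefix(resolved):
        return in_container(resolved)
    return in_container(original)


nix_original = "/lib/libGL.so.1"
nix_resolved = "/nix/store/example-libglvnd-1.7.0/lib/libGL.so.1.7.0"  # made-up path

print(link_target_buggy(nix_original, nix_resolved))
print(link_target_fixed(nix_original, nix_resolved))
```

Under these assumptions the buggy variant yields /lib/libGL.so.1, which dangles inside the container, while the other variant yields the /nix/store path. On a conventional distribution, where both paths live below /usr, the two variants agree.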
@refi64, are you able to reproduce this?
I think I've found a solution. Please could you try unpacking https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/jobs/640701/artifacts/raw/_build/pressure-vessel-bin.tar.gz, then replacing ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel with the result?
If you use the Proton or Steam Linux Runtime compatibility tools, you'll also want to do the same for steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel and/or steamapps/common/SteamLinuxRuntime_sniper/pressure-vessel in your Steam library.
(Note to self: this is !739 v2)
@SuperSandro2000, if you can test the version above ^ and it resolves the failure you are seeing, then we can look at getting that change integrated.
I think I do understand the bug here, though. After we do a realpath() on the library's filename, we try to decide whether we genuinely need to use the result (which is somewhat fragile against the host system being upgraded under us, although I think that can't actually happen on NixOS because of the way it works), or whether we can keep using the original path (which is somewhat more robust).
The store path can't be upgraded. It could be deleted, but not while the program has loaded it or otherwise holds an fd or similar on it (/proc/<pid>/maps, /proc/<pid>/environ, /proc/<pid>/exe, /proc/<pid>/fd/*).
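For anyone curious which store paths a running process currently has mapped, here is a small Python sketch that extracts them from text in the /proc/<pid>/maps format. It only illustrates the kind of reference mentioned above and makes no claim about how Nix itself scans for them. The store_paths_in_maps() helper and every path in the sample are made up.

```python
import re

# A store path is /nix/store/<name>; anything after the next "/" lies inside it.
STORE_PATH = re.compile(r"/nix/store/[^/\s]+")


def store_paths_in_maps(maps_text):
    """Return the set of /nix/store/<name> prefixes mentioned in maps-style text."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            match = STORE_PATH.match(fields[5])
            if match:
                found.add(match.group(0))
    return found


sample_maps = """\
7f1c2a000000-7f1c2a021000 r--p 00000000 00:1f 1234 /nix/store/example1-libglvnd-1.7.0/lib/libGL.so.1.7.0
7f1c2a021000-7f1c2a0a0000 r-xp 00021000 00:1f 1234 /nix/store/example1-libglvnd-1.7.0/lib/libGL.so.1.7.0
7f1c2b000000-7f1c2b010000 r--p 00000000 00:1f 5678 /nix/store/example2-glibc/lib/libc.so.6
55d0c0000000-55d0c0021000 rw-p 00000000 00:00 0 [heap]
7f1c2c000000-7f1c2c001000 rw-p 00000000 00:00 0
"""

mapped = store_paths_in_maps(sample_maps)
print(sorted(mapped))
```

To inspect a real process, the same function could be fed the contents of /proc/self/maps or another process's maps file, subject to the usual permissions.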
I think I've found a solution. Please could you try unpacking gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/jobs/640701/artifacts/raw/_build/pressure-vessel-bin.tar.gz, then replacing ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel with the result?
That seems to work 👍🏼
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ldconfig -p |& rg libGL.so.1
libGL.so.1 (libc6,x86-64) => /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libGL.so.1
libGL.so.1 (libc6) => /usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu/libGL.so.1
➜ ./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run -- ls -lah /run/host/lib/libGL.so.1
pressure-vessel-wrap[50565]: W: Found more than one possible libdrm data directory from provider
lrwxrwxrwx 1 nobody nogroup 79 Jan 1 1970 /run/host/lib/libGL.so.1 -> /nix/store/798w5pwvvlmrg30kymb2pvcfvb0njcs9-steam-run-usr-target/lib/libGL.so.1
If you use the Proton or Steam Linux Runtime compatibility tools, you'll also want to do the same for steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel and/or steamapps/common/SteamLinuxRuntime_sniper/pressure-vessel in your Steam library.
I replaced both occurrences, and then games like Mini Metro straight up worked again, and others like Mini Subway could at least display an error message. Thank you so much!
if you can test the version above ^ and it resolves the failure you are seeing, then we can look at getting that change integrated.
Don't rush me 😅 I migrated a mail setup for 50 users over the last few days.
Thanks for testing, I'll look into getting that change integrated for a future beta release (and eventually a stable release).
For the version of SLR that is used to run games, this change shipped as a beta yesterday. Please try the client_beta branch of the "Steam Linux Runtime 3.0 (sniper)" compatibility tool.
Using the beta version of SLR is the same as using the beta branch of a game, but instead of looking for the game in your Steam library, you should look for "Steam Linux Runtime 3.0 (sniper)".
The version of SLR that is used for steamwebhelper has a different update schedule and is not affected by this change, so for now, you'll need to continue to use a workaround to launch Steam and steamwebhelper on affected systems. One workaround is to get a preview of future versions by replacing ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/pressure-vessel with the result of unpacking a recent pressure-vessel-bin.tar.gz from https://repo.steampowered.com/pressure-vessel/snapshots/. Versions ≥ 0.20240806.0 include the fix for this issue, superseding the test-build that I linked in a previous comment.
After upgrading the Steam Linux Runtime 3.0 (sniper) compatibility tool, it should also be enough to run Steam with the environment variable STEAM_RUNTIME_SNIPER set to the full, absolute path to your steamapps/common/SteamLinuxRuntime_sniper directory.
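As a minimal sketch of that last workaround, the environment could be prepared in Python like this. The sniper_env() helper and the /games/SteamLibrary location are hypothetical. Only the variable name and the steamapps/common/SteamLinuxRuntime_sniper directory come from the comment above.

```python
import os


def sniper_env(library_root, base_env=None):
    """Return a copy of the environment with STEAM_RUNTIME_SNIPER set.

    library_root is the Steam library that contains steamapps/ (an
    assumption about where the compatibility tool is installed).
    """
    env = dict(os.environ if base_env is None else base_env)
    runtime = os.path.join(
        library_root, "steamapps", "common", "SteamLinuxRuntime_sniper"
    )
    # The variable must hold a full, absolute path.
    env["STEAM_RUNTIME_SNIPER"] = os.path.abspath(os.path.expanduser(runtime))
    return env


example_env = sniper_env("/games/SteamLibrary", base_env={})
print(example_env["STEAM_RUNTIME_SNIPER"])
```

The resulting dictionary can then be passed as env= to subprocess.run() when launching Steam. How Steam is launched on a given system (for example through a wrapper such as steam-run) is outside what this sketch assumes.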
I've switched to the runtime sniper beta and verified its files, and things continued to work as they did with your snapshot.
The version of SLR that is used for steamwebhelper has a different update schedule and is not affected by this change
This change has been in the Steam client beta branch for a while, and finally reached the general-availability branch last week with the September 11th update (the same one that introduced Steam Families). I think this issue can be closed now.
[For reference: the important factor is that the version of pressure-vessel that gets unpacked from steam-runtime-sniper.tar.xz is 0.20240806.0 or later.]
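Checking a version string against that threshold is a plain numeric comparison of the dotted components. The Python sketch below assumes purely numeric, dot-separated versions such as 0.20240806.0. The parse_version() and has_fix() helpers are hypothetical, and how the version string is obtained is left to the reader.

```python
def parse_version(text):
    """Split a dotted version like '0.20240806.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# First version stated in this thread to include the fix.
FIXED_IN = parse_version("0.20240806.0")


def has_fix(version):
    """True if the given pressure-vessel version is 0.20240806.0 or later."""
    return parse_version(version) >= FIXED_IN


for candidate in ("0.20240805.0", "0.20240806.0", "0.20240916.1"):
    print(candidate, has_fix(candidate))
```

Comparing tuples of integers rather than raw strings avoids surprises if a component ever gains or loses a digit.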
Your system information
steamapps/common/SteamLinuxRuntime/VERSIONS.txt
steamapps/common/SteamLinuxRuntime_soldier/VERSIONS.txt
steamapps/common/SteamLinuxRuntime_sniper/VERSIONS.txt
Please describe your issue in as much detail as possible:
NixOS is a bit of a special case: it relies heavily on symlinks and does not really adhere to the FHS. We run Steam inside a bubblewrap sandbox (we call it the FHS env) through a wrapper called steam-run, which can be used to execute any command inside that FHS env. The libraries we want to load are symlinked through multiple layers, e.g.
Shared objects which I assume are written into RPATH are detected and found correctly by pressure-vessel.
But libGL.so.1 is somehow not found correctly. I think this started recently, but I am not sure. I think pressure-vessel assumes that the symlink it finds inside steam-run is already in its own container, while it is actually in steam-run's container. I found a piece of code that fits my assumption well enough for me to believe it: https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/pressure-vessel/exports.c#L37-61 I think that for NixOS the comment there does not hold.
I think the most important line from those is
pressure-vessel-wrap[802533]: D: overrides/lib/x86_64-linux-gnu/libGL.so.1 points to container-side path /lib/libGL.so.1
which is wrong (it is on the host). As a result, the path is not copied correctly and a dangling symlink is created inside pressure-vessel. Inside the steam-run FHS env the file is located at that location:
Steps for reproducing this issue:
./result/bin/steam-run ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/run --verbose -- ldconfig -p |& rg libGL.so.1
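In addition to the ldconfig check, a dangling symlink like the one described in this report can be detected directly. The following Python sketch walks a directory and lists symlinks whose targets do not resolve. The dangling_symlinks() helper is hypothetical and the demonstration uses a throwaway temporary directory. To be meaningful for this issue it would have to be run inside the container, for example against the overrides directory that appears in the ldconfig output.

```python
import os
import tempfile


def dangling_symlinks(root):
    """Return sorted paths of symlinks below root whose targets do not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


with tempfile.TemporaryDirectory() as root:
    # A healthy link: libreal.so.1 -> libreal.so.1.0 (a file that exists).
    open(os.path.join(root, "libreal.so.1.0"), "w").close()
    os.symlink("libreal.so.1.0", os.path.join(root, "libreal.so.1"))
    # A dangling link, standing in for the broken libGL.so.1 override.
    os.symlink("/nonexistent/libGL.so.1", os.path.join(root, "libGL.so.1"))
    found = [os.path.basename(p) for p in dangling_symlinks(root)]

print(found)
```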