ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

"Steamwebhelper is not responding" crash menu with home folder on NFS #10431

Open glabifrons opened 10 months ago

glabifrons commented 10 months ago

Your system information

Please describe your issue in as much detail as possible:

Expected: Steam launches as normal.

Result: "Steamwebhelper is not responding" crash dialog appears.

Steps for reproducing this issue:

  1. Switch to Steam Beta
  2. Restart Steam
  3. Observe crash dialog options either cause it to exit completely or don't appear to do anything (dialog goes away with Steam still running in the background in those cases).

Details: This started several days back. By my logs, it looks like the last time I successfully launched Steam's beta was on 2024-01-19. I went through issue #10412 which had the same initial dialog, but ruled out the same root cause.

  1. Executing "./run-in-sniper vkcube" from the steam-runtime-sniper directory worked and displayed a spinning cube.
  2. Even though the above appeared to confirm the sniper installation, I verified Steam was not running (ps -ef | grep -i steam) then renamed the steam-runtime-sniper directory to force it to be re-extracted per instructions in the other ticket. Restarting Steam at this point still resulted in the crash dialog.
  3. I switched back to the production Steam using "steam -clearbeta" and Steam came up properly at the next launch.
  4. Switching back to Public Beta again resulted in the same crash (this was repeated multiple times).

While writing this up, I noticed another issue (#10417) that indicated some people were having better luck upgrading to NVidia driver 545 from 535 (which I was using). I upgraded to 545 using Ubuntu's packages and tried switching back to Steam's beta after the upgrade (and reboot) with the same results reported above. To be absolutely sure I followed each tip in #10412, I even removed steam-runtime-sniper before switching from release to beta on the last attempt. No change in symptoms.

Observation: On a couple attempts, I noticed that Steam was going through the various Proton installations (one by one) and running .local/share/Steam/ubuntu12_32/../bin/d3ddriverquery64.exe even after I selected the exit option from the dialog. I left this running to completion hoping that would solve the issue (figuring that maybe it's an incomplete driver installation within Proton or something similar), but this appeared to make no difference.

Other: I doubt this matters, but it was related to two Steam bugs in the past so I will note it here: My home directory is mounted via NFS with the Solaris server's backing filesystem being ZFS. Several years back I had to create a 2TB quota on my steam installation share to work around #4982. The other issue (with using flock on NFS) has since been resolved (I no longer use the workaround). These are the only things I would consider odd or unusual about my installation.

smcv commented 10 months ago

I agree that this is not the same problem as #10412, despite the superficially similar symptoms. In #10412, we don't get as far as the container runtime starting. In this issue, the container starts up fine and hands over control to steamwebhelper, but the steamwebhelper is crashing.

This in steamwebhelper.log looks bad, and perhaps distinctive:

[0125/020544.602831:ERROR:nss_util.cc(357)] After loading Root Certs, loaded==false: NSS error code: -8018

-8018 seems to be SEC_ERROR_UNKNOWN_PKCS11_ERROR. Is there anything unusually set up on your system involving PKCS11 or certificates, perhaps?
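For reference, NSS defines its security error codes as offsets from SEC_ERROR_BASE (-0x2000, i.e. -8192), so the number in the log line can be converted to an offset and looked up in NSS's secerr.h. The following is only a sketch: the function name nss_offset is made up for illustration, and it assumes the log line format shown above.

```shell
# Hypothetical helper: pull the NSS error code out of a Chromium-style log
# line and print it together with its offset from SEC_ERROR_BASE.
nss_offset() {
    local line="$1" code
    code=$(printf '%s\n' "$line" |
        sed -n 's/.*NSS error code: \(-\{0,1\}[0-9][0-9]*\).*/\1/p')
    # No code found in the line: report failure.
    [ -n "$code" ] || return 1
    # SEC_ERROR_BASE is -0x2000 (-8192), so offset = code + 8192.
    printf '%s %s\n' "$code" "$(( code + 8192 ))"
}
```

Run against the log line quoted above, this prints the code and an offset of 174, which should be the entry to look for in secerr.h.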

My home directory is mounted via NFS ... The other issue (with using flock on NFS)

I should mention here that one possible route towards solving #10412 is to use the flock(1) utility to take out a lock on a file or directory in ~/.steam/root/ubuntu12_64, to prevent concurrent access - so perhaps preemptively check that you can do that.
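One way to run that check, as a sketch only: it assumes the flock(1) utility from util-linux is installed, and the helper name can_flock is made up for illustration.

```shell
# Hypothetical helper: succeed if an exclusive, non-blocking flock(1) lock
# can be taken on the given file or directory.
can_flock() {
    # "true" is the command run while the lock is held; it exits immediately.
    flock --exclusive --nonblock "$1" true
}
```

With Steam not running, `can_flock ~/.steam/root/ubuntu12_64 && echo "flock OK" || echo "flock failed"` should then say whether the lock can be taken on that directory.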

I would recommend putting the Steam installation (usually ~/.local/share/Steam) on a local filesystem that is well-supported by Linux (ext4, xfs, btrfs, that sort of thing), and doing the same for the Steam library that contains your compatibility tools (all versions of Proton and Steam Linux Runtime) if different, even if you also have a secondary Steam library on NFS. Some of the things that the container runtime framework needs to do are metadata operations that have really bad performance on remote filesystems.
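To see which filesystem a given Steam directory is actually on, something like the following could be used. It is a sketch that assumes GNU df (for --output); the helper names and the list of network filesystem types are illustrative, not exhaustive.

```shell
# Hypothetical helper: print the filesystem type backing a path (GNU df).
fs_type_of() {
    df --output=fstype "$1" | tail -n 1
}

# Hypothetical helper: succeed if the given type name looks like a
# network filesystem. The list is an assumption and far from complete.
is_network_fs() {
    case "$1" in
        nfs|nfs4|cifs|smb3|smbfs|fuse.sshfs|9p) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_network_fs "$(fs_type_of ~/.local/share/Steam)" && echo "remote filesystem"` flags an installation on NFS; the same check can be repeated for each Steam library directory.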

davispuh commented 10 months ago

Steam Beta is borked. I can reproduce this crash on Arch Linux with a clean Steam install (removed ~/.steam and ~/.local/share/Steam) after enabling the Beta:

XRRGetOutputInfo Workaround: initialized with override: 0 real: 0xe3432dc0
XRRGetCrtcInfo Workaround: initialized with override: 0 real: 0xe3431500
steamwebhelper.sh[119628]: === piektdiena, 2024. gada 26. janvāris, 22:46:21 EET ===
steamwebhelper.sh[119628]: Starting steamwebhelper under bootstrap sniper steam runtime at ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
Steam Runtime Launch Service: starting steam-runtime-launcher-service
Steam Runtime Launch Service: steam-runtime-launcher-service is running pid 119719
bus_name=com.steampowered.PressureVessel.LaunchAlongsideSteam
CAppInfoCacheReadFromDiskThread took 515 milliseconds to initialize
steamwebhelper.sh[119917]: === piektdiena, 2024. gada 26. janvāris, 22:46:31 EET ===
steamwebhelper.sh[119917]: Starting steamwebhelper under bootstrap sniper steam runtime at ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
steamwebhelper.sh[120126]: === piektdiena, 2024. gada 26. janvāris, 22:46:42 EET ===
steamwebhelper.sh[120126]: Starting steamwebhelper under bootstrap sniper steam runtime at ~/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
src/steamUI/steamuisharedjscontroller.cpp (545) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (545) : Failed creating offscreen shared JS context
01/26 22:46:45 Init: Installing breakpad exception handler for appid(steam)/version(1706155871)/tid(119545)
assert_20240126224645_30.dmp[120310]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240126224645_30.dmp
assert_20240126224645_30.dmp[120310]: Finished uploading minidump (out-of-process): success = yes
assert_20240126224645_30.dmp[120310]: response: CrashID=bp-1e48b135-8144-4efd-a813-f1b892240126
assert_20240126224645_30.dmp[120310]: file ''/tmp/dumps/assert_20240126224645_30.dmp'', upload yes: ''CrashID=bp-1e48b135-8144-4efd-a813-f1b892240126''
[2024-01-26 22:46:54] Shutdown

Current workaround is to switch back to non-Beta:

$ rm -f ~/.local/share/Steam/package/beta

glabifrons commented 10 months ago

@smcv I'm not sure what to look for with regard to the certificates. I don't believe anything is non-standard there. Pointers as to what to check would be appreciated. The machine is my daily-driver and I've not seen anything else fail relating to certs/TLS/etc.

As to the flock issue: The core problem is NFS doesn't support flock(); that only works for local filesystems. The previously mentioned workaround was a tiny C program someone contributed to #5788 called "fakeflock.c". I just tried that same workaround with this beta, and it did not change the ultimate outcome. However, I did notice a dialog with a moving progress bar that I didn't see yesterday. In the other ticket you mentioned that there is likely going to be another update to re-add it. I'm guessing that code has been updated and it's not due to my preloading the libfakeflock library.

I'd rather not move everything to a local filesystem as I have the NFS server for convenient backups and rolling snapshots (every 15 minutes) for the entire family (thankfully, they're not on the beta). Snapshot rollbacks have saved each of us from various catastrophes numerous times. I've been using it this way for well over a decade and it's worked very well for us so far. If NFS is the issue, we're likely not the only ones that will be impacted by this due to the popularity of various DIY/home NAS solutions out there.

In an attempt to test it out, I unmounted, unshared, and relocated my steam filesystem on the server to prevent automatic mounting, then created a Steam subdirectory on a local filesystem with enough space, linked ~/.local/share/Steam back to this location, removed ~/.steam*, then relaunched Steam to force a full re-install. I then logged into Steam and switched over to the beta, at which point it segfaulted (I grabbed the logs, below). I then restarted Steam again and the beta came right up. So it seems we've managed to narrow it down to NFS. steam-logs-local.tar.gz

Crash text:

crash_20240126194512_39.dmp[451629]: Uploading dump (out-of-process)
/tmp/dumps/crash_20240126194512_39.dmp
crash_20240126194512_39.dmp[451629]: Finished uploading minidump (out-of-process): success = yes
crash_20240126194512_39.dmp[451629]: response: CrashID=bp-9cc63e34-48bf-4e6b-bbf3-a3ec02240126
crash_20240126194512_39.dmp[451629]: file ''/tmp/dumps/crash_20240126194512_39.dmp'', upload yes: ''CrashID=bp-9cc63e34-48bf-4e6b-bbf3-a3ec02240126''
/var/cache/fscache/Steam/steam.sh: line 798: 450752 Segmentation fault      (core dumped) "$STEAMROOT/$STEAMEXEPATH" "$@"

glabifrons commented 10 months ago

@davispuh Is your home directory (or Steam installation path) mounted via NFS? If not, what filesystem are you using? What mount options? I think we're making progress on narrowing this down.

davispuh commented 10 months ago

@davispuh Is your home directory (or Steam installation path) mounted via NFS? If not, what filesystem are you using? What mount options? I think we're making progress on narrowing this down.

Well, it's a bit complicated :joy: My home directory is not using NFS but local btrfs but I do have it exported with NFS and I'm also using bind mounts, subvolumes, btrfs raid1 and md raid6.

Here's a summary of the mounts:

/dev/nvme1n1p2 on / type btrfs (rw,noatime,ssd,discard=async,space_cache=v2,subvolid=256,subvol=/Arch)
/dev/sdp       on /home type btrfs (rw,noatime,compress=lzo,space_cache=v2,subvolid=593,subvol=/home,x-systemd.automount)
/dev/md127     on /mnt/Data type btrfs (rw,noatime,compress=zstd:3,space_cache=v2,subvolid=257,subvol=/Data,x-systemd.automount,x-systemd.mount-timeout=10m)
/dev/sdp       on /mnt/RAID type btrfs (rw,noatime,compress=lzo,space_cache=v2,subvolid=395,subvol=/RAID,x-systemd.automount,x-systemd.mount-timeout=10m)
/dev/sdp       on /srv/nfs/RAID type btrfs (rw,noatime,compress=lzo,space_cache=v2,subvolid=395,subvol=/RAID,x-systemd.automount)

That last /dev/sdp is a bind mount; fstab looks like this:

UUID=ee7e665c-3de5-43e3-80b8-d312bdf58dae  /  btrfs  rw,noatime,noautodefrag,ssd,space_cache=v2,subvol=Arch  0  0
UUID=cf489774-f2f9-4d80-9cb7-08ebad25bfb3  /home  btrfs  rw,noatime,noautodefrag,space_cache=v2,compress=lzo,subvol=home,noauto,x-systemd.automount  0  0
UUID=cf489774-f2f9-4d80-9cb7-08ebad25bfb3  /mnt/RAID  btrfs  rw,noatime,noautodefrag,space_cache=v2,compress=lzo,subvol=RAID,noauto,x-systemd.automount,x-systemd.device-timeout=10m,x-systemd.mount-timeout=10m  0  0
UUID=502744d5-e441-47d1-ab41-bcf2eb800e2f  /mnt/Data  btrfs  rw,noatime,space_cache=v2,compress=zstd,subvol=Data,noauto,x-systemd.automount,x-systemd.device-timeout=10m,x-systemd.mount-timeout=10m  0  0
/mnt/RAID                                  /srv/nfs/RAID  none   bind,noauto,x-systemd.automount  0  0

And exports

/srv/nfs/RAID       172.16.0.0/25(rw,fsid=100,insecure,no_subtree_check,anonuid=20000,anongid=20000) 172.24.0.0/25(rw,fsid=100,insecure,no_subtree_check,anonuid=20000,anongid=20000)
/home/Dāvis         172.16.0.0/28(rw,fsid=101,insecure,no_subtree_check,anonuid=20000,anongid=20000) 172.24.0.0/28(rw,fsid=101,insecure,no_subtree_check,anonuid=20000,anongid=20000)

PS. These are just relevant excerpts

glabifrons commented 10 months ago

@davispuh Unfortunately, I know very little about btrfs, so I don't know where the overlap between its capabilities and NFSv4 would be.

@smcv I forgot to mention that I'm running NFSv4, which has different locking mechanisms than NFSv3 (neither of which natively supports flock). I just tried flock_to_setlk from #5788 by @DataBeaver (compiled into both 32 bit and 64 bit versions, both preloaded via LD_PRELOAD) with no luck. While flock is triggered many times, this workaround (which is more functional than the fakeflock one) did not solve the problem. Further down in that issue discussion, @eqvinox has an excellent description of the limitations of NFSv4 and which calls to use.

DataBeaver commented 10 months ago

I did some testing due to getting pinged. After switching to the beta, at first the Steam UI didn't show up at all and I also didn't get this "not responding" dialog. This reproduced a couple of times. I then tried running Steam on a local drive (ext4 filesystem), which worked. After that running on the NFS mount worked too. To further confirm the functionality I tried it on another computer, with the same NFS mount. There I got the "not responding" dialog, but the Steam UI showed up and worked as well. After choosing to restart either steam or just steamwebhelper the dialog did not reappear. I have to get on with other things, but maybe I'll try a clean install on NFS later.

Both computers are running Debian unstable and have Nvidia GPUs with driver version 525.147.05.

Edit: I ran the local Steam installation by changing HOME to point at a local mount. It's not impossible that it could have affected the installation on NFS, though it definitely did run from the local drive.

glabifrons commented 10 months ago

Following DataBeaver's lead, I redid my fresh local install per my above description (using a link). I then copied (via tar to preserve links) the installation back to NFS to see if it was only the installer with the issue, or if it was a post-installation problem. I launched Steam and ended up with the same dialog, so I believe the installer is ruled out.

However, one thing I noticed was a large number of errors when running a diff between the local and NFS copies. I believe all of these were dead links.

~/.steam$ find root/ -xtype l -exec test ! -e {} \; -ls | wc -l
411

The confusing part is that this is true of both installations, so I'm not even sure how sniper can run in the local installation if this is causal in the NFS installation (so this may very well be a red herring).

Digging into them in more detail, most appear to be dead links to the /run/ hierarchy, so can be ignored.

~/.steam$ find root/ -xtype l -exec test ! -e {} \; -ls  | grep -v ' /run/' | wc -l
101

Filtering down to absolute path links, I see the certificate issue you mentioned. It appears to be looking for different filenames than I have on my system, as I see a mixture of near-misses (eg: different numbers) and completely missing ones (eg: Staat*). From my below findings, this is likely not relevant, but I'm leaving this here for context.

Next I see that it's linking to font configs that don't exist in /usr/share/fontconfig/conf.avail/ (I have 29 total in that directory, perhaps I'm missing a package?). Again, this is likely not relevant, given the below.

Digging into a few of the oddballs at the end of the list, it's become apparent to me that this runs in some sort of chrooted environment with multiple filesystem overlays or something along those lines (sorry, I'm unfamiliar with exactly what the steam-runtime-* installations do), as I'm finding the absolute paths stuffed under various different hierarchies in the Steam installation. Examples:

/etc/python3.9/sitecustomize.py is found under ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240125.75305/files/
/usr/lib/i386-linux-gnu/libXaw.so.7 is found under ubuntu12_32/steam-runtime/

Extrapolating this back to the certs, it looks like they are actually there too, but in yet another subdirectory. For example, /usr/share/ca-certificates/mozilla/Staat_der_Nederlanden_EV_Root_CA.crt is in ubuntu12_64/steam-runtime-sniper/var/tmp-74JLI2/

Backing up and filtering for relative paths, I find a few additional causes. Some are simply missing targets (eg: 4 occurrences of an selinux link that point to a non-existing entry in its parent) where the file doesn't exist in the entire installation. The certs seem to be pointing to the wrong directory. For example the links in the /usr/etc/ssl/certs/ directory point to the current directory, while the files are actually found in /etc/ssl/certs/ instead (/etc instead of /usr/etc, but both the link and extant file paths are under the ubuntu12_64/steam-runtime-sniper/var/tmp-74JLI2 path). The os-release link is specified one level too deep (would resolve to */usr/usr/lib/os-release). This appears to also be true of all of the dead library links.

Here's the list of dead links with the /run/ links stripped out: dead-links.txt
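The grouping described above (links into /run/, other absolute targets, relative targets) can be reproduced in one pass. This is a sketch that assumes GNU find (for -xtype and -printf); the function name classify_dangling is made up for illustration.

```shell
# Hypothetical helper: count dangling symlinks under a directory, grouped by
# where their (unresolved) target points.
classify_dangling() {
    # -xtype l matches symlinks whose target does not resolve;
    # %l prints the link target exactly as stored in the symlink.
    find "$1" -xtype l -printf '%l\n' | awk '
        /^\/run\// { run++; next }
        /^\//      { abs++; next }
                   { rel++ }
        END { printf "run=%d absolute=%d relative=%d\n", run, abs, rel }'
}
```

Running `classify_dangling ~/.steam/root/` prints one summary line, which makes it easy to compare the local and NFS copies.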

smcv commented 10 months ago

I can't tell at this stage whether the problem that @glabifrons is having is to do with NFS or not, so this might all be a red herring. But, we are going to need this sooner or later, so...

The core problem is NFS doesn't support flock()

Does it support POSIX process-associated record locks (fcntl F_SETLKW) and/or Linux open-file-description locks (fcntl F_OFD_SETLKW)?

We are going to need to put some sort of locking into place, otherwise we get bizarre failure modes like one process deleting a temporary runtime that another process is still using. Sorry, but avoiding that is more important than supporting NFS. If flock(1) and flock(2) are unavailable, a different locking mechanism is a possibility, but having no locking at all is not really an option.

At the moment, the container runtime tries to use the Linux-specific fcntl F_OFD_SETLKW, falling back to POSIX fcntl F_SETLKW on ancient kernels. You could test this: with Steam not running,

adverb="$HOME/.steam/root/ubuntu12_64/steam-runtime-sniper/pressure-vessel/bin/pressure-vessel-adverb"
ref="$HOME/.steam/root/ubuntu12_64/steam-runtime-sniper/.ref"

"$adverb" --write --wait --lock-file="$ref" -- sleep 600 &
"$adverb" --write --lock-file="$ref" -- true
"$adverb" --write --wait --lock-file="$ref" -- true

(If necessary, copy the whole steam-runtime-sniper directory into a temporary location on NFS, and adjust the paths accordingly.)

The "$adverb" --write --lock-file="$ref" -- true command (without --wait) should fail with error message "E: Unable to lock ... for writing: file is busy".

The last command (with --wait) should block, with no output, until you kill the sleep process (or wait 10 minutes for it to exit on its own), at which point the last command should exit successfully.
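To check the "should block" part without watching it by hand, a small wrapper around timeout(1) can be used. This is a sketch that assumes GNU coreutils timeout, which exits with status 124 when it has to kill the command; the function name still_blocked_after is made up for illustration.

```shell
# Hypothetical helper: succeed if the given command is still running after
# the given number of seconds, i.e. it appears to be blocked.
still_blocked_after() {
    local secs="$1" status=0
    shift
    # timeout(1) returns 124 if the command had to be killed at the deadline.
    timeout "$secs" "$@" || status=$?
    [ "$status" -eq 124 ]
}
```

With the background sleep 600 still holding the lock, `still_blocked_after 5 "$adverb" --write --wait --lock-file="$ref" -- true` should succeed. The variant without --wait should make the helper fail straight away, because the command exits with an error instead of blocking.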

The previously mentioned workaround was a tiny C program someone contributed to https://github.com/ValveSoftware/steam-for-linux/issues/5788 called "fakeflock.c"

Disabling locking like this is not a solution. This will lead to concurrent processes all believing that they have the lock at the same time, and overwriting or deleting files that the other concurrent processes were using.

smcv commented 10 months ago

@davispuh:

We do not have enough information on this issue to be able to guess whether the failure mode you are seeing with the beta is the same as @glabifrons is seeing, or the same as #10412, or some different thing. Please look at the logs in ~/.steam/root/logs/, especially steamwebhelper.log and webhelper.txt.

If we try to handle multiple different problems on the same issue number, it quickly becomes really confusing, which makes it take longer to solve any of the problems that were reported; so we should reserve this issue number for the specific problem that @glabifrons is experiencing (which unfortunately we have not yet been able to identify). If we can identify that something different is going wrong for you, please open a separate issue for that, with a title that is as specific as possible.

My home directory is not using NFS but local btrfs but I do have it exported with NFS and I'm also using bind mounts, subvolumes, btrfs raid1 and md raid6.

I don't know whether any of these will interfere with the container runtime. My first guess would be that RAID shouldn't matter, because that's at a lower level than anything we're doing, but the others might. If you can try launching Steam on the same system but from a home directory that is as "ordinary and boring" as possible (perhaps by creating a temporary user whose home directory is on local disk and is not NFS-exported, and logging in as that user) then that will help to narrow down whether any of these less-usual configurations are involved.

smcv commented 10 months ago

@DataBeaver:

There I got the "not responding" dialog, but the Steam UI showed up and worked as well

Unfortunately, I think this is normal if it takes an unusually long time for the steamwebhelper to start. Steam cannot currently distinguish between multiple different reasons why the steamwebhelper might fail to start, and it also cannot currently distinguish between "it's taking a long time, but might still work" and "it will never work, however long we wait".

If your NFS mount has enough latency to make small metadata operations like link(2) and chmod(2) unexpectedly slow, then it's going to take a while to start. We've seen this before with Steam libraries on other network filesystems like SMB.
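A rough way to compare metadata latency between two directories (for example one on NFS and one on local disk) is a loop of small create, link, chmod and delete operations. This is only a sketch: the function name metadata_bench is made up, it assumes GNU date for nanosecond timestamps, and the result includes process start-up cost, so only the difference between two runs is meaningful.

```shell
# Hypothetical micro-benchmark: create, hard-link, chmod and delete COUNT
# small files in DIR, then print the elapsed time in milliseconds.
metadata_bench() {
    local dir="$1" count="${2:-200}" work start end i=0
    work=$(mktemp -d "$dir/mdbench.XXXXXX") || return 1
    start=$(date +%s%N)
    while [ "$i" -lt "$count" ]; do
        : > "$work/f$i"                   # create an empty file
        ln "$work/f$i" "$work/l$i"        # hard link
        chmod 0644 "$work/l$i"            # change mode
        rm -f "$work/f$i" "$work/l$i"     # unlink both names
        i=$((i + 1))
    done
    end=$(date +%s%N)
    rmdir "$work"
    echo $(( (end - start) / 1000000 ))
}
```

For example, compare `metadata_bench ~/.local/share/Steam` with `metadata_bench /tmp`.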

smcv commented 10 months ago

Back to @glabifrons:

it's become apparent to me that this runs in some sort of chrooted environment with multiple filesystem overlays or something along those lines

Yes, it's the Steam container runtime, which has quite a lot of code in common with Flatpak. It's normal that some of the files below steam-runtime-sniper/ are symbolic links to filenames that don't exist on your host system. As long as those symlinks work correctly inside the container, everything is fine.

Looking at your list of dangling symlinks, the majority of them are very likely to work as intended inside the container. I do notice one bug, but it's a bug that will only affect developers who are running this stuff in a non-default configuration that isn't relevant to end-user systems.

If you are copying Steam installations between filesystems, you can delete all of steam-runtime-sniper/var/ instead of copying it: the subdirectories in there are temporary, and are deleted and re-created automatically. In fact, you could even delete all of steam-runtime-sniper/, because Steam will automatically unpack it from steam-runtime-sniper.tar.xz.
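As a sketch of such a copy, tar can skip the temporary directory while still preserving symbolic links. This assumes GNU tar, whose exclude patterns by default match after any "/" in the member name; the function name copy_steam_without_var is made up for illustration.

```shell
# Hypothetical helper: copy a Steam installation from SRC to DST with tar,
# preserving symlinks but skipping the temporary steam-runtime-sniper/var/.
copy_steam_without_var() {
    local src="$1" dst="$2"
    mkdir -p "$dst" || return 1
    tar -C "$src" --exclude='steam-runtime-sniper/var' -cf - . |
        tar -C "$dst" -xf -
}
```

Run it only while Steam is not running, so that nothing in the source is changing during the copy.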

You can verify that steam-runtime-sniper/ has the expected contents by running:

~/.steam/root/ubuntu12_64/steam-runtime-sniper/pressure-vessel/bin/pv-verify

This checks both metadata and content of all of the files in there, so expect it to take up to 30 seconds on HDD, and perhaps longer on NFS.

You can also get an interactive shell inside the container by running:

~/.steam/root/ubuntu12_64/steam-runtime-sniper/run -- xterm

Inside that xterm, you should find that all the symbolic links in /etc/ssl/certs are working (they have a valid target, and if you use ls --color they will typically appear in cyan rather than red).

It would be useful for me to see a detailed log from the container runtime framework, which you can get by running:

STEAM_LINUX_RUNTIME_LOG=1 \
STEAM_LINUX_RUNTIME_VERBOSE=1 \
~/.steam/root/ubuntu12_64/steam-runtime-sniper/run -- xterm

(You can just exit from the xterm when it has opened, or run a simpler command like true.)

The log file will appear in steam-runtime-sniper/var/, with a symbolic link slr-latest.log that points to it.
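To find the file that slr-latest.log currently points at (for attaching to an issue, say), the link can be resolved with readlink. This is a sketch; the function name latest_slr_log is made up, and it assumes GNU readlink for the -f option.

```shell
# Hypothetical helper: print the real path of the most recent container
# runtime log, given the steam-runtime-sniper directory.
latest_slr_log() {
    readlink -f "$1/var/slr-latest.log"
}
```

For example, `less "$(latest_slr_log ~/.steam/root/ubuntu12_64/steam-runtime-sniper)"` opens the latest log.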

smcv commented 10 months ago

I'd rather not move everything to a local filesystem as I have the NFS server for convenient backups and rolling snapshots

I'm sure that's desirable, but remote filesystems have functionality and performance characteristics that are very much unlike local filesystems, and we can't support every possible scenario.

As currently implemented, the whole steam-runtime-sniper/ directory is actually "expendable": it needs to exist while Steam is running, but it does not contain any user data, so the only thing that is lost when it's deleted (assuming it isn't in active use) is some time. If we can set up some sort of mechanism for redirecting this from its normal location onto a local disk, then that would make it faster and more robust for you, and would also avoid it wasting space and time in your backups.

At the moment the way it's implemented doesn't allow for it to be a symlink or a mount point, but I'll see whether that can become possible in future.

DataBeaver commented 10 months ago

The core problem is NFS doesn't support flock()

Does it support POSIX process-associated record locks (fcntl F_SETLKW) and/or Linux open-file-description locks (fcntl F_OFD_SETLKW)?

A quick look at the relevant manpages tells me that while NFS doesn't support flock natively, Linux can emulate it using fcntl locks, albeit with slightly different semantics. There's no mention of whether it's F_SETLKW or F_OFD_SETLKW, nor does the fcntl manpage say whether those two have any difference over NFS. I assume they work the same, since the only difference is that OFD locks will also block other file descriptors from the same process, and from the NFS protocol's point of view it doesn't matter if it's two different processes on the same client or two fds in the same process.

The "$adverb" --write --lock-file="$ref" -- true command (without --wait) should fail with error message "E: Unable to lock ... for writing: file is busy".

The last command (with --wait) should block, with no output, until you kill the sleep process (or wait 10 minutes for it to exit on its own), at which point the last command should exit successfully.

This works for me on my NFS home directory. As does Steam itself. So I think NFS is at most a contributing factor, not the root cause. It will be interesting to see glabifrons's results for the lock test. Could be that we have some configuration differences.

There I got the "not responding" dialog, but the Steam UI showed up and worked as well

Unfortunately, I think this is normal if it takes an unusually long time for the steamwebhelper to start. Steam cannot currently distinguish between multiple different reasons why the steamwebhelper might fail to start, and it also cannot currently distinguish between "it's taking a long time, but might still work" and "it will never work, however long we wait".

Understandable. It's a relatively minor annoyance, but if you want to do something about it, maybe add an option to wait a bit longer without restarting anything? Or even keep checking for responsiveness while the dialog is up, and hide it if steamwebhelper starts responding after all.

davispuh commented 10 months ago

Hmm, I thought this was the only issue for the steamwebhelper crash in the new Beta, but it looks like there are several issues causing crashes...

My crash is not #10412 because my steam-runtime-sniper is complete without missing files and I also tried

$ rm -rf ~/.steam/root/ubuntu12_64/steam-runtime-sniper

but that didn't change anything and

$ ~/.steam/root/ubuntu12_64/steam-runtime-sniper/run-in-sniper vkcube

works fine without issues.

In logs nothing in particular stands out

$ cat ~/.steam/steam/logs/steamwebhelper.log
steamwebhelper.sh[43847]: Starting steamwebhelper with sniper steam runtime at /mnt/Games/SteamLinux/ubuntu12_64/steam-runtime-sniper
exec ./steamwebhelper --no-sandbox --no-sandbox -lang=en_US -cachedir=/mnt/Games/SteamLinux/config/htmlcache -steampid=42835 -buildid=1706390103 -steamid=xxx -logdir=/mnt/Games/SteamLinux/logs -uimode=7 -startcount=2 -steamuniverse=Public -realm=Global -clientui=/mnt/Games/SteamLinux/clientui -steampath=/mnt/Games/SteamLinux/ubuntu12_32/steam -launcher=0 -no-restart-on-ui-mode-change --enable-media-stream --enable-smooth-scrolling --password-store=basic --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[0129/194030.399588:ERROR:context.cc(100)] The browser_subprocess_path directory (./steamwebhelper) is not an absolute path. Defaulting to empty.
[0129/194030.426813:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0129/194030.426864:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1706389061
[0129/194030.426867:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0129/194030.426870:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0129/194030.426873:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0129/194030.427856:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: browser
[0129/194030.429320:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0129/194030.430817:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0129/194030.457887:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: xxx
[0129/194030.457969:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1706390103
[0129/194030.457973:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0129/194030.457976:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0129/194030.457980:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0129/194030.461011:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: xxx
[0129/194030.461099:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1706390103
[0129/194030.461102:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0129/194030.461105:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0129/194030.461108:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0129/194031.302189:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0129/194031.302222:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
$ cat ~/.steam/steam/logs/webhelper.txt
[...]
[1970-01-01 03:00:00] Client version: no bootstrapper found
[1970-01-01 03:00:00] Startup - webhelper launched with: ./steamwebhelper --no-sandbox -lang=en_US -cachedir=/mnt/Games/SteamLinux/config/htmlcache -steampid=42835 -buildid=1706390103 -steamid=xxx -logdir=/mnt/Games/SteamLinux/logs -uimode=7 -startcount=2 -steamuniverse=Public -realm=Global -clientui=/mnt/Games/SteamLinux/clientui -steampath=/mnt/Games/SteamLinux/ubuntu12_32/steam -launcher=0 -no-restart-on-ui-mode-change --enable-media-stream --enable-smooth-scrolling --password-store=basic --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[1970-01-01 03:00:00] Disabling sandbox due to a previous crash in CefInitialize with the sandbox enabled
[1970-01-01 03:00:00] Browser - launching child process with: /mnt/Games/SteamLinux/ubuntu12_64/steamwebhelper --type=zygote --no-zygote-sandbox --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --crashpad-handler-pid=43969 --buildid=1706390103 --steamid=xxx
[1970-01-01 03:00:00] Browser - launching child process with: /mnt/Games/SteamLinux/ubuntu12_64/steamwebhelper --type=zygote --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --crashpad-handler-pid=43969 --buildid=1706390103 --steamid=xxx

[1970-01-01 03:00:00] Client version: no bootstrapper found
[1970-01-01 03:00:00] Startup - webhelper launched with: ./steamwebhelper --no-sandbox --no-sandbox -lang=en_US -cachedir=/mnt/Games/SteamLinux/config/htmlcache -steampid=42835 -buildid=1706390103 -steamid=xxx -logdir=/mnt/Games/SteamLinux/logs -uimode=7 -startcount=2 -steamuniverse=Public -realm=Global -clientui=/mnt/Games/SteamLinux/clientui -steampath=/mnt/Games/SteamLinux/ubuntu12_32/steam -launcher=0 -no-restart-on-ui-mode-change --enable-media-stream --enable-smooth-scrolling --password-store=basic --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[1970-01-01 03:00:00] Disabling sandbox due to a previous crash in CefInitialize with the sandbox enabled
[1970-01-01 03:00:00] Browser - launching child process with: /mnt/Games/SteamLinux/ubuntu12_64/steamwebhelper --type=zygote --no-zygote-sandbox --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --crashpad-handler-pid=44120 --buildid=1706390103 --steamid=xxx
[1970-01-01 03:00:00] Browser - launching child process with: /mnt/Games/SteamLinux/ubuntu12_64/steamwebhelper --type=zygote --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/mnt/Games/SteamLinux/logs/cef_log.txt --crashpad-handler-pid=44120 --buildid=1706390103 --steamid=xxx
$ cat ~/.steam/root/logs/cef_log.txt
[0129/194027.772949:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: browser
[0129/194027.774154:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0129/194027.775646:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0129/194030.427856:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: browser
[0129/194030.429320:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0129/194030.430817:WARNING:task_impl.cc(32)] No task runner for threadId 0

And here are backtraces but I don't know how to get symbols for it?

#0  0x000070f2f7d3f003 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x8b3f003)
#1  0x000070f2f7d2dc3f n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x8b2dc3f)
#2  0x000070f2f372235c n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x452235c)
#3  0x000070f2f3c811c5 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x4a811c5)
#4  0x000070f2f372269f n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x452269f)
#5  0x000070f2f3724dec n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x4524dec)
#6  0x000070f2f142cc3d n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x222cc3d)
#7  0x000070f2f14a84ac n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x22a84ac)
#8  0x000070f2f4fa63b0 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x5da63b0)
#9  0x000070f2f4fa7670 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x5da7670)
#10 0x000070f2f4fa7454 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x5da7454)
#11 0x000070f2f4fa4c41 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x5da4c41)
#12 0x000070f2f142bc32 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x222bc32)
#13 0x000070f2f142b995 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x222b995)
#14 0x000070f2f13fdac5 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x21fdac5)
#15 0x000070f2f13fd654 n/a (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x21fd654)
#16 0x000070f2f137336b cef_initialize (/mnt/Games/SteamLinux/ubuntu12_64/libcef.so + 0x217336b)
#17 0x00000000005d077a CefInitialize(CefMainArgs const&, CefStructBase<CefSettingsTraits> const&, scoped_refptr<CefApp>, void*) (steamwebhelper + 0x1d077a)
#18 0x0000000000519b1a CCEFThread::Init(int, char**, scoped_refptr<CefApp>) (steamwebhelper + 0x119b1a)
#19 0x0000000000517839 InitializeCef(int, char**, scoped_refptr<CefApp>) (steamwebhelper + 0x117839)
#20 0x0000000000590323 main (steamwebhelper + 0x190323)
#21 0x000070f2ee924cd0 n/a (/run/host/usr/lib/libc.so.6 + 0x27cd0)
#0  0x00007fefe313f003 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x8b3f003)
#1  0x00007fefe312dc3f n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x8b2dc3f)
#2  0x00007fefe03a5494 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x5da5494)
#3  0x00007fefe03a66ff n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x5da66ff)
#4  0x00007fefe03a7428 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x5da7428)
#5  0x00007fefe03a4d46 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x5da4d46)
#6  0x00007fefe03a4e3a n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x5da4e3a)
#7  0x00007fefdc82c391 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x222c391)
#8  0x00007fefdc7fd517 n/a (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x21fd517)
#9  0x00007fefdc773260 cef_execute_process (/mnt/Game/SteamLinux/ubuntu12_64/libcef.so + 0x2173260)
#10 0x00000000005d069f CefExecuteProcess(CefMainArgs const&, scoped_refptr<CefApp>, void*) (steamwebhelper + 0x1d069f)
#11 0x000000000058fc7e RealMain(int, char**) (steamwebhelper + 0x18fc7e)
#12 0x0000000000590e38 main (steamwebhelper + 0x190e38)
#13 0x00007fefd9d24cd0 n/a (/run/host/usr/lib/libc.so.6 + 0x27cd0)
smcv commented 10 months ago

Hmm I thought this is only issue for steamwebhelper crash due to new Beta but looks like there are several issues causing crashes...

The steamwebhelper has had several major changes in the new beta. Some issues caused by those changes (like #10412) are to do with the fact that it is now running inside a container runtime, like Counterstrike 2 and Dota 2 do. Others could be to do with changes inside the steamwebhelper itself.

At the moment, unfortunately I don't see enough information here to be able to say whether you are experiencing the same crash as the original reporter of this particular issue or not.

$ cat ~/.steam/steam/logs/webhelper.txt

In the beta that was active over the weekend, the log was truncated every time the steamwebhelper restarted, which was unhelpful because it meant that previous error messages could be lost. Please try updating Steam to the current beta 1706390103, which has stopped truncating the log every time, so should get you better logs.

You might need to do this by swapping to the stable branch (completely exit from Steam, rm ~/.steam/root/package/beta, start it again), and then back to the beta.

And here are backtraces but I don't know how to get symbols for it?

The general public cannot get debug symbols for the proprietary parts of Steam (and neither can I), but Valve can. In your original report, you quoted log output that said Steam has uploaded a crash dump, CrashID=bp-1e48b135-8144-4efd-a813-f1b892240126. That could be useful information to a Valve developer. If you can get a similar CrashID from the current beta, that will also help.

davispuh commented 10 months ago

All information in my comment is from the latest Beta version (installed today); there you can see steamwebhelper was run with the -buildid=1706390103 parameter. I also deleted the logs folder before the run, so there isn't any old info.

Looks like it crashes very early in startup, as there aren't any other log entries after Browser - launching child process with.

In non-Beta version I see

[2024-01-29 21:16:20] CreateBrowser 512367304 type:12 flags:0  (-2147483648, -2147483648) 0x0

But this is not present in the Beta; we never reach it.

The crash seems to be inside libcef.so, the Chromium Embedded Framework (CEF), which is open source. So even with Valve's modifications it might be possible to match the offsets up with the relevant functions with some reverse engineering, but that's not our job.
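As a rough illustration of what "matching it up" would start from, here is a minimal sketch that pulls the module-relative offsets out of backtrace lines shaped like the ones quoted earlier in this thread. The function names (`parse_frames`, `shared_offsets`) are made up for this example, and the regular expression only assumes the frame layout seen above; it does not resolve any symbols.

```python
import re

# Matches frames shaped like the ones quoted in this thread, e.g.
#   #16 0x000070f2f137336b cef_initialize (/path/libcef.so + 0x217336b)
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+0x(?P<addr>[0-9a-fA-F]+)\s+"
    r"(?P<symbol>.+?)\s+\((?P<module>\S+) \+ 0x(?P<offset>[0-9a-fA-F]+)\)$"
)


def parse_frames(text):
    """Return (index, symbol, module basename, offset) for each frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # not a backtrace frame
        module = match["module"].rsplit("/", 1)[-1]
        symbol = None if match["symbol"] == "n/a" else match["symbol"]
        frames.append(
            (int(match["index"]), symbol, module, int(match["offset"], 16))
        )
    return frames


def shared_offsets(trace_a, trace_b):
    """Return the (module, offset) pairs that occur in both traces."""
    in_a = {(module, offset) for _, _, module, offset in parse_frames(trace_a)}
    in_b = {(module, offset) for _, _, module, offset in parse_frames(trace_b)}
    return in_a & in_b
```

Run over the two backtraces above, this kind of comparison would show that both start with the same two libcef.so offsets (0x8b3f003 and 0x8b2dc3f), even though the absolute addresses differ between the processes.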

So they need to look into this. My latest CrashID=bp-5aaf6174-a286-49b8-a3b3-eb2fe2240129

glabifrons commented 10 months ago

@smcv I tried the adverb commands both on the local filesystem and on the NFS installation, and surprisingly it worked for both. So it looks like it's not a locking issue.

Just in case the information is useful though, this post has the best description of the limitations of the NFSv4 calls (IIRC, there's one for read and one for write, but none for both): https://github.com/ValveSoftware/steam-for-linux/issues/5788#issuecomment-511219089 Two posts below that, @eqvinox has a short explanation of how the Linux kernel translates the calls (so it does add that, but they don't look like the ones you asked about).

Another thing you mentioned is compatibility for old kernels. I may actually have the opposite problem, as I'm running the HWE kernel: 6.5.0-15-generic. I wonder if there might be negative interaction with the newer kernels. @davispuh what kernel are you using?

I ran pv-verify in both installations: on my NVMe drive it took 2.8s, and on NFS it took 5.5s. Both came back verified on both checks it performs.

I tried your other command to launch an xterm within sniper and verified all symlinks look good.

I tried to generate a log for you as requested, but no logs ever appear in either the var under steam on NFS or the local installation. Only the .ref file and a temp subdirectory (tmp-$randomchars) in my local installation and several of those subdirs in my NFS one. I updated the beta to the latest (released on the 26th... how do you identify the version number?) and the results were the same - no log. I tried exporting the variables (thinking maybe there was another subshell being triggered), same results, no log in steam-runtime-sniper/var. Since I couldn't generate those logs (in local or NFS installations), I re-collected the main logs as initially requested after the above mentioned upgrade to the latest beta. steam-logs-jan26beta.tar.gz

I like your thoughts on relocation. Sniper is well under 1GB, so /tmp or /var/tmp would likely work for most (and /tmp would make the most sense). It could be a manual process with links done by experienced users, or maybe even something from a path entry dialog down the road labeled something like "Copy temporary runtime to local filesystem" available only if the system detects that it's running in a remote filesystem.

I do have one thought that I hope is a stupid question: Steam doesn't attempt to launch anything as root, does it? I ask because if it does so in my home directory, it will fail, as root is "squashed" on NFS by default (it's a security precaution). Any actions by root on a client show up as user "nobody" from the server's perspective, so they will fail on any path that isn't world-readable/executable (at a minimum) all the way up.

One other thing... you mentioned you don't have access to debug symbols, but Valve can... with as much effort as you're putting into this, and as knowledgeable as you are about the inner workings and even the development direction, I thought you were an employee of Valve!

glabifrons commented 10 months ago

I was thinking about NFS and root-squash and the overlays and I think I figured out at least part of what's going on. From within the sniper shell (the xterm launched per your instructions), all device files (the entire output of ls -l /dev/) are shown as owned by nobody:nogroup.

I don't pretend to understand what type of container sniper is using or how it's overlaying filesystems, but I find this really strange, as up until the 20th not only did it work, but many of the games I play use Proton (I recall you saying in the other issue thread that sniper is related), and I even use Proton Experimental for some of them, like Space Engineers.

smcv commented 10 months ago

It's still difficult to tell from the information available, but the best diagnosis I can make so far is that @glabifrons might be seeing a steamwebhelper crash that is not directly related to the introduction of the container runtime, similar to @davispuh. If that's the case, then a Steam developer will have to take over investigating this.

@glabifrons, if you can find a CrashID in recent Steam output (it will look similar to the one @davispuh provided) then that would probably be useful information for a Steam developer. Equally, if you can't find a CrashID anywhere, that would also be an interesting data point - it might imply that I'm wrong about this being a steamwebhelper crash.

I still think that [0129/213808.640966:ERROR:nss_util.cc(357)] After loading Root Certs, loaded==false: NSS error code: -8018 in your cef_log.txt might be significant, but I don't know why that would happen to you but not to others. It sounds as though the CA certificates in the sniper container are set up correctly (their symbolic links are not broken in the container environment). A container runtime verbose log might help to figure out what is different about your system.

@davispuh, am I correct to think that you don't see NSS error code: -8018 in your cef_log.txt or other logs?

I tried to generate a log for you as requested, but no logs ever appear in either the var under steam on NFS or the local installation

Sorry, I was forgetting which layer is responsible for implementing slr-latest.log. The command I should have suggested is:

STEAM_LINUX_RUNTIME_LOG=1 \
STEAM_LINUX_RUNTIME_VERBOSE=1 \
~/.steam/root/ubuntu12_64/steam-runtime-sniper/_v2-entry-point -- xterm

This should record a log like I said.

Another way to get logs would be to exit from Steam completely, and then run it as:

STEAM_LINUX_RUNTIME_LOG=1 \
STEAM_LINUX_RUNTIME_KEEP_LOGS=1 \
STEAM_LINUX_RUNTIME_VERBOSE=1 \
steam

which should record one log in steam-runtime-sniper/var each time it tries to restart the steamwebhelper, with slr-latest.log always pointing to the newest one.

the .ref file and a temp subdirectory (tmp-$randomchars) in my local installation and several of those subdirs in my NFS one

This is interesting. Normally (if you don't use STEAM_LINUX_RUNTIME_KEEP_LOGS=1), the container runtime is meant to delete old var/tmp-XXXXXXX subdirectories during startup - so at any given time, you should usually only have one.

It sounds as though your installation on a local disk is working correctly, but the old subdirectories are not being garbage-collected on your NFS installation. I'd be interested to see why not. If you can get a log file, it should tell us why.

how do you identify the version number?

Normally you would use Help → About Steam, but when your issue is that the UI isn't starting, obviously that isn't an option.

As @davispuh said, one good indication is that each time Steam runs the steamwebhelper, it passes it an argument -buildid=1706390103 which gives the build ID. Many of the other logs also mention the build ID, for example [2024-01-29 21:36:16] Client version: 1706390103 in console_log.txt and [2024-01-29 21:37:22] Download skipped: /steam_client_publicbeta_ubuntu12?t=3744699689 version 1706390103, installed version 1706390103, existing pending version 0 in bootstrap_log.txt.
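For anyone who cannot reach the UI, a minimal sketch of pulling the build ID out of saved log text might look like this. It only relies on the three textual patterns mentioned above (`-buildid=`, `Client version:` and `installed version`); the function name `find_build_ids` is hypothetical.

```python
import re

# Patterns taken from the log excerpts quoted in this thread.
BUILD_ID_PATTERNS = (
    re.compile(r"-buildid=(\d+)"),
    re.compile(r"Client version: (\d+)"),
    re.compile(r"installed version (\d+)"),
)


def find_build_ids(log_text):
    """Return the set of numeric build IDs mentioned in the given log text."""
    build_ids = set()
    for pattern in BUILD_ID_PATTERNS:
        build_ids.update(pattern.findall(log_text))
    return build_ids
```

A line such as "Client version: no bootstrapper found" is skipped, because only numeric values are matched. Seeing more than one ID in the result would suggest the logs span an update.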

I tried the adverb commands both on the local filesystem and on the NFS installation, and surprisingly it worked for both

OK, good. It sounds as though we cannot rely on flock(1) or flock(2) on NFS, but the fcntl locking that we already use should be safe. I'll bear that in mind for future work on this topic.
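To make the distinction concrete, here is a small sketch of taking an fcntl-style (POSIX record) lock from Python. This is not the container runtime's own code: it uses Python's standard `fcntl.lockf`, which is documented as a wrapper around the fcntl() locking calls, and the helper name `try_fcntl_lock` is invented for this example.

```python
import errno
import fcntl
import os


def try_fcntl_lock(path):
    """Try to take an exclusive, non-blocking fcntl-style lock on path.

    Returns the open file descriptor on success (the lock is held until the
    descriptor is closed), or None if another process holds a conflicting
    lock. Other errors are re-raised.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # someone else already holds the lock
        raise
    return fd
```

Running this against a file in both the local and the NFS installation would be one way to repeat the comparison described above without the adverb tool.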

From within the sniper shell (the xterm launched per your instructions), all device files (the entire output of ls -l /dev/) are shown as owned by nobody:nogroup.

This may seem weird, but is normal. When an unprivileged user creates a new user namespace, as we do in the Steam container runtime, the kernel will only allow us to create one user ID mapping (our own user ID) and one group ID mapping (our own primary group ID). Everything else is mapped to the "overflow uid" and "overflow gid" (normally nobody:nogroup), very similar to NFS root-squashing. Files owned by root and files owned by any other user will show up inside the container as though they were owned by the overflow uid, which you should interpret as meaning "owned by someone who is not me".
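The mapping can be inspected from inside the container by reading /proc/self/uid_map, whose lines have the form "ID-inside ID-outside length". The sketch below parses that format and shows how an unmapped owner falls back to the overflow ID. The uid 1000 in the usage note is a hypothetical example, and 65534 is the usual default of /proc/sys/kernel/overflowuid rather than a value taken from this thread.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, count) tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            mappings.append(tuple(int(field) for field in fields))
    return mappings


def displayed_id(outside_id, mappings, overflow_id=65534):
    """Return the ID that an owner appears to have inside the namespace."""
    for inside, outside, count in mappings:
        if outside <= outside_id < outside + count:
            return inside + (outside_id - outside)
    return overflow_id  # unmapped: shown as the overflow uid/gid
```

With a single mapping such as `1000 1000 1`, `displayed_id(1000, ...)` returns 1000, while `displayed_id(0, ...)` returns 65534, which is why root-owned device files show up as nobody:nogroup.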

Flatpak apps have the same behaviour, for the same reason.

Steam doesn't attempt to launch anything as root, does it?

Not usually, and not on the critical path for basic UI functionality. In some situations (mainly either related to VR, or on the Steam Deck) it will try to run commands via pkexec.

Another thing you mentioned is compatibility for old kernels. I may actually have the opposite problem, as I'm running the HWE kernel: 6.5.0-15-generic.

My main test environments for new container runtime releases are Ubuntu 22.04 (with the same HWE kernel you're using) and Arch (with a very new kernel, currently 6.7), so it's heavily tested on modern kernels.

I thought you were an employee of Valve

I'm a consultant helping them with the Steam Runtime and related topics. If your particular issue is a problem with the container runtime, my team might be able to fix it; if it's a problem with steamwebhelper itself, probably someone else will need to take over.

glabifrons commented 10 months ago

I just launched the beta in the foreground again and got two crash-IDs.

assert_20240130192454_27.dmp[1678661]: response: CrashID=bp-76d14b36-100b-48a3-aec4-615e12240130
assert_20240130192454_27.dmp[1678661]: file ''/tmp/dumps/assert_20240130192454_27.dmp'', upload yes: ''CrashID=bp-76d14b36-100b-48a3-aec4-615e12240130''
assert_20240130192527_134.dmp[1679205]: response: CrashID=bp-722467e1-6953-4b8a-9ad0-01f382240130
assert_20240130192527_134.dmp[1679205]: file ''/tmp/dumps/assert_20240130192527_134.dmp'', upload yes: ''CrashID=bp-722467e1-6953-4b8a-9ad0-01f382240130''

I can provide more output for context if needed.
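In case it helps others collect the same information, here is a minimal sketch for extracting crash IDs from saved console output. It assumes the IDs always follow the `CrashID=bp-...` shape seen in the lines above; the function name `extract_crash_ids` is made up for this example.

```python
import re

# "bp-" followed by a UUID-shaped value, as in the console output above.
CRASH_ID_RE = re.compile(
    r"CrashID=(bp-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def extract_crash_ids(log_text):
    """Return the distinct crash IDs in log_text, in order of first appearance."""
    seen = []
    for crash_id in CRASH_ID_RE.findall(log_text):
        if crash_id not in seen:
            seen.append(crash_id)
    return seen
```

Each dump is reported twice in the output (once as "response:" and once as "upload yes:"), so the duplicates are dropped.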

I was able to get a log using the _v2-entry-point, so thank you for the correction. Interestingly, it deletes the old log when it creates a new one, but still doesn't clean up the tmp-$random directories. This one was just using true as the target executable. slr-non-steam-game-t20240130T203006.log

Thank you very much for the overflow uid explanation. That's a huge relief that it's not what I was afraid of, as that would have meant that solution wasn't NFS compatible. I'm glad you mentioned Steam's VR... I guess I'll be putting off playing with that for a while (was looking at it recently due to a deal on woot that I almost bought).

Hopefully the above log is helpful, but if not, we now have crashdumps as well.

glabifrons commented 10 months ago

Sifting through the log, I find the error about not finding libvdpau.so.1 to be interesting, as it placed a copy into Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-92DII2/usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/. I'm guessing it copied it from Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240125.75305/files/lib/x86_64-linux-gnu/libvdpau.so.1.0.0 (note the extra ".0.0" on the end, I'm not sure if it's expecting an *.so.1 in that location or where else it's looking where it's not seeing it).

I saw it clean out the tmp dirs then give errors that they're not empty. Most were empty by the time I looked. I did an rmdir * in the var under sniper and it removed most of them (7 remain of 19 that were there before). This could be purely a timing/sync issue with a file being removed server-side. A 1 second sleep should be more than enough to solve that. As issues go, that's incredibly minor.

Other than that, I don't see anything jumping out at me in that log. You'll know better than I what to look for though.

smcv commented 10 months ago

Sifting through the log, I find the error about not finding libvdpau.so.1 to be interesting, as it placed a copy into Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-92DII2/usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu

Probably it found that your host system had a 64-bit libvdpau.so.1 but not a 32-bit libvdpau.so.1. Ideally you will want both of those on Nvidia-based systems, to get hardware acceleration for video encoding and decoding in both 64- and 32-bit processes. On Ubuntu this means installing both libvdpau1:amd64 and libvdpau1:i386. This is worth fixing, but probably not what is causing your crash, since the steamwebhelper is a 64-bit process that will be unaffected by the absence of a 32-bit library.

I'm guessing it copied it from Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240125.75305/files/lib/x86_64-linux-gnu/libvdpau.so.1.0.0

No, anything that is created in usr/lib/pressure-vessel/overrides is a symbolic link pointing to graphics-stack-related files from your host system (via the /run/host mount point inside the container).

I saw it clean out the tmp dirs then give errors that they're not empty. [...] This could be purely a timing/sync issue with a file being removed server-side. A 1 second sleep should be more than enough to solve that.

Sorry, I am not going to slow down each container startup for every Steam-on-Linux user just to benefit NFS users. If it's leaving behind nearly-empty directories, then the disk space cost is trivially small.

I suspect that what might be happening here is that we're deleting the directory tmp-XXXXXX while the file tmp-XXXXXX/.ref is open and locked. POSIX guarantees that we can delete files while they are still open, but NFS implements deletion of open files by renaming them to some weird name that will be removed when the file is eventually closed, and then that means the directory isn't empty, so rmdir() refuses to delete it.
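That suspicion could be checked with a small experiment along these lines. This is a sketch under assumptions, not the runtime's actual cleanup code: the helper name is hypothetical, the ".ref" file name is borrowed from the description above, and the expectation that leftovers are named like ".nfsXXXX" reflects the usual Linux NFS client behaviour rather than anything observed in this thread.

```python
import errno
import os


def remove_dir_with_open_ref(directory):
    """Unlink directory/.ref while it is still open, then try to rmdir.

    Returns the names left behind if rmdir fails because the directory is
    not empty (on NFS these would be expected to look like ".nfsXXXX"),
    or an empty list if the directory was removed.
    """
    ref_path = os.path.join(directory, ".ref")
    fd = os.open(ref_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.unlink(ref_path)  # POSIX allows this while fd is still open
        try:
            os.rmdir(directory)
        except OSError as exc:
            if exc.errno not in (errno.ENOTEMPTY, errno.EEXIST):
                raise
            return sorted(os.listdir(directory))
        return []
    finally:
        os.close(fd)
```

On a local filesystem this should return an empty list. If it returns a leftover entry when pointed at a scratch directory on the NFS mount, that would support the explanation above.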

smcv commented 10 months ago

Please could a Valve developer look up the backtraces for the two crash IDs referenced by @glabifrons in https://github.com/ValveSoftware/steam-for-linux/issues/10431#issuecomment-1918241124

assert_20240130192454_27.dmp[1678661]: response: CrashID=bp-76d14b36-100b-48a3-aec4-615e12240130
assert_20240130192527_134.dmp[1679205]: response: CrashID=bp-722467e1-6953-4b8a-9ad0-01f382240130

and check whether they are the same thing that @davispuh is experiencing, which is this?

CrashID=bp-5aaf6174-a286-49b8-a3b3-eb2fe2240129
smcv commented 10 months ago

From the log in https://github.com/ValveSoftware/steam-for-linux/issues/10431#issuecomment-1918241124, we are likely to be using the container's root CA certificates (derived from Debian 11's ca-certificates package, and available in /etc/ssl/) rather than the host's root CA certificates. This makes it odd that @glabifrons is getting an error message relating to root CA certificates, but I'm not: if there was a problem with the container's /etc/ssl, I would expect it to affect everyone, including me.

@glabifrons is using Ubuntu, which is Debian-derived, so this is not a mismatch between Debian and e.g. Fedora search paths for root CA certificates, or anything like that.

A potentially interesting factor is that we have pulled in /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.545.29.06 as part of the graphics stack, which has OpenSSL 3 /lib/x86_64-linux-gnu/libcrypto.so.3 as a dependency. This means that code inside the container can see all of these OpenSSL libraries:

so one possible factor for Valve developers to investigate would be whether these can somehow collide?

[Edited to add: the fact that https://github.com/ValveSoftware/steam-for-linux/issues/10431#issuecomment-1924106225 didn't solve this, according to https://github.com/ValveSoftware/steam-for-linux/issues/10431#issuecomment-1925158103, suggests that this probably wasn't the problem.]

smcv commented 10 months ago

@glabifrons, if you are comfortable with using unreleased software, one thing you could try is:

If that makes it work, then my theory about libnvidia-pkcs11-openssl3.so.* in https://github.com/ValveSoftware/steam-for-linux/issues/10431#issuecomment-1919256799 is probably going in the right direction. If it still fails with that modified version of pressure-vessel, then my theory was probably wrong, but that could still be a useful data point for someone with more insight into the proprietary parts of Steam.

glabifrons commented 10 months ago

@smcv Thank you very much for coming up with more ways to narrow this down. Unfortunately, I tried it and still got the crash dialog.

glabifrons commented 9 months ago

It looks like the change that causes this crash was just pushed to release, as my entire family is now getting this crash screen and they do not run the beta. I just updated Steam and now I'm getting the crash in the released version too.

glabifrons commented 9 months ago

Crash dumps (2) from my kid's system:

assert_20240227194154_28.dmp[4071927]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240227194154_28.dmp
assert_20240227194154_28.dmp[4071927]: Finished uploading minidump (out-of-process): success = yes
assert_20240227194154_28.dmp[4071927]: response: CrashID=bp-1c93b6a7-b308-4df1-9465-5fdf92240227
assert_20240227194154_28.dmp[4071927]: file ''/tmp/dumps/assert_20240227194154_28.dmp'', upload yes: ''CrashID=bp-1c93b6a7-b308-4df1-9465-5fdf92240227''
BRefreshApplicationsInLibrary 1: 2ms
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/common/html/chrome_ipc_client.cpp (1111) : Failed to connect to master html process, created shared memory (spawn time 60.02)
src/common/html/chrome_ipc_client.cpp (1111) : Failed to connect to master html process, created shared memory (spawn time 60.02)
02/27 19:42:26 Init: Installing breakpad exception handler for appid(steam)/version(1708985249)/tid(4071705)
assert_20240227194226_38.dmp[4072137]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240227194226_38.dmp
assert_20240227194226_38.dmp[4072137]: Finished uploading minidump (out-of-process): success = yes
assert_20240227194226_38.dmp[4072137]: response: CrashID=bp-ce3828f7-3c00-4890-ae03-e08652240227
assert_20240227194226_38.dmp[4072137]: file ''/tmp/dumps/assert_20240227194226_38.dmp'', upload yes: ''CrashID=bp-ce3828f7-3c00-4890-ae03-e08652240227''
[2024-02-27 19:42:50] Shutdown

Crash dumps (3) from my wife's system:

assert_20240227192652_27.dmp[725933]: Finished uploading minidump (out-of-process): success = yes
assert_20240227192652_27.dmp[725933]: response: CrashID=bp-f3691d6b-88e6-4e81-a9e2-28b6e2240227
assert_20240227192652_27.dmp[725933]: file ''/tmp/dumps/assert_20240227192652_27.dmp'', upload yes: ''CrashID=bp-f3691d6b-88e6-4e81-a9e2-28b6e2240227''
BRefreshApplicationsInLibrary 1: 2ms
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/common/html/chrome_ipc_client.cpp (1111) : Failed to connect to master html process, created shared memory (spawn time 60.02)
src/common/html/chrome_ipc_client.cpp (1111) : Failed to connect to master html process, created shared memory (spawn time 60.02)
02/27 19:27:24 Init: Installing breakpad exception handler for appid(steam)/version(1708985249)/tid(725773)
assert_20240227192724_39.dmp[726038]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240227192724_39.dmp
assert_20240227192724_39.dmp[726038]: Finished uploading minidump (out-of-process): success = yes
assert_20240227192724_39.dmp[726038]: response: CrashID=bp-ac33e824-b4d4-4235-a03e-f6d682240227
assert_20240227192724_39.dmp[726038]: file ''/tmp/dumps/assert_20240227192724_39.dmp'', upload yes: ''CrashID=bp-ac33e824-b4d4-4235-a03e-f6d682240227''
src/clientdll/configstore.cpp (128) : ConfigStore (UserLocalConfigStore) is dirty, and being destroyed, we're discarding data
src/clientdll/configstore.cpp (128) : ConfigStore (UserLocalConfigStore) is dirty, and being destroyed, we're discarding data
assert_20240227192809_42.dmp[726105]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240227192809_42.dmp
assert_20240227192809_42.dmp[726105]: Finished uploading minidump (out-of-process): success = yes
assert_20240227192809_42.dmp[726105]: response: CrashID=bp-8b4472ce-df9b-4a5d-b93f-3a9842240227
assert_20240227192809_42.dmp[726105]: file ''/tmp/dumps/assert_20240227192809_42.dmp'', upload yes: ''CrashID=bp-8b4472ce-df9b-4a5d-b93f-3a9842240227''
[2024-02-27 19:28:10] Shutdown

I can generate more from mine too if it will help.

anansivanir commented 9 months ago

It looks like this also went live on Fedora 39 last night, and steamwebhelper doesn't start anymore.

From steamwebhelper.log

steamwebhelper.sh[62847]: === Wed Feb 28 09:59:03 AM EET 2024 ===
steamwebhelper.sh[62847]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
exec ./steamwebhelper --no-sandbox -lang=en_US -cachedir=/data/home/user/.local/share/Steam/config/htmlcache -steampid=57800 -buildid=1708985249 -steamid=76561198176790325 -logdir=/data/home/user/.local/share/Steam/logs -uimode=7 -startcount=14 -steamuniverse=Public -realm=Global -clientui=/data/home/user/.local/share/Steam/clientui -steampath=/data/home/user/.local/share/Steam/ubuntu12_32/steam -launcher=0 -use_safe_shutdown_workaround -use_xcomposite_workaround -no-restart-on-ui-mode-change --enable-smooth-scrolling --disable-gpu-compositing --disable-gpu --password-store=basic --log-file=/data/home/user/.local/share/Steam/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[0228/095904.170782:ERROR:context.cc(100)] The browser_subprocess_path directory (./steamwebhelper) is not an absolute path. Defaulting to empty.
[0228/095904.186298:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0228/095904.186333:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1708962035
[0228/095904.186337:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0228/095904.186340:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0228/095904.186343:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0228/095904.186606:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: browser
[0228/095904.187351:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0228/095904.188050:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0228/095904.204115:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 76561198176790325
[0228/095904.204179:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1708985249
[0228/095904.204183:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0228/095904.204147:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 76561198176790325
[0228/095904.204186:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0228/095904.204193:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0228/095904.204192:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1708985249
[0228/095904.204197:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0228/095904.204200:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0228/095904.204203:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux

The only ERROR I could find was

steamwebhelper.log:[0228/095904.170782:ERROR:context.cc(100)] The browser_subprocess_path directory (./steamwebhelper) is not an absolute path. Defaulting to empty.

webhelper.txt states

[1970-01-01 02:00:00] Client version: no bootstrapper found
[1970-01-01 02:00:00] Startup - webhelper launched with: ./steamwebhelper --no-sandbox -lang=en_US -cachedir=/data/home/user/.local/share/Steam/config/htmlcache -steampid=57800 -buildid=1708985249 -steamid=76561198176790325 -logdir=/data/home/user/.local/share/Steam/logs -uimode=7 -startcount=14 -steamuniverse=Public -realm=Global -clientui=/data/home/user/.local/share/Steam/clientui -steampath=/data/home/user/.local/share/Steam/ubuntu12_32/steam -launcher=0 -use_safe_shutdown_workaround -use_xcomposite_workaround -no-restart-on-ui-mode-change --enable-smooth-scrolling --disable-gpu-compositing --disable-gpu --password-store=basic --log-file=/data/home/user/.local/share/Steam/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[1970-01-01 02:00:00] Disabling GPU acceleration due to --disable-gpu-compositing (browser)
[1970-01-01 02:00:00] Disabling sandbox due to a previous crash in CefInitialize with the sandbox enabled
[1970-01-01 02:00:00] Browser - launching child process with: /data/home/user/.local/share/Steam/ubuntu12_64/steamwebhelper --type=zygote --no-zygote-sandbox --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/data/home/user/.local/share/Steam/logs/cef_log.txt --crashpad-handler-pid=62977 --buildid=1708985249 --steamid=76561198176790325
[1970-01-01 02:00:00] Browser - launching child process with: /data/home/user/.local/share/Steam/ubuntu12_64/steamwebhelper --type=zygote --no-sandbox --user-agent-product=Valve Steam Client --lang=en_US.UTF-8 --log-file=/data/home/user/.local/share/Steam/logs/cef_log.txt --crashpad-handler-pid=62977 --buildid=1708985249 --steamid=76561198176790325

The odd part is the timestamp on these log entries: 1970 instead of the system local time, Wed Feb 28 10:05:33 AM EET 2024.

Console output from startup below for reference in case it's useful

$ steam
steam.sh[68823]: Running Steam on fedora 39 64-bit
steam.sh[68823]: STEAM_RUNTIME is enabled automatically
setup.sh[68896]: Steam runtime environment up-to-date!
steam.sh[68823]: Steam client's requirements are satisfied
tid(68939) burning pthread_key_t == 0 so we never use it
[2024-02-28 10:35:31] Startup - updater built Feb 26 2024 15:40:54
[2024-02-28 10:35:31] Startup - Steam Client launched with: '/data/home/user/.local/share/Steam/ubuntu12_32/steam'
02/28 10:35:31 Init: Installing breakpad exception handler for appid(steam)/version(1708985249)/tid(68939)
[2024-02-28 10:35:31] Loading cached metrics from disk (/data/home/user/.local/share/Steam/package/steam_client_metrics.bin)
[2024-02-28 10:35:31] Using the following download hosts for Public, Realm steamglobal
[2024-02-28 10:35:31] 1. https://client-update.akamai.steamstatic.com, /, Realm 'steamglobal', weight was 1000, source = 'update_hosts_cached.vdf'
[2024-02-28 10:35:31] 2. https://cdn.cloudflare.steamstatic.com, /client/, Realm 'steamglobal', weight was 1, source = 'update_hosts_cached.vdf'
[2024-02-28 10:35:31] 3. https://cdn.steamstatic.com, /client/, Realm 'steamglobal', weight was 1, source = 'baked in'
[2024-02-28 10:35:31] Verifying installation...
[2024-02-28 10:35:31] Verification complete

Steam logging initialized: directory: /data/home/user/.local/share/Steam/logs

XRRGetOutputInfo Workaround: initialized with override: 0 real: 0xf03e9170
XRRGetCrtcInfo Workaround: initialized with override: 0 real: 0xf03e7880
steamwebhelper.sh[68952]: === Wed Feb 28 10:35:32 AM EET 2024 ===
steamwebhelper.sh[68952]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
CAppInfoCacheReadFromDiskThread took 37 milliseconds to initialize
Steam Runtime Launch Service: starting steam-runtime-launcher-service
Steam Runtime Launch Service: steam-runtime-launcher-service is running pid 69112
bus_name=com.steampowered.PressureVessel.LaunchAlongsideSteam
steamwebhelper.sh[69455]: === Wed Feb 28 10:35:42 AM EET 2024 ===
steamwebhelper.sh[69455]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
steamwebhelper.sh[69787]: === Wed Feb 28 10:35:52 AM EET 2024 ===
steamwebhelper.sh[69787]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
02/28 10:35:53 Init: Installing breakpad exception handler for appid(steam)/version(1708985249)/tid(68939)
assert_20240228103553_26.dmp[69925]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240228103553_26.dmp
/usr/share/themes/Adwaita/gtk-2.0/main.rc:733: error: unexpected identifier 'direction', expected character '}'
/usr/share/themes/Adwaita/gtk-2.0/hacks.rc:28: error: invalid string constant "normal_entry", expected valid string constant
assert_20240228103553_26.dmp[69925]: Finished uploading minidump (out-of-process): success = yes
assert_20240228103553_26.dmp[69925]: response: CrashID=bp-f44023a9-c5be-4169-bcde-6f24c2240228
assert_20240228103553_26.dmp[69925]: file ''/tmp/dumps/assert_20240228103553_26.dmp'', upload yes: ''CrashID=bp-f44023a9-c5be-4169-bcde-6f24c2240228''
BRefreshApplicationsInLibrary 1: 1ms
steamwebhelper.sh[70150]: === Wed Feb 28 10:36:02 AM EET 2024 ===
steamwebhelper.sh[70150]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
steamwebhelper.sh[70483]: === Wed Feb 28 10:36:12 AM EET 2024 ===
steamwebhelper.sh[70483]: Starting steamwebhelper under bootstrap sniper steam runtime at /data/home/user/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
reaping pid: 69787 -- unknown
src/steamexe/main.cpp (264) : Assertion Failed: ReapProcess: waitid failed: 'No child processes'. Possibly leaking a zombie.

src/steamexe/main.cpp (264) : Assertion Failed: ReapProcess: waitid failed: 'No child processes'. Possibly leaking a zombie.

02/28 10:36:18 Init: Installing breakpad exception handler for appid(steam)/version(1708985249)/tid(68939)
assert_20240228103618_41.dmp[70821]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240228103618_41.dmp
assert_20240228103618_41.dmp[70821]: Finished uploading minidump (out-of-process): success = yes
assert_20240228103618_41.dmp[70821]: response: CrashID=bp-bf7c1a81-5ec6-4e58-b63d-6f6bc2240228
assert_20240228103618_41.dmp[70821]: file ''/tmp/dumps/assert_20240228103618_41.dmp'', upload yes: ''CrashID=bp-bf7c1a81-5ec6-4e58-b63d-6f6bc2240228''
[2024-02-28 10:36:24] Shutdown

Dmesg also seems to show some errors related to steam

...
[ 9218.130122] traps: steamwebhelper[72547] trap int3 ip:7fd7c552f375 sp:7ffc7c2187f0 error:0 in libcef.so[7fd7bf336000+91a8000]
[ 9228.209527] traps: steamwebhelper[72937] trap int3 ip:7f0847f2f375 sp:7fff65725410 error:0 in libcef.so[7f0841d36000+91a8000]
[ 9238.381259] traps: steamwebhelper[73276] trap int3 ip:7ff54af2f375 sp:7ffe72f79c30 error:0 in libcef.so[7ff544d36000+91a8000]
[ 9248.740349] traps: steamwebhelper[73644] trap int3 ip:7f433fd2f375 sp:7ffdd7cb33c0 error:0 in libcef.so[7f4339b36000+91a8000]
[ 9248.740355] traps: steamwebhelper[73643] trap int3 ip:7f38b252f375 sp:7ffcc27ba750 error:0 in libcef.so[7f38ac336000+91a8000]
[ 9258.766632] traps: steamwebhelper[73977] trap int3 ip:7f276632f375 sp:7ffc006d1fd0 error:0
[ 9258.766633] traps: steamwebhelper[73976] trap int3 ip:7faa1872f375 sp:7ffe7d3b1e40 error:0 in libcef.so[7faa12536000+91a8000]
[ 9258.766636] in libcef.so[7f2760136000+91a8000]

...

Apparently, if I just leave the error message window open, Steam keeps running in the background without a GUI, and a second client on my laptop (AMD GPU) seems to be able to stream games from my workstation (NVidia GPU), i.e. it looks like the GUI is broken on NVidia hardware. My laptop is not fast enough to actually play anything, but Steam seems to respond to remote commands and I seem to be able to launch games just fine.

davispuh commented 9 months ago

Yep, this is ridiculous. Steam is now completely broken without any workarounds, even though the bug was reported a month ago in Beta, and it got promoted to prod without being fixed...

To me it looks like these are the same thing: #10538 #10541 #10539

image

Charlie-Root commented 9 months ago

As of this morning I have the exact same problem after Steam updated itself (not beta). Arch Linux, Intel RTX3090

StowasserH commented 9 months ago

Same problem here!

davispuh commented 9 months ago

Are you all using Wayland? I wonder if this is a Wayland-specific bug...

anansivanir commented 9 months ago

Are you all using Wayland? I wonder if this is a Wayland-specific bug...

At least I'm running the old X11

TTimo commented 9 months ago

@glabifrons - thank you for your detailed report. I assume the crash issue does not go away if you pass -cef-disable-gpu on the command line?

TTimo commented 9 months ago

@glabifrons - it's probably a good idea to test latest without NFS in the loop. You may have a different issue with the new chromium setup but we'd need to make sure it's not something due to your NFS configuration first.

TTimo commented 9 months ago

For other folks piling in on here, please consider making a clean separate report including logs and crash IDs, especially if you don't use NFS and your problems started with the client that shipped yesterday.

jhaand commented 9 months ago

It seems that the regular Steam now also uses the Sniper runtime and Steam has stopped working for me.

I can run the ./run-in-sniper vkcube

But it remains impossible for me to start Steam in Debian Testing using KDE with a Radeon graphics card.

The Flatpak edition also seems to start.

Edit: My laptop with Intel UHD 620 also fails to start with the same error after an update.

anansivanir commented 9 months ago

It seems that the regular Steam now also uses the Sniper runtime and Steam has stopped working for me.

I can run the ./run-in-sniper vkcube

But it remains impossible for me to start Steam in Debian Testing using KDE with a Radeon graphics card.

My laptop that uses an Intel UHD 620 still starts up the Steam client. The Flatpak edition also seems to start.

Same here, run-in-sniper works, but GUI fails. Laptop with AMD GPU has no issues.

./run-in-sniper vkcube
Selected GPU 0: NVIDIA GeForce RTX 4060 Ti, type: DiscreteGpu

mazirah commented 9 months ago

I think I have the same issue

Screenshot from 2024-02-29 09-57-05

Finn1986Git commented 9 months ago

Today at 07:53am CET I contacted Steam Support, which about 60 minutes later directed me here, because Steam Support only provides direct help when problems occur on an Ubuntu LTS installation. Since I am running Solus 4.5 64-bit with the GNOME 45 desktop, my search for help brought me here.

I have the exact same problem, it seems that the steamwebhelper is stuck in an endless loop. I've tried to start the steam client with various relevant options/parameters, the problem remains.

I can exit the steam client through the tray icon, though.

Steam auto-updated itself at around 05:00pm CET yesterday; that's when the problem first occurred.

Edit: Filesystem is ext4, AMD ATI Radeon R7 Graphics, Kernel driver in use: amdgpu, Distro: Solus 4.5 GNOME

glabifrons commented 9 months ago

@glabifrons - thank you for your detailed report. I assume the crash issue does not go away if you pass -cef-disable-gpu on the command line?

@TTimo Thank you for taking a look. Unfortunately, that doesn't seem to make a difference. Here are the crash IDs from launching with that command line option: CrashID=bp-eef6aa82-944f-4e9c-a63a-986402240229 CrashID=bp-4466f2d5-4310-419d-9b29-d808f2240229

FWIW: I launched it before testing the command line option, saw the crash dialog, and waited while watching top. I saw a fair bit of activity so I let it run for a while. Eventually I saw it download something ("Downloading update (67,470 of 67,470 KB)...") and watched it process that (python3 hit the top processes for a bit at this point IIRC). Once it settled down and no more logs were being written, I selected the "Exit Steam" option on the crash screen and clicked "OK". Upon restarting it again (without the command line option), I saw it unpack and install the update, but it still went to the same crash screen (I was hoping the update would fix it). I waited again for it to settle down, but it was much quicker this time. This is when I tried your suggested command line option.

@glabifrons - it's probably a good idea to test latest without NFS in the loop. You may have a different issue with the new chromium setup but we'd need to make sure it's not something due to your NFS configuration first.

Yes, I actually managed this before and ended up reverting to a kind of hybrid config to be able to use Steam. I reconfigured the links in the ~/.steam directory to point to an admittedly kludgey subdirectory under my cache drive, launched Steam to let it install, shut it down, then replaced the steamapps subdirectory with a link pointing back to ~/.local/share/Steam/steamapps (everything under my home directory is over NFS) and was able to run Steam and launch games. So yes, it appears to be related either to NFS or to something connected with it, but only for the Steam UI, as I have no problem running games from NFS, even under Proton (I played Space Engineers a bit last night this way).

At one point (mentioned in the discussion above somewhere) I noted that I'm using NFSv4. I thought I'd point that out in case it was relevant, as NFSv4 and NFSv3 have different capabilities. That appears to have been ruled out though, as another person mentioned using NFSv4 successfully. Another person with this problem turned out not to be running his home on NFS, but on an odd stack of VFS layers (I couldn't follow his description, but it involved btrfs, which I've never used).

Edit: While I'm configured for NFSv4, I just checked the output of mount(1) and it shows that NFSv4.1 has been negotiated. I don't know if that should make a difference.

glabifrons commented 9 months ago

To everyone else joining this issue thread: Please indicate (most important) what filesystem your home directory is using, which video card brand you're using, and which video card drivers you are using. Additionally, what distro you are using.

My home directory is mounted via NFSv4 (v4.1 according to the output of the mount command) from a Solaris server. My video card is Nvidia and my drivers are "nvidia-driver-545" from Ubuntu's repository. My distro is Ubuntu MATE LTS 22.04 (Ubuntu LTS 22.04 with the MATE desktop environment pre-installed).
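
For anyone unsure how to answer the filesystem question, here is a minimal sketch that reports the filesystem type (and the NFS version, if any) of the mount containing a given path by reading `/proc/mounts`. The function name `fs_for_path` is made up for illustration. It assumes a Linux system where the negotiated NFS version shows up as a `vers=` mount option, and it does not handle mount points containing spaces.

```shell
#!/bin/sh
# Print "<fstype> <nfs-version-or-dash>" for the mount that contains a path.
# fs_for_path is a hypothetical helper name, not part of Steam or any tool.
# The mounts file defaults to /proc/mounts (Linux); it is a parameter so the
# function can also be tried against a sample file.
fs_for_path() {
    path=$1
    mounts_file=${2:-/proc/mounts}
    awk -v p="$path" '
        {
            mp = $2
            # A mount point matches if it is "/", equals the path,
            # or is a directory prefix of the path.
            if (mp == "/" || p == mp || index(p, mp "/") == 1) {
                # Prefer the longest matching mount point; on a tie the
                # later entry wins, since later mounts shadow earlier ones.
                if (length(mp) >= best_len) {
                    best_len = length(mp); fstype = $3; opts = $4
                }
            }
        }
        END {
            vers = "-"
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] ~ /^vers=/) vers = substr(o[i], 6)
            print fstype, vers
        }
    ' "$mounts_file"
}
```

Running `fs_for_path "$HOME"` should then print the filesystem type followed by the `vers=` value, or a dash when the mount has no such option (as would be expected for a local filesystem).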

smcv commented 9 months ago

There are several situations that can cause the steamwebhelper not to work: the "Steamwebhelper is not responding" menu is a symptom, not a cause. It might be better if we retitle this issue to be specific to @glabifrons' situation (it seems that having Steam installed on NFS rather than on a local filesystem is now understood to be a key part of the problem for them), and ask anyone else with similar symptoms to take them to a separate issue.

Anyone else who is experiencing the "Steamwebhelper is not responding" dialog, please check the Steam logs for more information (Flatpak users: ~/.var/app/com.valvesoftware.Steam/.local/share/Steam/logs, Snap users: ~/snap/steam/common/.local/share/Steam/logs, everyone else: ~/.steam/root/logs). steamwebhelper.log is likely to be the most important log file.
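
As a rough illustration of the log check described above, the following sketch looks for the first of those three log directories that exists and prints the ERROR-level lines from steamwebhelper.log. The function names `find_steam_logs` and `webhelper_errors` are hypothetical, and the `:ERROR:` pattern is only an assumption based on the log lines quoted earlier in this thread. If more than one Steam installation is present, only the first match is reported.

```shell
#!/bin/sh
# Print the first existing Steam log directory among the three locations
# listed above (Flatpak, Snap, everything else), relative to a home directory.
# find_steam_logs is a hypothetical helper name.
find_steam_logs() {
    home=${1:-$HOME}
    for dir in \
        "$home/.var/app/com.valvesoftware.Steam/.local/share/Steam/logs" \
        "$home/snap/steam/common/.local/share/Steam/logs" \
        "$home/.steam/root/logs"
    do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Print ERROR-level lines from steamwebhelper.log in a log directory.
# The ":ERROR:" pattern matches the prefix seen in this thread, e.g.
# "[0228/095904.170782:ERROR:context.cc(100)] ...".
webhelper_errors() {
    grep ':ERROR:' "$1/steamwebhelper.log"
}
```

For example, `logs=$(find_steam_logs) && webhelper_errors "$logs"` would list any such lines, which can then be compared against the logs posted in existing reports.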

If your symptoms and logs match symptoms and logs seen by other users, you can subscribe to an existing bug report without adding comments by using the "Notifications" panel.


A brief summary of some common known issues that are not in-scope for this issue report:

One common issue at the moment is that it can cause problems if ~/.config, ~/.cache or ~/.local/share are symbolic links (known to be a problem for ~/.config, speculated to be a problem for the others). This is #10547 and #10552, and is out-of-scope for this specific issue thread. The workaround is to use bind-mounts instead of symbolic links. More details: https://github.com/ValveSoftware/steam-for-linux/issues/10547#issuecomment-1972901294. This can affect the steamwebhelper even if Proton 5.13+ and the various Steam Linux Runtime compatibility tools have worked correctly for you in the past.
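
A quick way to check for the symbolic-link situation described above might look like the sketch below. The function name `check_symlinked_dirs` is made up for illustration, and it only tests the final path component of each directory, so a symlinked parent such as ~/.local would not be reported.

```shell
#!/bin/sh
# Report which of ~/.config, ~/.cache and ~/.local/share are symbolic links.
# Prints one line per symlinked directory and returns 0 if any were found,
# 1 otherwise. check_symlinked_dirs is a hypothetical helper name.
check_symlinked_dirs() {
    home=${1:-$HOME}
    found=1
    for rel in .config .cache .local/share; do
        if [ -L "$home/$rel" ]; then
            printf '%s is a symbolic link to %s\n' \
                "$home/$rel" "$(readlink "$home/$rel")"
            found=0
        fi
    done
    return "$found"
}
```

If `check_symlinked_dirs` prints anything, the bind-mount workaround from #10547 is probably the relevant one rather than anything in this thread.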

Another common issue is that if bubblewrap doesn't work on your system, then neither will steamwebhelper, Proton 5.13+ or the various Steam Linux Runtime compatibility tools (but if Proton 5.13+ and Steam Linux Runtime have worked for you in the past, then this particular thing is not the problem). The required kernel functionality is similar to what's needed by Flatpak (see https://github.com/flatpak/flatpak/wiki/User-namespace-requirements#unprivileged-bubblewrap). On systems where bubblewrap needs to be setuid root (https://github.com/flatpak/flatpak/wiki/User-namespace-requirements#setuid-bubblewrap), in principle the container runtime framework is meant to work, but currently doesn't (https://github.com/ValveSoftware/steam-runtime/issues/650). Again, that's out-of-scope for this specific issue thread.

rustysack commented 9 months ago

i have this symptom as well.

i am using nvidia, proprietary driver version 535.154.05. my home directory is an nfs mount, nfs v4.0. i am using gentoo linux, where i've used steam successfully for many years.

it was suggested to install the steam client locally, which i am able to do (i have space available), however it's not clear how to specify a custom installation directory. i've googled around with no satisfaction. i can see in bin_steam.sh that STEAM_DATA_HOME is set to $HOME/.local/share. i haven't tried modifying the script (yet) as it looks like $HOME is called out in several places.

rustysack commented 9 months ago

i have temporarily created a local home directory instead of using the nfs mounted location. steam works now, so it appears to be something related to having steam installed on an nfs mount.

is it possible to select where to install the steam client?

davispuh commented 9 months ago

is it possible to select where to install the steam client?

Try symlinking ~/.local/share/Steam to /mnt/Steam. That's what I do (for other reasons, not because of NFS) and it works fine.

Note you might need to delete ~/.steam folder so it's recreated.

davispuh commented 9 months ago

Actually, it might be that the issue is ~/.config being on NFS and not Steam itself. So maybe try bind-mounting that to a local filesystem.

glabifrons commented 9 months ago

@davispuh

Actually, it might be that the issue is ~/.config being on NFS and not Steam itself. So maybe try bind-mounting that to a local filesystem.

It's not the .config directory. My ~/.config/ is on NFS still with my workaround described (above and) below, and it works. The only thing that must be on a local filesystem appears to be Steam itself (not the steamapps subdirectory, it can be on NFS). My ~/.steam/ directory was modified as shown:

$ ls -ln ~/.steam/
total 17
lrwxrwxrwx 1 1000 1000   7 Mar  1 20:50 bin -> ./bin32/
lrwxrwxrwx 1 1000 1000  36 Feb 29 23:53 bin32 -> /var/cache/fscache/Steam/ubuntu12_32/
lrwxrwxrwx 1 1000 1000  36 Feb 29 23:53 bin64 -> /var/cache/fscache/Steam/ubuntu12_64/
-rwxrwxr-x 1 1000 1000 659 Mar  1 20:49 registry.vdf*
lrwxrwxrwx 1 1000 1000  24 Feb 29 23:53 root -> /var/cache/fscache/Steam/
lrwxrwxrwx 1 1000 1000  32 Feb 29 23:53 sdk32 -> /var/cache/fscache/Steam/linux32/
lrwxrwxrwx 1 1000 1000  32 Feb 29 23:53 sdk64 -> /var/cache/fscache/Steam/linux64/
lrwxrwxrwx 1 1000 1000  24 Feb 29 23:53 steam -> /var/cache/fscache/Steam/
-rw-rw-r-- 1 1000 1000   8 Feb 29 23:53 steam.pid
prw------- 1 1000 1000   0 Jan 27 13:05 steam.pipe|
-r-------- 1 1000 1000  16 Feb 29 23:53 steam.token

I created the directory /var/cache/fscache/Steam/ and changed the owner and group to my user (fscache is the mount point for my NFS cache, but this does not interfere with it). I then launched steam and let it install into the new directory. After it finished, I shut it down, then renamed the new steamapps directory and created a link to replace it that pointed to the existing games in my NFS mounted home:

$ ls -ln /var/cache/fscache/Steam/steamapps
lrwxrwxrwx 1 1000 1000 42 Feb 27 19:20 /var/cache/fscache/Steam/steamapps -> /home/glabifrons/.local/share/Steam/steamapps/

So Steam seems perfectly fine with the steamapps subdirectory being replaced with a link as a workaround (if, like me, you want to keep your games NFS mounted). I realize there is a configuration change you can make somewhere in Steam to point to a different directory for game installation, but I don't want to reinstall a couple terabytes of games (between me and my family, my kid has over 1TB alone).
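
The relinking steps described above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested procedure: the function names `relink_dot_steam` and `link_steamapps` are hypothetical, the local directory is assumed to already exist and be owned by the user, the entries under ~/.steam are assumed to already be symbolic links (as in the listing above), and `ln -sfn` relies on the `-n` option, which GNU and BSD `ln` provide but POSIX does not specify.

```shell
#!/bin/sh
# Point the ~/.steam links at a Steam tree on a local filesystem.
# Both function names here are hypothetical helpers for illustration.
relink_dot_steam() {
    dot_steam=$1      # e.g. "$HOME/.steam"
    local_steam=$2    # e.g. /var/cache/fscache/Steam
    ln -sfn "$local_steam/ubuntu12_32" "$dot_steam/bin32"
    ln -sfn "$local_steam/ubuntu12_64" "$dot_steam/bin64"
    ln -sfn "$local_steam/linux32"     "$dot_steam/sdk32"
    ln -sfn "$local_steam/linux64"     "$dot_steam/sdk64"
    ln -sfn "$local_steam"             "$dot_steam/root"
    ln -sfn "$local_steam"             "$dot_steam/steam"
}

# After Steam has installed itself into the local tree and been shut down,
# set aside the fresh steamapps directory and link to the existing games.
link_steamapps() {
    local_steam=$1
    nfs_steamapps=$2  # e.g. "$HOME/.local/share/Steam/steamapps"
    # Only rename a real directory, and never overwrite an earlier backup.
    if [ -d "$local_steam/steamapps" ] && [ ! -L "$local_steam/steamapps" ] \
        && [ ! -e "$local_steam/steamapps.orig" ]; then
        mv "$local_steam/steamapps" "$local_steam/steamapps.orig"
    fi
    ln -sfn "$nfs_steamapps" "$local_steam/steamapps"
}
```

The intended order would be `relink_dot_steam`, then launching Steam once so it installs into the local directory, then shutting it down and running `link_steamapps`. Steam should not be running while either function is used.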