ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

"Steamwebhelper is not responding" seen with slow disk I/O #10581

Open Slabity opened 7 months ago

Slabity commented 7 months ago

Your system information

Please describe your issue in as much detail as possible:

I have completely removed .steam and .local/share/Steam, fully updated my system, and even tried the Flatpak version. But whenever I try to run Steam I get the following:

[Image: steamissue]

I've cleared out /tmp and .cache as well in case there was some issue there, but that did not resolve the problem.

It looks like there's a lot of these "Steamwebhelper is not responding" issues at the moment across various distributions over the past week, but they all seem to have different causes so I'm not sure if this is a duplicate of any of them.

EDIT: Fixed logs

Slabity commented 7 months ago

Another odd thing is that when I clear out the installation and try to run it again (Flatpak version this time), the error message shows up before the Linux runtime finishes unpacking.

[Image: steamissue2]

Here's the set of logs from the Flatpak version in case it's helpful: steam-logs.tar.gz

EDIT: Fixed the logs because the command in the template doesn't seem to actually work.

smcv commented 7 months ago

the error message shows up before the Linux runtime finishes unpacking

If it's taking a long(ish) time to unpack the runtime, then this is probably just the Steam client being impatient and not waiting long enough for the steamwebhelper to be available.

smcv commented 7 months ago

It looks like there's a lot of these "Steamwebhelper is not responding" issues at the moment across various distributions over the past week, but they all seem to have different causes so I'm not sure if this is a duplicate of any of them.

Yes, it's not clear. Thank you for reporting this separately and not assuming that your root cause is the same as someone else's.

One thing to check is that you don't have a /tmp/dumps owned by some other uid (#10549).

Another is to check that your ~/.config is not a symbolic link (#10547).
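Both checks can be scripted. The following is a minimal sketch, assuming a Unix system; the function name `check_startup_paths` and its parameters are hypothetical and not part of Steam or the Steam Runtime.

```python
import os
from pathlib import Path


def check_startup_paths(dumps_dir="/tmp/dumps", config_dir=None, uid=None):
    """Return a list of warnings for the two checks described above.

    dumps_dir:  directory that should be owned by the current user (#10549)
    config_dir: directory that should not be a symbolic link (#10547);
                defaults to ~/.config
    uid:        uid to compare against; defaults to the current user
    """
    config_dir = Path(config_dir) if config_dir is not None else Path.home() / ".config"
    uid = os.getuid() if uid is None else uid
    warnings = []

    dumps = Path(dumps_dir)
    # A missing dumps directory is fine; only a foreign owner is a problem.
    if dumps.exists():
        owner = dumps.stat().st_uid
        if owner != uid:
            warnings.append(f"{dumps} is owned by uid {owner}, not {uid}")

    # is_symlink() inspects the path itself, not the target it points to.
    if config_dir.is_symlink():
        warnings.append(f"{config_dir} is a symbolic link")

    return warnings
```

Calling `check_startup_paths()` with no arguments inspects `/tmp/dumps` and `~/.config` for the current user; an empty list means neither condition was found.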

From your Flatpak log, I don't recognise these messages, so they might be distinctive/unusual/relevant:

[0304/200629.793041:WARNING:value_store_frontend.cc(47)] Reading mhjfbmdgcfjbbpaeojofohoefgiehjai.alarms from  failed: IO error: .../LOCK: File currently in use. (ChromeMethodBFE: 15::LockFile::2)
g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting.

The last one might perhaps indicate a problem with D-Bus?

smcv commented 7 months ago

Unfortunately I don't see anything particularly interesting or distinctive in the non-Flatpak set of logs.

If Steam uploaded any crash reports (you would see CrashID=something in its output on the terminal), it could be helpful to copy/paste the CrashID here so that a Valve developer can look up details of the crash.
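If the terminal output has been saved to a file, the IDs can be pulled out mechanically. This is a small sketch that assumes only the `CrashID=something` shape mentioned above; the helper name `find_crash_ids` is hypothetical.

```python
import re

# Matches "CrashID=" followed by everything up to the next whitespace.
CRASH_ID_RE = re.compile(r"CrashID=(\S+)")


def find_crash_ids(output):
    """Return every CrashID value found in the given text, in order."""
    return CRASH_ID_RE.findall(output)
```

For example, `find_crash_ids(open("steam-output.txt").read())` would list the IDs from a saved log, where `steam-output.txt` is a placeholder filename.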

Slabity commented 7 months ago

One thing to check is that you don't have a /tmp/dumps owned by some other uid (https://github.com/ValveSoftware/steam-for-linux/issues/10549).

Yea, I cleared out all of /tmp and can confirm that /tmp/dumps (and all its contents) is owned by my user.

Another is to check that your ~/.config is not a symbolic link (https://github.com/ValveSoftware/steam-for-linux/issues/10547).

I have some symlinks within ~/.config, but the folder itself is normal.

From your Flatpak log, I don't recognise these messages, so they might be distinctive/unusual/relevant:

Actually that might be an artifact from when I force-restarted it to generate new logs. Here's what happens after a fresh delete of .var/app/com.valvesoftware.Steam and then running flatpak run com.valvesoftware.Steam:

steamwebhelper.sh[5294]: === Mon Mar  4 08:45:59 PM America 2024 ===
steamwebhelper.sh[5294]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
steam-runtime-sniper.sh[5294]: Extracting /home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper.tar.xz to /home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper.new
/bin/bash: /usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libtinfo.so.6: no version information available (required by /bin/bash)
exec ./steamwebhelper --no-sandbox -lang=en_US -cachedir=/home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/config/htmlcache -steampid=5261 -buildid=1709168962 -steamid=0 -logdir=/home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/logs -uimode=7 -startcount=0 -steamuniverse=Public -realm=Global -clientui=/home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/clientui -steampath=/home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_32/steam -launcher=0 -no-restart-on-ui-mode-change --enable-smooth-scrolling --no-sandbox --password-store=basic --log-file=/home/slabity/.var/app/com.valvesoftware.Steam/.local/share/Steam/logs/cef_log.txt --disable-quick-menu --disable-features=DcheckIsFatal
[0304/204720.759081:ERROR:context.cc(100)] The browser_subprocess_path directory (./steamwebhelper) is not an absolute path. Defaulting to empty.
[0304/204720.815339:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0304/204720.815381:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1709167136
[0304/204720.815384:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0304/204720.815387:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0304/204720.815390:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0304/204720.815807:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: browser
[0304/204720.816760:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0304/204720.817952:WARNING:task_impl.cc(32)] No task runner for threadId 0
[0304/204720.844473:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0304/204720.844523:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1709168962
[0304/204720.844526:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0304/204720.844529:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0304/204720.844532:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0304/204720.844882:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0304/204720.844930:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1709168962
[0304/204720.844933:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0304/204720.844937:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0304/204720.844940:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0304/204721.109423:INFO:crash_reporting.cc(262)] Crash reporting enabled for process: gpu-process
[0304/204721.176100:WARNING:sandbox_linux.cc(385)] InitializeSandbox() called with multiple threads in process gpu-process.
[0304/204721.294223:WARNING:crash_reporting.cc(278)] Failed to set crash key: UserID with value: 0
[0304/204721.294269:WARNING:crash_reporting.cc(278)] Failed to set crash key: BuildID with value: 1709168962
[0304/204721.294272:WARNING:crash_reporting.cc(278)] Failed to set crash key: SteamUniverse with value: Public
[0304/204721.294275:WARNING:crash_reporting.cc(278)] Failed to set crash key: Vendor with value: Valve
[0304/204721.294278:WARNING:crash_reporting.cc(278)] Failed to set crash key: Platform with value: Linux
[0304/204721.295057:INFO:crash_reporting.cc(239)] Crash reporting enabled for process: utility

If Steam uploaded any crash reports (you would see CrashID=something in its output on the terminal), it could be helpful to copy/paste the CrashID here so that a Valve developer can look up details of the crash.

assert_20240304204625_30.dmp[5494]: response: CrashID=bp-1e6bc8f1-6853-4858-b632-158862240304

Slabity commented 7 months ago

the error message shows up before the Linux runtime finishes unpacking

If it's taking a long(ish) time to unpack the runtime, then this is probably just the Steam client being impatient and not waiting long enough for the steamwebhelper to be available.

Okay, this is interesting because this gave me an idea to mount a tmpfs at /home/slabity/.local/share/Steam and have Steam try to install it in memory.

And it worked.

So I think there's a newly introduced race condition somewhere in the new update and that it's not able to load the runtime fast enough before Steam decides to throw an error condition?

I think I need to figure out why my home disk is going slow (3 NVMe drives in RAID0 should not be slow). But also I think something is wrong with the new update because I was not having this issue before Feb 29th.

Slabity commented 7 months ago

Okay, I think I found the issue if anyone else is running into a similar problem:

My filesystem was going too slow to unpack the Steam runtime at a reasonable speed (in my case, I was using bcachefs and there was a performance issue that is now fixed in kernel 6.8).

If you are running into this problem and the "Steamwebhelper is not responding" window comes up just before the runtime unpacking message finishes, then try the following:

  1. Back up $HOME/.local/share/Steam by moving it somewhere safe.
  2. Recreate the empty directory and mount a tmpfs on it: mkdir -p "$HOME/.local/share/Steam" && sudo mount -t tmpfs -o size=8G tmpfs "$HOME/.local/share/Steam"
  3. Ensure your user is the owner of that folder (and not root): sudo chown "$(id -un):$(id -gn)" "$HOME/.local/share/Steam"
  4. Try to start Steam and see if it succeeds.

If it succeeds, then the likely cause is that your filesystem or storage is too slow and Steam treats the delay as a failure.
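Before starting Steam, it can be worth confirming that the tmpfs really is mounted at the install path. The sketch below parses text in the `/proc/self/mounts` format (device, mount point, type, options); the function name `filesystem_type` is hypothetical, and for simplicity it ignores the octal escapes the kernel uses for mount points containing spaces.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in the format of /proc/self/mounts. The longest
    matching mount point wins; on a tie, the later line wins, because a
    later mount covers an earlier one at the same mount point.
    """
    path = os.path.abspath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        # Match the mount point itself or anything below it.
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fstype
    return best_type
```

On a live system, `filesystem_type(os.path.expanduser("~/.local/share/Steam"), open("/proc/self/mounts").read())` should report `tmpfs` once the mount is in place.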

@smcv - Thanks for the help. It looks like a lot of people have been having similar errors recently, so it might be worth it for someone at Valve to check for some sort of race condition there? At the very least the above test might help some people if the logs aren't being too helpful.

EDIT: Also I can use the Beta now, which was not possible for the past few months as I was getting the same error.

smcv commented 7 months ago

My filesystem was going too slow to unpack the Steam runtime at a reasonable speed (in my case, I was using bcachefs and there was a performance issue that is now fixed in kernel 6.8).

Oh, if it's just being slow and the Steam UI eventually starts successfully, then you could ignore the "Steamwebhelper is not responding" dialog and (eventually) continue to use Steam.

It looks like a lot of people have been having similar errors recently

Most of the other recent issue reports about this have been about things that either cause the steamwebhelper to crash immediately after it starts, or cause the container runtime to fail to start at all (in which case we never actually get far enough to run the steamwebhelper). Steam automatically restarts the steamwebhelper, but if it crashes every time, then it will just restart in a loop; the result is that same dialog, but with no other UI. That's considerably worse than it just being slow to start. I had misread your issue report as being the same symptom.

some sort of race condition

I think you might be assuming this is more complicated than it actually is. As far as I know, Steam is not aware of whether progress is being made in starting the steamwebhelper, it just has a timeout for "surely it should have worked by now?" after which it pops up this dialog. The choice of how long to wait is essentially arbitrary, and there's no ideal answer: if it's too short, then people whose systems are just being slow (like you) will see it when perhaps you shouldn't, but if it's too long, then people with a more serious issue that blocks use of Steam altogether (like the steamwebhelper crashing repeatedly) will not get any feedback for a while.

The container runtime and the steamwebhelper are both rather complicated, but the mechanism for detecting that there's a problem with it, not so much :-)
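To illustrate the kind of mechanism described here, the following is a purely hypothetical sketch of a fixed deadline with no progress tracking. It is not Steam's implementation, which is proprietary; the names `start_with_deadline` and `start_helper` are invented for the example.

```python
import threading
import time


def start_with_deadline(start_helper, timeout_seconds):
    """Run `start_helper` in the background and wait a fixed time for it.

    The waiter only learns "finished" or "not finished yet". It has no view
    of how much progress has been made, so a slow start and a hung start
    look the same once the deadline passes.
    """
    ready = threading.Event()

    def worker():
        start_helper()
        ready.set()

    threading.Thread(target=worker, daemon=True).start()
    if ready.wait(timeout_seconds):
        return "started"
    return "not responding"
```

With `start_helper=lambda: time.sleep(0.5)` and a 0.05 second timeout, the result is "not responding" even though the helper would have finished shortly afterwards, which is the trade-off in choosing the timeout.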

Slabity commented 7 months ago

Oh, if it's just being slow and the Steam UI eventually starts successfully, then you could ignore the "Steamwebhelper is not responding" dialog and (eventually) continue to use Steam.

The issue is that it didn't start eventually. The error message came up but Steam itself never gets past that point. Even after over an hour of leaving it that way.

One thing I didn't mention was that Steam did successfully start up once or twice out of the couple of dozen attempts I made (and was very quick), but certain things were broken: I could not load my Friends List, switch out of offline mode, or sync my cloud data, and my profile in the top-right was just a question mark. I probably should have mentioned these symptoms, but that was before I cleared out my Steam install and tried to go fresh and minimize the scope of the problem. I'm not experiencing those issues anymore either.

Also, all of those symptoms started with the Beta a little over a month ago for me (I think end of January). I left the Beta and that fixed the problem temporarily, but with the new update on Feb 29th the problem came back.

I think you might be assuming this is more complicated than it actually is.

Sorry, I meant "race condition" in the general sense of Steam checking something before it's ready and then returning a fail state (the timeout you mention). Probably not the best use of the word, but your description was my intention.

Anyway, while the performance issue on my system was definitely causing (or at least revealing) the issue, what concerns me is that the performance wasn't actually bad enough, in my opinion, for Steam to have crashed. I regularly had >1GB/s read/write speeds and responsiveness was pretty good despite the kernel issue. And the "Unpacking Steam Linux Runtime container" step was probably only ~2x faster after the fix, which in terms of SSD speed is not very significant.

But in any case, I don't think there's much more useful information that I can provide. I definitely don't think all or even most reported crashes are due to the same problem, but it looks like some reported crashes (namely #10431) might be having a similar underlying issue if switching the install location to a different/local/faster drive fixed their problem.

Thank you for your help debugging this. Your comment implying there's a timeout was probably the only indication of the underlying issue. If there's any further information that I can provide, let me know.

radiant-knight commented 7 months ago

Happened to me on Gentoo running steam under swaywm, but works fine on Plasma wayland

smcv commented 7 months ago

@radiant-knight:

Happened to me on Gentoo running steam under swaywm, but works fine on Plasma wayland

There are lots of reasons why steamwebhelper might crash or not work, which all have the same user-visible symptom, but for different reasons. Unless you have specific evidence, we can't assume that you are seeing this for the same reason as the original reporter of this particular issue. Given your mention of this not working in Sway but working in a more integrated environment, my first guess would be #10554.

Or, if it's not that, then please check the Steam logs for more information (Flatpak users: ~/.var/app/com.valvesoftware.Steam/.local/share/Steam/logs, Snap users: ~/snap/steam/common/.local/share/Steam/logs, everyone else: ~/.steam/root/logs). steamwebhelper.log is likely to be the most important log file.
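The three locations above can be checked in one pass. This is a minimal sketch; the helper name `find_webhelper_logs` is hypothetical, and the paths are exactly the ones listed in this comment, relative to the home directory.

```python
from pathlib import Path

# Log directories relative to the home directory, as listed above.
LOG_DIRS = (
    ".var/app/com.valvesoftware.Steam/.local/share/Steam/logs",  # Flatpak
    "snap/steam/common/.local/share/Steam/logs",  # Snap
    ".steam/root/logs",  # everyone else
)


def find_webhelper_logs(home=None):
    """Return the steamwebhelper.log files that exist under `home`."""
    home = Path(home) if home is not None else Path.home()
    found = []
    for relative in LOG_DIRS:
        candidate = home / relative / "steamwebhelper.log"
        if candidate.is_file():
            found.append(candidate)
    return found
```

Calling `find_webhelper_logs()` with no arguments searches the current user's home directory and may return more than one path if several Steam installations exist.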

If you think you might be encountering a different root cause for this crash, please report a new issue with full details.

If your symptoms and logs match symptoms and logs seen by other users, you can subscribe to an existing bug report without adding comments by using the "Notifications" panel. You can help Steam developers to solve problems like this more quickly by only adding comments if they have new information that could help to find a solution.

manujchandra commented 6 months ago

Steam was working fine a couple of days ago. Now this is happening on Debian 12 (Flatpak steam):

[Image: Screenshot_2024-04-11_07-54-31]

smcv commented 6 months ago

Now this is happening

A picture of an error message is not enough information to be useful to a Steam or Steam Runtime developer. Please collect logs as described in https://github.com/ValveSoftware/steam-for-linux/issues/10581#issuecomment-1980795479, and if they do not match any existing issue, open a separate issue.

manujchandra commented 6 months ago

The contents of steamwebhelper.log are as follows:

steamwebhelper.sh[281]: === Saturday 13 April 2024 10:38:40 AM IST ===
steamwebhelper.sh[281]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
pressure-vessel-wrap[301]: E: Could not create copy "./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc" from "/home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240307.80401/files/./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc" into "/home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-9SG9L2/usr": fstatat(./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc): No such file or directory

Slabity commented 6 months ago

@manujchandra - Just to confirm, the issue I reported here is due to the 2024 Feb 29th Steam update being unable to deal with slow FS/HDD/SSD IO. Please confirm using the steps here to determine if your issue is identical to the one I reported: https://github.com/ValveSoftware/steam-for-linux/issues/10581#issuecomment-1977511684

@smcv - Can you please confirm if the issue related to slow FS/HDD/SSD IO has been identified and resolved?

I want to be very clear: there was a bug/regression in the 2024 Feb 29th Steam update that has caused this issue to pop up on my end. The fact that a kernel update improved I/O performance for my specific filesystem does not mean this issue was fixed; it still needs to be identified upstream in the Steam subsystems.

This is an issue in Steam regardless of whether a change in one of the kernel's filesystems fixed it on my end. I believe Steam should still be able to function in an environment where disk speed is slower than expected.

manujchandra commented 5 months ago

I uninstalled steam and deleted the flatpak folder. After reinstalling, Steam seems to be working again. I did not try the ramdisk method though.

smcv commented 5 months ago

The contents of steamwebhelper.log are as follows:

steamwebhelper.sh[281]: === Saturday 13 April 2024 10:38:40 AM IST ===
steamwebhelper.sh[281]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
pressure-vessel-wrap[301]: E: Could not create copy "./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc" from "/home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/sniper_platform_0.20240307.80401/files/./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc" into "/home/manuj/.var/app/com.valvesoftware.Steam/.local/share/Steam/ubuntu12_64/steam-runtime-sniper/var/tmp-9SG9L2/usr": fstatat(./lib/python3.9/concurrent/__pycache__/__init__.cpython-39.pyc): No such file or directory

This looks like a duplicate of #10614, and not the same I/O-performance-related issue that others on this issue have been talking about. A mitigation for #10614 is in progress and is likely to be in a future beta.

smcv commented 5 months ago

Can you please confirm if the issue related to slow FS/HDD/SSD IO has been identified and resolved?

To the best of my knowledge, there has not been any specific root cause identified or solution found for the steamwebhelper failing to start on systems with slow I/O. But I am not able to diagnose or fix anything inside the Steam client itself, only the Steam Runtime - I do not have any more access to the proprietary internals of the Steam client than you do!

@Slabity or @kisak-valve, please could you retitle this issue to indicate that its scope is specifically systems with slow I/O, so that we don't get the issue mixed up with any other reason why steamwebhelper might fail to start? Every time one of these issues gets entangled with someone else seeing similar symptoms for a different reason, it slows down the process of finding a solution for any of the issues that lead to similar symptoms.

This is an issue in Steam regardless of whether a change in one of the kernel's filesystems fixed it on my end. I believe Steam should still be able to function in an environment where disk speed is slower than expected.

I'm sure, but please don't shoot the messenger. Knowing that there is some sort of problem involving slow I/O doesn't mean that anyone can identify a root cause.

As far as I know, the message "steamwebhelper is not responding" appears after an arbitrary timeout has elapsed with no communication from the steamwebhelper, so if your system's I/O is so slow that it takes longer than the arbitrary timeout to start up the Steam Linux Runtime container, that message is always going to appear.

smcv commented 5 months ago

while the performance issue on my system was definitely causing (or at least revealing) the issue, what concerns me is that the performance wasn't actually that bad in my opinion for Steam to have crashed

It's worth mentioning that there's performance and there's performance. Many people only measure disk performance in terms of throughput (best-case sequential I/O bandwidth), but bringing up the Steam Linux Runtime container involves quite a lot of metadata operations (hard links, file deletion, setting permissions), and those can have very different performance characteristics - for those, random-access I/O latency is going to be the limiting factor. This is particularly noticeable on NFS and similar networked filesystems, which often have surprisingly good throughput but bad latency. I could imagine that some on-disk filesystems would have similar characteristics, although hopefully to a less dramatic extent than with NFS.
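A rough way to compare metadata latency between two locations is to time a loop of the operations named above. This is a sketch under stated assumptions: the function name `time_metadata_ops` is hypothetical, it does not reproduce what the container runtime actually does, and because nothing is synced to disk the result partly reflects caching, so it is only useful for relative comparison.

```python
import os
import tempfile
import time


def time_metadata_ops(directory, count=1000):
    """Return the seconds taken by `count` rounds of link, chmod and unlink.

    All work happens in a temporary subdirectory of `directory`, which is
    removed again before the function returns.
    """
    with tempfile.TemporaryDirectory(dir=directory) as work:
        source = os.path.join(work, "source")
        with open(source, "w") as handle:
            handle.write("x")

        start = time.perf_counter()
        for index in range(count):
            link = os.path.join(work, f"link-{index}")
            os.link(source, link)   # create a hard link
            os.chmod(link, 0o644)   # set permissions
            os.unlink(link)         # delete the link
        return time.perf_counter() - start
```

Running it once against a directory on the suspect filesystem and once against a tmpfs gives two numbers to compare; a large gap there can coexist with good sequential throughput.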

Slabity commented 5 months ago

I'm sure, but please don't shoot the messenger. Knowing that there is some sort of problem involving slow I/O doesn't mean that anyone can identify a root cause.

I apologize. Reading my message back a few days later, my response came off as a lot more aggressive than I intended. The wording and emphasis were meant to convey that I believed there was still an unsolved issue, and I didn't want it to seem that my own resolution to the problem meant that the root cause no longer exists.

To be clear, I am literally using the unstable version of a niche Linux distro installed on top of a newly released filesystem as a part of the kernel's non-stable release-candidate branches. I am very comfortable with things breaking on my end.

It's worth mentioning that there's performance and there's performance. Many people only measure disk performance in terms of throughput (best-case sequential I/O bandwidth), but bringing up the Steam Linux Runtime container involves quite a lot of metadata operations (hard links, file deletion, setting permissions), and those can have very different performance characteristics - for those, random-access I/O latency is going to be the limiting factor. This is particularly noticeable on NFS and similar networked filesystems, which often have surprisingly good throughput but bad latency. I could imagine that some on-disk filesystems would have similar characteristics, although hopefully to a less dramatic extent than with NFS.

I can confirm that I was getting very good sequential read/write speeds even with the bcachefs performance issues in effect. I tested that immediately after you mentioned the possibility of Steam not waiting long enough for the steamwebhelper process to fully come up and was measuring over 3gbps on the 3x RAID0 NVMe drives that I was using. So you're definitely correct about it being a metadata (or at least non-sequential IO) issue.

I still have a snapshot of my system running version 6.6 that struggles with Steam starting up. I'm not sure if I can provide any further detailed information than the logs I originally posted, but if there's anything I can test to help determine the underlying cause then let me know. The only thing I know for certain is that there was no issue on the stable Steam release prior to the Feb 29th release.