ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

CPU ISA level is lower than required aka Steamwebhelper is not responding on Solus #10556

Open mazirah opened 8 months ago

mazirah commented 8 months ago

Your system information

System Details Report


Report details

Hardware Information:

Software Information:

Error on startup:

steam.sh[42145]: Running Steam on solus 4.5 64-bit
steam.sh[42145]: STEAM_RUNTIME is enabled by the user
setup.sh[42212]: Steam runtime environment up-to-date!
steam.sh[42145]: Steam client's requirements are satisfied
tid(42270) burning pthread_key_t == 0 so we never use it
[2024-02-29 20:19:53] Startup - updater built Feb 29 2024 00:39:10
[2024-02-29 20:19:53] Startup - Steam Client launched with: '/home/mazirah/.local/share/Steam/ubuntu12_32/steam'
02/29 20:19:53 Init: Installing breakpad exception handler for appid(steam)/version(1709168962)/tid(42270)
[2024-02-29 20:19:53] Loading cached metrics from disk (/home/mazirah/.local/share/Steam/package/steam_client_metrics.bin)
[2024-02-29 20:19:53] Failed to load cached hosts file (File 'update_hosts_cached.vdf' not found), using defaults
[2024-02-29 20:19:53] Using the following download hosts for Public, Realm steamglobal
[2024-02-29 20:19:53] 1. https://cdn.steamstatic.com, /client/, Realm 'steamglobal', weight was 1, source = 'baked in'
[2024-02-29 20:19:53] Verifying installation...
[2024-02-29 20:19:53] Verification complete
UpdateUI: skip show logo
Steam logging initialized: directory: /home/mazirah/.local/share/Steam/logs

/home/mazirah/.themes/Adwaita-dark/gtk-2.0/main.rc:733: error: unexpected identifier 'direction', expected character '}'
/home/mazirah/.themes/Adwaita-dark/gtk-2.0/hacks.rc:28: error: invalid string constant "normal_entry", expected valid string constant
XRRGetOutputInfo Workaround: initialized with override: 0 real: 0xf1632e00
XRRGetCrtcInfo Workaround: initialized with override: 0 real: 0xf1631680
steamwebhelper.sh[42283]: === Thu Feb 29 08:19:54 PM EET 2024 ===
steamwebhelper.sh[42283]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/mazirah/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
CAppInfoCacheReadFromDiskThread took 0 milliseconds to initialize
Steam Runtime Launch Service: starting steam-runtime-launcher-service
Steam Runtime Launch Service: steam-runtime-launcher-service is running pid 42362
bus_name=com.steampowered.PressureVessel.LaunchAlongsideSteam
steamwebhelper.sh[42494]: === Thu Feb 29 08:20:04 PM EET 2024 ===
steamwebhelper.sh[42494]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/mazirah/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
steamwebhelper.sh[42636]: === Thu Feb 29 08:20:14 PM EET 2024 ===
steamwebhelper.sh[42636]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/mazirah/.local/share/Steam/ubuntu12_64/steam-runtime-sniper
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context
02/29 20:20:16 Init: Installing breakpad exception handler for appid(steam)/version(1709168962)/tid(42270)
assert_20240229202016_26.dmp[42741]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240229202016_26.dmp
assert_20240229202016_26.dmp[42741]: Finished uploading minidump (out-of-process): success = yes
assert_20240229202016_26.dmp[42741]: response: CrashID=bp-e746162a-612a-4d4d-825c-9a6922240229
assert_20240229202016_26.dmp[42741]: file ''/tmp/dumps/assert_20240229202016_26.dmp'', upload yes: ''CrashID=bp-e746162a-612a-4d4d-825c-9a6922240229''
[2024-02-29 20:20:25] Shutdown

Here are the contents of steamwebhelper.log

steamwebhelper.sh[32731]: === Thu Feb 29 07:34:03 PM EET 2024 ===
steamwebhelper.sh[32731]: Starting steamwebhelper under bootstrap sniper steam runtime at /home/mazirah/.local/share/Steam/ubuntu12_>
/usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu/libc.so.6: CPU ISA level is lower than required
  1. Start steam

smcv commented 8 months ago

Please try running Steam with STEAM_LINUX_RUNTIME_VERBOSE=1 in the environment (for example STEAM_LINUX_RUNTIME_VERBOSE=1 steam), and collect the resulting steamwebhelper.log (it will be much larger).

mazirah commented 8 months ago

I've run the verbose command; here are the new logs: steam-logs.tar.gz. I've updated the issue as well.

smcv commented 8 months ago

This is happening because Solus installs multiple builds of the same libraries, some compiled for a CPU newer than yours and some not. This is unusual: most distributions target a single baseline architecture, don't support anything older at all, and don't build any libraries that require anything newer either. Clear Linux is the other distribution most likely to be affected.

Normally, the libraries that require a newer CPU are automatically skipped, but the Steam Linux Runtime container framework does not know how to do that, and is using the first implementation that it finds for each library. Unfortunately, in your case, the first implementation it finds is the one for x86_64 v3 (approximately Intel Haswell or later, circa 2013), and your CPU is older than that, so the x86_64 v3 libraries won't work.
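
To check which level a given CPU actually reaches, here is a minimal sketch. It assumes an x86-64 Linux system and the flag names that the kernel prints in /proc/cpuinfo (for example, pni for SSE3 and abm for LZCNT). The helpers has_all and isa_level are hypothetical names introduced for illustration; they are not part of Steam or glibc.

```shell
#!/usr/bin/env bash
# Sketch: report the highest x86-64 microarchitecture level implied by a
# space-separated list of CPU flags as printed in /proc/cpuinfo.
# has_all and isa_level are illustrative helper names, not an existing tool.

# has_all FLAGS NAME... : succeed if every NAME occurs as a whole word in FLAGS.
has_all() {
    local flags=" $1 " name
    shift
    for name in "$@"; do
        case "$flags" in
            *" $name "*) ;;
            *) return 1 ;;
        esac
    done
}

# isa_level FLAGS : print x86-64-v1 (baseline) up to x86-64-v4.
isa_level() {
    local flags="$1" level=1
    if has_all "$flags" cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3; then
        level=2
        if has_all "$flags" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; then
            level=3
            if has_all "$flags" avx512f avx512bw avx512cd avx512dq avx512vl; then
                level=4
            fi
        fi
    fi
    echo "x86-64-v$level"
}

# Example: classify the first CPU listed on this machine, if it has a flags line.
flags="$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
if [ -n "$flags" ]; then
    isa_level "$flags"
fi
```

A result below x86-64-v3 means the libraries under /usr/lib64/glibc-hwcaps/x86-64-v3 cannot run on that CPU.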

A temporary workaround might be to move /usr/lib64/glibc-hwcaps/x86-64-v3 out of the way on affected systems.

According to https://discuss.getsol.us/d/10152-solus-5-and-x86-64-v3-target, at some point in the future ("When we rebase off of Serpent"), Solus is going to switch to a different approach which will probably avoid this issue as a side-effect.

ermo commented 8 months ago

> Normally, the libraries that require a newer CPU are automatically skipped, but the Steam Linux Runtime container framework does not know how to do that, and is using the first implementation that it finds for each library.

@smcv ... but we (Solus) are just using bog standard glibc hardware caps? Wouldn't it be best if the Steam Linux Runtime container framework was taught about those sooner rather than later...?

If I'm missing something obvious here, please feel free to enlighten me -- always happy to learn more.

smcv commented 8 months ago

> Wouldn't it be best if the Steam Linux Runtime container framework was taught about those sooner rather than later...?

Of course it would, but writing code takes longer than triaging issue reports.

ermo commented 8 months ago

> Wouldn't it be best if the Steam Linux Runtime container framework was taught about those sooner rather than later...?
>
> @smcv: Of course it would, but writing code takes longer than triaging issue reports.

Would you be open to a PR...?

cc. @ikeycode

smcv commented 8 months ago

> Would you be open to a PR...?

Sure. We can't technically accept merge requests at the moment, because the relevant code is on a GitLab instance that is not open to external users, but the next best thing is to have a branch of the same repository hosted in some public location (GitHub, gitlab.com, anywhere else suitable) and give us a reference that we can git fetch for review.

I think search_ldcache_cb() in https://gitlab.collabora.com/vivek/libcapsule/-/blob/master/utils/ld-libs.c?ref_type=heads needs to be taught to match libraries with non-trivial hwcaps against the CPU's actual capabilities, and skip libraries where the hwcaps are too high. https://gitlab.collabora.com/vivek/libcapsule/-/blob/master/utils/ld-cache.c?ref_type=heads might also be relevant.

(The production version of this code as used in the Steam Linux Runtime is vendored into https://gitlab.steamos.cloud/steamrt/steam-runtime-tools, but libcapsule is its canonical upstream location.)
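
The selection rule described above can be sketched as follows. This is not libcapsule code and does not read ld.so.cache. As a simplifying assumption, it infers the required level from a glibc-hwcaps/x86-64-vN path component, whereas the real cache records the required hwcaps per entry. The names required_level and usable_libs are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the selection rule only: skip any library whose glibc-hwcaps
# directory names a higher x86-64 level than the CPU supports.
# Assumption: the required level is inferred from the path, not from the
# hwcaps information stored in the binary ld.so.cache.

# required_level PATH : print N for .../glibc-hwcaps/x86-64-vN/..., else 1.
required_level() {
    local rest
    case "$1" in
        */glibc-hwcaps/x86-64-v[2-4]/*)
            rest="${1#*/glibc-hwcaps/x86-64-v}"
            echo "${rest%%/*}"
            ;;
        *)
            echo 1
            ;;
    esac
}

# usable_libs CPU_LEVEL : read library paths on stdin, print the usable ones.
usable_libs() {
    local cpu_level="$1" path
    while IFS= read -r path; do
        if [ "$(required_level "$path")" -le "$cpu_level" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example: a CPU at level 2 must not be given the x86-64-v3 copy of libc.
printf '%s\n' \
    /usr/lib64/glibc-hwcaps/x86-64-v3/libc.so.6 \
    /usr/lib64/libc.so.6 | usable_libs 2
```

With a CPU level of 2, only /usr/lib64/libc.so.6 is printed. Taking the first entry without this check is what selects the x86_64 v3 copy on affected systems.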

smcv commented 8 months ago

Tracked as steamrt/tasks#410 internally

7heo commented 7 months ago

> A temporary workaround might be to move /usr/lib64/glibc-hwcaps/x86-64-v3 out of the way on affected systems.

~I can confirm that sudo mv /usr/lib64/glibc-hwcaps/x86-64-v3 /usr/lib64/glibc-hwcaps/x86-64-v3.disabled worked on an affected system.~

A few days and a reboot later, this quick fix no longer worked, and I was back to square one. After a little head scratching, I figured that, one way or another, some binaries (probably the Steam ones) that also require AVX2, AFAICT, still took precedence over the system ones. After checking the link https://discuss.getsol.us/d/10152-solus-5-and-x86-64-v3-target from the messages above, I noticed that running ld.so --help shows the list of supported glibc-hwcaps.

So here is the solution I came up with:

  1. sudo mkdir -p /usr/lib64/glibc-hwcaps/x86-64-v2/engines-3/
  2. sudo mkdir /usr/lib64/glibc-hwcaps/x86-64-v2/ossl-modules/
  3. find /usr/lib64/glibc-hwcaps/x86-64-v3 -type l -exec sudo cp -P {} /usr/lib64/glibc-hwcaps/x86-64-v2/ \;
  4. Link the missing "files" from /usr/lib64

The result should be:

/usr/lib64/glibc-hwcaps/x86-64-v2/
├── engines-3
│   ├── afalg.so -> /usr/lib64/engines-3/afalg.so
│   ├── capi.so -> /usr/lib64/engines-3/capi.so
│   ├── loader_attic.so -> /usr/lib64/engines-3/loader_attic.so
│   └── padlock.so -> /usr/lib64/engines-3/padlock.so
├── libaom.so.3 -> libaom.so.3.8.2
├── libaom.so.3.8.2 -> /usr/lib64/libaom.so.3.8.2
├── libcrypto.so.3 -> /usr/lib64/libcrypto.so.3
├── libcrypt.so.1 -> libcrypt.so.1.1.0
├── libcrypt.so.1.1.0 -> /usr/lib64/libcrypt.so.1.1.0
├── libcrypt.so.2 -> libcrypt.so.2.0.0
├── libcrypt.so.2.0.0 -> /usr/lib64/libcrypt.so.2.0.0
├── libc.so.6 -> /usr/lib64/libc.so.6
├── libdav1d.so.7 -> libdav1d.so.7.0.0
├── libdav1d.so.7.0.0 -> /usr/lib64/libdav1d.so.7.0.0
├── libfftw3f_omp.so.3 -> libfftw3f_omp.so.3.6.10
├── libfftw3f_omp.so.3.6.10 -> /usr/lib64/libfftw3f_omp.so.3.6.10
├── libfftw3f.so.3 -> /usr/lib64/libfftw3f.so.3
├── libfftw3f.so.3.6.10 -> /usr/lib64/libfftw3f.so.3.6.10
├── libfftw3f_threads.so.3 -> libfftw3f_threads.so.3.6.10
├── libfftw3f_threads.so.3.6.10 -> /usr/lib64/libfftw3f_threads.so.3.6.10
├── libfftw3_omp.so.3 -> libfftw3_omp.so.3.6.10
├── libfftw3_omp.so.3.6.10 -> /usr/lib64/libfftw3_omp.so.3.6.10
├── libfftw3.so.3 -> libfftw3.so.3.6.10
├── libfftw3.so.3.6.10 -> /usr/lib64/libfftw3.so.3.6.10
├── libfftw3_threads.so.3 -> libfftw3_threads.so.3.6.10
├── libfftw3_threads.so.3.6.10 -> /usr/lib64/libfftw3_threads.so.3.6.10
├── libFLAC.so.12 -> libFLAC.so.12.1.0
├── libFLAC.so.12.1.0 -> /usr/lib64/libFLAC.so.12.1.0
├── libgraphene-1.0.so.0 -> libgraphene-1.0.so.0.1000.8
├── libgraphene-1.0.so.0.1000.8 -> /usr/lib64/libgraphene-1.0.so.0.1000.8
├── libm.so.6 -> /usr/lib64/libm.so.6
├── libmvec.so.1 -> /usr/lib64/libmvec.so.1
├── libpng16.so.16 -> libpng16.so.16.43.0
├── libpng16.so.16.43.0 -> /usr/lib64/libpng16.so.16.43.0
├── libraw_r.so.23 -> libraw_r.so.23.0.0
├── libraw_r.so.23.0.0 -> /usr/lib64/libraw_r.so.23.0.0
├── libraw.so.23 -> libraw.so.23.0.0
├── libraw.so.23.0.0 -> /usr/lib64/libraw.so.23.0.0
├── libssl.so.3 -> /usr/lib64/libssl.so.3
├── libvpx.so.8 -> libvpx.so.8.0.1
├── libvpx.so.8.0 -> libvpx.so.8.0.1
├── libvpx.so.8.0.1 -> /usr/lib64/libvpx.so.8.0.1
├── libwebp.so.7 -> libwebp.so.7.1.8
├── libwebp.so.7.1.8 -> /usr/lib64/libwebp.so.7.1.8
├── libz.so.1 -> libz.so.1.3.1
├── libz.so.1.3.1 -> /usr/lib64/libz.so.1.3.1
└── ossl-modules
    └── legacy.so -> /usr/lib64/ossl-modules/legacy.so

After that, Steam starts.
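
Step 4 above is left open. The following dry-run sketch shows one reading of it, assuming "missing files" means the regular (non-symlink) files under x86-64-v3, each replaced by a link to the same relative path under /usr/lib64. The helper plan_links is hypothetical; it only prints the ln commands and changes nothing on disk.

```shell
#!/usr/bin/env bash
# Dry run only: print the symlinks that step 4 would need, without creating them.
# Assumption: every regular file under SRC should become a link in DST that
# points at the file with the same relative path under BASE.

# plan_links SRC DST BASE : pass directory paths without a trailing slash.
plan_links() {
    local src="$1" dst="$2" base="$3" file rel
    find "$src" -type f | sort | while IFS= read -r file; do
        rel="${file#"$src"/}"
        # Skip files that have no baseline counterpart to link to.
        if [ -e "$base/$rel" ]; then
            printf 'ln -s %s %s\n' "$base/$rel" "$dst/$rel"
        fi
    done
}

# Example: show the plan for the directories used in this comment.
src=/usr/lib64/glibc-hwcaps/x86-64-v3
if [ -d "$src" ]; then
    plan_links "$src" /usr/lib64/glibc-hwcaps/x86-64-v2 /usr/lib64
fi
```

Review the printed commands against the tree above before running any of them with sudo.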

ReillyBrogan commented 4 months ago

For anyone having this issue on Solus, please give the below a try:

  1. Create the file /etc/environment if it does not already exist and add the following to it:
    GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX
  2. Reboot
  3. Test to see if things work.

Functionally, this environment variable configures the glibc dynamic loader to ignore all libraries built with AVX support (basically everything in the glibc-hwcaps directory), which should fix this issue if Steam is parsing the ld.so output to determine which libraries to pull into the container.

Note that it may also slow down other applications on the system if they do CPU-level feature detection in their own code (mostly media and crypto libs), though probably not to any noticeable degree.

smcv commented 4 months ago

It would be useful information if someone tries that, but I'll warn you now that the workaround in the previous comment probably isn't going to work, because the container runtime infrastructure involves parsing the binary ld.so.cache directly.

The older LD_LIBRARY_PATH runtime did work by screen-scraping ldconfig output, and that would maybe have taken into account GLIBC_TUNABLES (?), but we were never very happy about that, because getting machine-readable information out of human-readable diagnostic output is really fragile.

I am not aware of anything in Steam that parses ld.so output, but perhaps you meant ldconfig anyway?

smcv commented 4 months ago

I still think the only reliable answer to this is going to be https://github.com/ValveSoftware/steam-for-linux/issues/10556#issuecomment-1973169201. One of my colleagues has it on his list, but it's a long list.

Zockopa commented 4 months ago

Well, personally I find it rather frustrating that nothing has changed for the common user in months. I mean, the Feb '24 update of the Steam client software caused this. Before that, all was just fine. But now those who are not privy to fiddling with system files and such are just left in the desert.

ReillyBrogan commented 3 months ago

> I am not aware of anything in Steam that parses ld.so output, but perhaps you meant ldconfig anyway?

I did mean that. I was not aware that it parsed ld.so.cache directly; I thought it parsed CLI output. I had hoped that it would still work, assuming ldconfig respected the tunable when generating the cache, but after some testing it doesn't look like it does.

Anyway, as a workaround I added a patch to our glibc package (see here) which causes ldconfig to skip the hwcaps directories if the STEAM_HACK_IGNORE_HWCAPS environment variable is defined. I verified that it worked and that the ld cache no longer contained any references to those libs, only to the base ones.

After the next sync (this Friday or so) you should be able to do the following to get Steam working again:

  1. echo "STEAM_HACK_IGNORE_HWCAPS=1" | sudo tee -a /etc/environment
  2. Log out or reboot so that the environment variable takes effect
  3. Run sudo ldconfig -X
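
To confirm that the regenerated cache no longer references the hwcaps copies, here is a minimal check. It assumes that ldconfig -p prints each cached library with its resolved path, and it matches on the /glibc-hwcaps/ path component only. The function name count_hwcaps_entries is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count ld cache entries that point into a glibc-hwcaps directory.
# Reads `ldconfig -p` style output on stdin; matching is by path only.

count_hwcaps_entries() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    grep -c '/glibc-hwcaps/' || true
}

# Example: inspect the live cache when ldconfig is on PATH.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | count_hwcaps_entries
fi
```

A count of 0 after step 3 is consistent with the cache referencing only the base libraries.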

ReillyBrogan commented 3 months ago

FYI, this update is live for stable users, and we've seen confirmation from users that it works.

smcv commented 3 months ago

> I had hoped that it would still work assuming ldconfig respected the tunable when generating the cache but after some testing it doesn't look like it does.

I believe ldconfig ignores the current CPU and GLIBC_TUNABLES when populating the cache, and instead enters each copy of each library into the cache, along with its required hwcaps. It's ld.so that is responsible for matching the current CPU (and maybe GLIBC_TUNABLES) against the hwcaps, and disregarding libraries that are listed in the cache as requiring a newer CPU than the one we're actually running on. If it didn't work that way, you wouldn't be able to install a system on a newer CPU, and then boot it on an older CPU (for example for disaster-recovery purposes).

The bug is that the code in libcapsule that parses the cache does not take the "required hwcaps" field into account, and instead assumes that all libraries are OK. In most distros this is true (because most distros don't have hwcaps-gated libraries that require extremely new CPUs), but on Solus it is not.

I still think https://github.com/ValveSoftware/steam-for-linux/issues/10556#issuecomment-1973169201 is the correct long-term solution, it is still on my colleague's to-do list, and it is still a sufficiently long list that I cannot predict when or whether it will happen.

Zockopa commented 3 months ago

> After the next sync (this Friday or so) you should be able to do the following to get Steam working again:
>
> 1. echo "STEAM_HACK_IGNORE_HWCAPS=1" | sudo tee -a /etc/environment
> 2. Log out or reboot so that the environment variable takes effect
> 3. Run sudo ldconfig -X

Thanks, works nicely.