ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

Steam often crashes due to OOM logged during update checks (tries to allocate 1.9 GB memory) #7630

Closed kakra closed 1 year ago

kakra commented 3 years ago

Your system information

Please describe your issue in as much detail as possible:

Describe what you expected should happen and what did happen. Please link any large code pastes as a Github Gist

Lately, Steam often crashes, either while gaming (which quits the game with it), or just when left idle in the background. This is what is logged to /tmp:

Fri Jan 29 13:10:01 2021 GMT: file ''/tmp/dumps/crash_20210129140958_91.dmp'', upload yes: ''CrashID=bp-ed57e904-4818-4c88-85e0-91ff72210129''
Fri Jan 29 13:12:07 2021 GMT: file ''/tmp/dumps/assert_20210129141204_16.dmp'', upload yes: ''Discarded=1''
Fri Jan 29 17:13:12 2021 GMT: file ''/tmp/dumps/crash_20210129181310_71.dmp'', upload yes: ''CrashID=bp-9f4d04f2-eeb6-4361-9271-d225a2210129''
Fri Jan 29 18:13:01 2021 GMT: file ''/tmp/dumps/assert_20210129191259_70.dmp'', upload yes: ''CrashID=bp-46d5814f-3d38-48ad-b5eb-56f4e2210129''
Fri Jan 29 18:25:10 2021 GMT: file ''/tmp/dumps/crash_20210129192508_2.dmp'', upload yes: ''Discarded=1''
Fri Jan 29 19:00:27 2021 GMT: file ''/tmp/dumps/assert_20210129200025_51.dmp'', upload yes: ''CrashID=bp-66d0e741-750c-4d1e-b695-dff342210129''
Fri Jan 29 21:06:10 2021 GMT: file ''/tmp/dumps/crash_20210129220608_49.dmp'', upload yes: ''CrashID=bp-df5d4d44-2d0c-4c9b-9593-d40612210129''
Fri Jan 29 23:25:55 2021 GMT: file ''/tmp/dumps/assert_20210130002553_54.dmp'', upload yes: ''CrashID=bp-1a55ed37-27e5-4f02-b1cd-355022210129''
Sat Jan 30 00:57:21 2021 GMT: file ''/tmp/dumps/assert_20210130015719_88.dmp'', upload yes: ''CrashID=bp-e2ab92a3-f63c-41f2-b196-c8a1f2210129''
Sat Jan 30 01:00:55 2021 GMT: file ''/tmp/dumps/assert_20210130020053_16.dmp'', upload yes: ''Discarded=1''

The dumps usually end with this line when running through strings:

Assert( fatal stalled cross-thread pipe (pipe is disconnected). ):/data/src/common/pipes.cpp:837

Nothing is logged to dmesg or journald which would explain this. System memory usage is at 50% at most (12-16 GB of 32 GB).

Note: The crash dumps above may actually reference a different crash situation, as outlined below (section "Differentiation").

Steps for reproducing this issue:

  1. Start the Steam client
  2. Optional: Play a game
  3. Wait some time
  4. Steam crashes, and if a game is running, it also instantly dies
  5. The log shows a failed allocation with a distinct size of 1970089747 bytes
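To put the number from step 5 in perspective, here is a minimal sketch that converts it into common units. The helper name `describe_allocation` is made up for illustration and is not part of any Steam tooling.

```python
def describe_allocation(size_bytes: int) -> dict:
    """Express an allocation size in several units for easier comparison."""
    return {
        "gb": round(size_bytes / 10**9, 2),    # decimal gigabytes
        "gib": round(size_bytes / 2**30, 2),   # binary gibibytes
        "hex": hex(size_bytes),                # raw value in hexadecimal
        "below_2gib": size_bytes < 2**31,      # fits in a signed 32-bit integer
    }


# The size reported in the Steam log lines quoted in this thread.
print(describe_allocation(1970089747))
```

The request is about 1.97 GB (1.83 GiB), so it stays just below 2 GiB. That may matter if the client is a 32-bit process, which is only an assumption based on the `ubuntu12_32` path seen in the process listings in this thread: such a process has at most 4 GiB of address space regardless of how much system RAM is free.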

Differentiation from other issues:

based on https://github.com/ValveSoftware/steam-for-linux/issues/7630#issuecomment-827555620 and previous comments

kakra commented 3 years ago

Happens also with the beta (2021-01-30) just released:

Sat Jan 30 04:14:03 2021 GMT: file ''/tmp/dumps/assert_20210130051402_23.dmp'', upload yes: ''CrashID=bp-bfc1c8d1-fb34-400e-a20a-4d6c12210129''
Sat Jan 30 05:59:18 2021 GMT: file ''/tmp/dumps/assert_20210130065916_227.dmp'', upload yes: ''CrashID=bp-45e82d4b-51b8-433c-ac14-e794f2210129''

But:

23: Assert( Assertion Failed: vecDBs.Count() > 0 ):/data/src/clientdll/shadercachemanager/shadercompilejob.cpp:116
227: Assert( Fatal Assertion Failed: OUT OF MEMORY ):/data/src/tier0/memstd.cpp:2487

I'm not sure why it shows "out of memory": According to my atop history, there were 20+ GB of memory available at the time of the crash:

6:50 - 7:50: MEM | tot 31.3G | free 3.5G | cache 20.0G | dirty 76.2M | buff 1.2M | slab 1.0G | slrec 544.3M | | shmem 1.2G | shrss 60.1M | shswp 0.0M | | vmbal 0.0M | | hptot 0.0M | hpuse 0.0M |

7:00 - 7:10: MEM | tot 31.3G | free 3.8G | cache 20.3G | dirty 74.7M | buff 1.2M | slab 980.8M | slrec 541.8M | | shmem 701.6M | shrss 60.1M | shswp 0.0M | | vmbal 0.0M | | hptot 0.0M | hpuse 0.0M |

It crashes for seemingly random reasons. I've seen the "out of memory" line before, and there was always plenty of memory available.
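The "20+ GB available" figure can be reproduced from the atop `MEM` lines quoted in this comment by adding the `free` and `cache` fields. The following is only a sketch: the function names are hypothetical, it assumes atop's `G`/`M` suffixes are binary units, and it treats the whole page cache as reclaimable, which makes the result a rough upper bound (the cache also contains shared memory).

```python
import re

# Assumption: atop's size suffixes are binary multiples.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

# First cells of the atop line quoted in this comment.
ATOP_LINE = "6:50 - 7:50: MEM | tot 31.3G | free 3.5G | cache 20.0G | dirty 76.2M | buff 1.2M | slab 1.0G |"


def parse_atop_mem(line: str) -> dict:
    """Turn 'name value<unit>' cells of an atop MEM line into bytes."""
    fields = {}
    for cell in line.split("|"):
        match = re.fullmatch(r"\s*(\w+)\s+([\d.]+)([KMGT])\s*", cell)
        if match:  # cells such as the time prefix or empty cells are skipped
            name, value, unit = match.groups()
            fields[name] = float(value) * UNITS[unit]
    return fields


def reclaimable_estimate(fields: dict) -> float:
    """Rough upper bound: unused memory plus page cache."""
    return fields["free"] + fields["cache"]


fields = parse_atop_mem(ATOP_LINE)
print(round(reclaimable_estimate(fields) / 2**30, 1), "GiB potentially available")
```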

kakra commented 3 years ago

Since fossilize finished processing all games in the beta, the crashes seem to be gone. So the crashes may either depend on the shader caches, or on the system load.

kakra commented 3 years ago

So now it did crash again:

Thu Feb  4 09:22:28 2021 GMT: file ''/tmp/dumps/assert_20210204102226_41.dmp'', upload yes: ''CrashID=bp-91955c29-02e4-4880-bcd0-5789a2210204''
Assert( Fatal Assertion Failed: OUT OF MEMORY ):/data/src/tier0/memstd.cpp:2487

I'm pretty sure there's no out-of-memory condition: even with Steam running, there's 21 GB of memory available (8 GB free/unused). Is there any way for me to debug this further?

raQai commented 3 years ago

Any update on this? This is still a thing.

build: 2021-03-23
system: Manjaro 21.0 Ornara
tested kernels: 5.9.16-1, 5.10.23-1, 5.11.6-1

kakra commented 3 years ago

@raQai It's rare currently but still happens...

@kisak-valve It seems to also affect Manjaro, so it's probably not Gentoo-specific.

raQai commented 3 years ago

@kakra thanks for the info.

Do you by any chance run docker containers on your system that you do not shut down? I did some testing and log analysis yesterday, and it seems there has been some interference between my docker containers and Steam. After shutting down all containers, Steam did not crash for about 3 hours and I no longer got OOM exceptions. Will keep testing.

kakra commented 3 years ago

I'm not using docker because it messes with the network interfaces. Instead, I've set up a MACVLAN bridge interface through which the host system connects to the network (so the physical interface doesn't have an IP but the MACVLAN bridge does). Also attached to the MACVLAN bridge are some systemd-nspawn containers running for development (autostarted on boot because they do not use many resources), and my qemu VMs are attached to this bridge as well (started on demand). This allows me to operate containers and VMs without NAT through the host, and without needing a switch that supports hairpinning. I also have two OpenVPN interfaces running on this machine to connect to two networks I'm managing; these are always connected but have only specific subnet and host routes, no default route.

I never had problems with this setup. I know that many people have problems with docker and VPN network interfaces in Proton games.

FTR, I also tried many things and thought "oh cool, the problem seems fixed by that" only to discover that after the next Steam reboot the problem returned. If you ask me, I'd say the containers have nothing to do with that; stability seems mostly random: sometimes it works, and after the next reboot it doesn't.

Looking at RAM usage, the only thing I see is multiple Chrome processes that have allocated 33 GB of virtual RAM (I know that this is not used RAM, just address space, but I think it puts pressure on the TLB [translation lookaside buffer]). The actual used RAM of the whole Chrome process tree is much lower, below 2 GB, according to kernel memory accounting, which includes file system cache usage:

# systemctl status --user app-google\\x2dchrome-18f0eaf0ceb0486ea0e67eef32f89931.scope
● app-google\x2dchrome-18f0eaf0ceb0486ea0e67eef32f89931.scope - Google Chrome - Web-Browser
     Loaded: loaded (/usr/share/applications/google-chrome.desktop; transient)
  Transient: yes
     Active: active (running) since Sat 2021-04-03 00:35:05 CEST; 10h ago
      Tasks: 360 (limit: 38367)
     Memory: 1.7G
        CPU: 0
     CGroup: /user.slice/user-500.slice/user@500.service/app.slice/app-google\x2dchrome-18f0eaf0ceb0486ea0e67eef32f89931.scope

Steam is comparable when idle (or even while doing its fossilize thing which does a great job after implementing the suggestions the maintainer and I developed during a long discussion):

# systemctl status --user app-steam\\x2dsilent-autostart.service
● app-steam\x2dsilent-autostart.service - Steam Autostart
     Loaded: loaded (/home/kakra/.config/autostart/steam-silent.desktop; generated)
     Active: active (running) since Sat 2021-04-03 00:34:47 CEST; 10h ago
       Docs: man:systemd-xdg-autostart-generator(8)
   Main PID: 2324 (bash)
      Tasks: 103 (limit: 38367)
     Memory: 1.3G
        CPU: 0
     CGroup: /user.slice/user-500.slice/user@500.service/app.slice/app-steam\x2dsilent-autostart.service
             ├─ 2324 bash /home/kakra/.local/share/Steam/steam.sh -silent
             ├─ 3526 /home/kakra/.local/share/Steam/ubuntu12_32/steam -silent
             ├─ 3609 /home/kakra/.local/share/Steam/ubuntu12_32/steam -silent
             ├─ 3610 /bin/bash /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper.sh -lang=de_DE -cachedir=/home/kakra/.local/share/Steam/config/htmlcache -steampid=3526 -buildid=1617402021 -steamid=0 -cachedir=/home/kakra/.local/share/Steam/confi>
             ├─ 3612 ./steamwebhelper -lang=de_DE -cachedir=/home/kakra/.local/share/Steam/config/htmlcache -steampid=3526 -buildid=1617402021 -steamid=0 -cachedir=/home/kakra/.local/share/Steam/config/htmlcache -steamuniverse=Public -realm=Global -clien>
             ├─ 3637 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=zygote --no-sandbox --log-file=/home/kakra/.local/share/Steam/logs/cef_log.txt --product-version=Valve Steam Client --lang=en_US.UTF-8
             ├─ 3664 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=gpu-process --field-trial-handle=12209813370393101961,62 s=MimeHandlerViewInCrossProcessFrame --no-sandbox --log-file=/home/kakra/.local/share/Steam/logs/cef_log.txt -->
             ├─ 3711 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=utility --field-trial-handle=12209813370393101961,6215161123523965996,1310 ssProcessFrame --lang=de --service-sandbox-type=network --no-sandbox --log-file=/home/kakra/.>
             ├─ 4154 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=renderer --no-sandbox --log-file=/home/kakra/.local/share/Steam/logs/cef_log.txt --field-trial-handle=12209813370393101961,6215161123523965996,131072 --disable-features>
             ├─ 4189 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=renderer --no-sandbox --log-file=/home/kakra/.local/share/Steam/logs/cef_log.txt --field-trial-handle=12209813370393101961,6215161123523965996,131072 --disable-features>
             ├─ 4195 /home/kakra/.local/share/Steam/ubuntu12_64/steamwebhelper --type=renderer --no-sandbox --log-file=/home/kakra/.local/share/Steam/logs/cef_log.txt --field-trial-handle=12209813370393101961,6215161123523965996,131072 --disable-features>
             ├─ 7392 /home/kakra/.local/share/Steam/ubuntu12_32/../ubuntu12_64/fossilize_replay /home/kakra/.local/share/Steam/steamapps/shadercache/1222140/fozpipelinesv5/steamapp_pipeline_cache.foz /home/kakra/.local/share/Steam/steamapps/shadercache/1>
             ├─ 8519 /home/kakra/.local/share/Steam/ubuntu12_32/../ubuntu12_64/fossilize_replay /home/kakra/.local/share/Steam/steamapps/shadercache/1222140/fozpipelinesv5/steamapp_pipeline_cache.foz /home/kakra/.local/share/Steam/steamapps/shadercache/1>
             └─10139 /home/kakra/.local/share/Steam/ubuntu12_32/../ubuntu12_64/fossilize_replay /home/kakra/.local/share/Steam/steamapps/shadercache/1222140/fozpipelinesv5/steamapp_pipeline_cache.foz /home/kakra/.local/share/Steam/steamapps/shadercache/1>

The virtual machines are barely noticeable here:

systemctl status machine.slice
● machine.slice - Virtual Machine and Container Slice
     Loaded: loaded (/usr/lib/systemd/system/machine.slice; static)
    Drop-In: /etc/systemd/system/machine.slice.d
             └─override.conf
     Active: active since Sat 2021-04-03 00:34:29 CEST; 10h ago
       Docs: man:systemd.special(7)
         IO: 449.5M read, 859.9M written
      Tasks: 151
     Memory: 366.5M
        CPU: 0
     CGroup: /machine.slice
...

Note to above: CPU is accounted with 0 usage due to using a CK patched kernel which only has a fake CPU cgroup controller.

Networking info (network C is my local network and has the default route):

# networkctl status
●          State: routable
         Address: 192.168.AAA.AA on vpn-MANAGED-A
                  192.168.BBB.B on vpn-MANAGED-B
                  192.168.C.CCC on vmbridge
                  2a02:CCCC:CCC:CCCC::CCC on vmbridge
                  2a02:CCCC:CCC:CCCC:CCCC:CCCC:CCCC:CCCC on vmbridge
                  fe80::AAAA:AAAA:AAAA:AAAA on vpn-MANAGED-A
                  fe80::BBBB:BBBB:BBBB:BBBB on vpn-MANAGED-B
                  fe80::CCCC:CCCC:CCCC:CCCC on vmbridge
         Gateway: 192.168.C.C on vmbridge
                  fe80::CCCC:CCCC:CCCC:CCCC on vmbridge
             DNS: 10.AAA.AA.A
                  10.AAA.AA.AA
                  192.CCC.C.C
                  192.BBB.BB.B
                  192.BBB.BB.B
  Search Domains: sol.LOCAL-C.de
   Route Domains: MANAGED-A.de
                  MANAGED-A.local
                  aaaaaaaa.MANAGED-A.de
                  .
                  MANAGED-B.local

The problem seems to occur less often since I opted back into the Steam beta client.

My system currently shows some problems that look like workqueue congestion in the kernel, which causes microspikes in overall system latency, but only if I leave the system running for multiple days (I've already upstreamed a kernel patch for bcache which mitigates this issue, backported to kernels 5.4+): The mouse pointer sometimes micro-blocks for a few milliseconds, which is barely noticeable. It can only be observed because moving the mouse pointer around to target small desktop elements sometimes puts the pointer into an unexpected nearby spot. When this happens, it's usually in bursts, so if I moved the mouse pointer in circles, it would stop for a few milliseconds, disturbing the circular movement. Not sure how to describe that better. But what I could conclude from that is that concurrent threads in Steam may run into an unexpected race condition because threads start to execute out of their expected order. If that were the case, the Steam client would clearly lack some proper internal locking and synchronization. OTOH, the observation seems to be purely coincidental as I cannot reliably reproduce that. It's probably similar to your observation about having docker running.

And TBH, stopping docker to properly run Steam is NOT a solution. If there'd be any bad interaction between those two, this needs to be fixed on either side.

Looking at this problem for quite some time now, it looks like a race condition inside the Steam client to me because the observations and proposed solutions seem purely random. Such behavior is usually a timing or concurrency thing when using threads and workers.

So the remaining questions are: Do you use bcache, or btrfs? Do you see micro latency spikes while moving the mouse cursor around? Do you see messages about pending workqueue jobs in dmesg? Do you see OOM messages in dmesg or do you run earlyoom/systemd-oomd? Whatever the underlying problem is that triggers those crashes, the real issue is not in that layer: the Steam client should simply not crash silently but handle the situation gracefully.

BTW: I recently installed earlyoom to see if there might be any connection, but there are no OOM events logged, so this is not the problem. The only thing I see is the micro latency spikes, which I'm not sure how to analyze or trace back to the origin. Nothing in the system seems to cause them, as I stopped many services and tasks while seeing them, so it's probably something that piles up in the kernel over time. But maybe we both can find a common denominator in our systems.
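For the dmesg question above, a small filter can help tell kernel OOM-killer events apart from Steam's own "OUT OF MEMORY" assertion. This is a sketch under assumptions: the function name is made up, the first sample line is hypothetical, and the pattern only covers the common "invoked oom-killer" and "Out of memory: Kill..." kernel messages.

```python
import re

# Typical kernel OOM-killer messages; matching is case-sensitive on purpose
# so that Steam's own upper-case "OUT OF MEMORY" line does not match.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory: Kill")

SAMPLE_LOG = [
    # Hypothetical kernel line, shown only to illustrate a match.
    "[12345.678901] steam invoked oom-killer: gfp_mask=0x100cca, order=0, oom_score_adj=0",
    # Steam's allocator message as quoted in this thread (not a kernel event).
    "***** OUT OF MEMORY! attempted allocation size: 1970089747 ****",
]


def kernel_oom_events(lines):
    """Return only the lines that look like kernel OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


for event in kernel_oom_events(SAMPLE_LOG):
    print(event)
```

To use it on a real system, feed it the lines printed by `dmesg` or `journalctl -k`. An empty result alongside Steam's own message would be consistent with the allocation failing inside the process rather than the kernel running out of memory.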

raQai commented 3 years ago

Not using bcache or btrfs but I indeed do have the described issues with the latency spikes with my mouse (probably the full system). As you mentioned, it is barely noticeable in daily use but becomes really annoying in games. Then again, once Steam is shut down (or crashes silently) everything is back to normal.

The problem seems to occur less often since I opted back into the Steam beta client.

I had the feeling it got worse after trying the beta client for a few hours (at least it crashed more often). Still I also noticed less micro lags in games using the beta. Using either release, even native games like L4D seem to have micro lag spikes even though my avg FPS is locked at 300.

And TBH, stopping docker to properly run Steam is NOT a solution. If there'd be any bad interaction between those two, this needs to be fixed on either side.

Totally agree on this one.

If you need any information about my system to find any kind of correlation, just tell me the commands and I'll provide the entire output. It seems like you understand far more than I do regarding this subject.

kakra commented 3 years ago

Here's my current Steam info posted to gist.github.com: https://gist.github.com/kakra/a92cdfc0ed8cc212ae981a7d9518b5a6 (taken from the Steam system information menu).

Not using bcache or btrfs but I indeed do have the described issues with the latency spikes with my mouse (probably the full system). As you mentioned, it is barely noticeable in daily use but becomes really annoying in games. Then again, once Steam is shut down (or crashes silently) everything is back to normal.

Well, for me the micro lags / latency spikes do not come from Steam running (or at least not from Steam alone): they persist even after I've closed it, and only a reboot helps. I think there's currently something going on in the kernel that introduced that. Let's hope some of the core kernel developers have the same experience and will come up with a fix. Since I got the bcache patches into the kernel, latency spikes in games are mostly gone. But I still sometimes experience input visibly (and persistently) lagging behind by almost 200 ms or more. Feels like playing drunk. ;-) I don't think we can attribute that to Steam, although it started logging a lot of stuff in a loop some weeks ago (which may contribute to this problem). Could you look at journalctl and see if Steam repeats the same logs over and over again, sometimes multiple times per second? If so, maybe head over to https://github.com/ValveSoftware/steam-for-linux/issues/7734.
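One way to check for messages repeated within the same second is to count identical (timestamp, message) pairs in journal output. The sketch below assumes the default short journalctl line layout ("Mon DD HH:MM:SS host ident[pid]: message"), as seen in the excerpts in this thread; the function name and threshold are illustrative only.

```python
import re
from collections import Counter

# Assumed layout: "Apr 05 12:28:39 host ident[pid]: message"
LINE = re.compile(r"^(\w{3} +\d+ [\d:]{8}) (\S+) ([^:]+?): (.*)$")

# Two journal lines as quoted in this thread, plus one unrelated line.
SAMPLE = [
    "Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : Fatal assert; application exiting",
    "Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : Fatal assert; application exiting",
    "Apr 05 12:28:40 rq-workstation steam.desktop[184149]: _ExitOnFatalAssert",
]


def repeated_messages(lines, threshold=2):
    """Map (timestamp, message) to its count when it occurs at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            stamp, _host, _ident, message = match.groups()
            counts[(stamp, message)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


for (stamp, message), n in repeated_messages(SAMPLE).items():
    print(f"{n}x at {stamp}: {message}")
```

Counting per second is a deliberately coarse choice: it is enough to spot a logging loop without having to parse sub-second timestamps.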

What you are seeing may come from the fossilize process of Steam: Did you check if it is running when you're seeing the micro lags? OTOH, fossilize will shut down within a few seconds when you start a game, so the issue should not persist through playing a game then.

So we are still at a theory that these micro latency spikes could cause the crashes. But those are not caused by Steam but probably come from the kernel.

raQai commented 3 years ago

Here is my steam info: https://gist.github.com/raQai/37a4554130250b9be9dab9e1be824e9f

I'll check for fossilize over the next few days and post an update if I find anything new.

raQai commented 3 years ago

Steam still crashes while opted in to the beta. I found the following messages in my journal right before the crash occurs.

The following set of messages occurs during micro stuttering.

Apr 05 13:08:39 rq-workstation steam.desktop[340327]: assert_20210405125100_1.dmp[340327]: Uploading dump (out-of-process)
Apr 05 13:08:39 rq-workstation steam.desktop[340327]: /tmp/dumps/assert_20210405125100_1.dmp
Apr 05 13:08:39 rq-workstation assert_20210405125100_1.dmp[340327]: Uploading dump (out-of-process)
                                                                    /tmp/dumps/assert_20210405125100_1.dmp
Apr 05 13:08:39 rq-workstation audit: BPF prog-id=34 op=LOAD
Apr 05 13:08:39 rq-workstation audit: BPF prog-id=35 op=LOAD
Apr 05 13:08:39 rq-workstation systemd[1]: Started Process Core Dump (PID 340328/UID 0).
Apr 05 13:08:39 rq-workstation audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@8-340328-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 05 13:08:39 rq-workstation audit[250023]: NETFILTER_CFG table=filter family=2 entries=4 op=xt_unregister pid=250023 comm="kworker/u64:3"
Apr 05 13:08:39 rq-workstation audit[250023]: NETFILTER_CFG table=nat family=2 entries=15 op=xt_unregister pid=250023 comm="kworker/u64:3"
Apr 05 13:08:40 rq-workstation steam.desktop[340327]: assert_20210405125100_1.dmp[340327]: Finished uploading minidump (out-of-process): success = yes
Apr 05 13:08:40 rq-workstation steam.desktop[340327]: assert_20210405125100_1.dmp[340327]: response: CrashID=bp-a97af391-3ccf-4cf2-865e-46b282210405
Apr 05 13:08:40 rq-workstation steam.desktop[340327]: assert_20210405125100_1.dmp[340327]: file ''/tmp/dumps/assert_20210405125100_1.dmp'', upload yes: ''CrashID=bp-a97af391-3ccf-4cf2-865e-46b282210405''
Apr 05 13:08:40 rq-workstation assert_20210405125100_1.dmp[340327]: Finished uploading minidump (out-of-process): success = yes
Apr 05 13:08:40 rq-workstation assert_20210405125100_1.dmp[340327]: response: CrashID=bp-a97af391-3ccf-4cf2-865e-46b282210405
Apr 05 13:08:40 rq-workstation assert_20210405125100_1.dmp[340327]: file ''/tmp/dumps/assert_20210405125100_1.dmp'', upload yes: ''CrashID=bp-a97af391-3ccf-4cf2-865e-46b282210405''
Apr 05 13:08:40 rq-workstation systemd-coredump[340330]: Process 294173 (steam) of user 1000 dumped core.

                                                         Stack trace of thread 294173:
                                                         #0  0x00000000f7993fea __strlen_sse2_bsf (libc.so.6 + 0x91fea)
                                                         #1  0x00000000ed20f9c3 n/a (/home/raqai/.local/share/Steam/ubuntu12_32/steamclient.so + 0x6279c3)
Apr 05 13:08:41 rq-workstation systemd[1]: systemd-coredump@8-340328-0.service: Succeeded.
Apr 05 13:08:41 rq-workstation audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@8-340328-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 05 13:08:41 rq-workstation steam.desktop[294015]: /home/raqai/.local/share/Steam/steam.sh: line 771: 294173 Segmentation fault      (core dumped) $STEAM_DEBUGGER "$STEAMROOT/$STEAMEXEPATH" "$@"

The following entries occur after crashing

Apr 05 12:28:39 rq-workstation steam.desktop[184149]: CCrossProcessPipe::BWrite wrote too few bytes: 32 (Broken pipe).  Continuing.
Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : fatal stalled cross-thread pipe (pipe is disconnected).
Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : fatal stalled cross-thread pipe (pipe is disconnected).
Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : Fatal assert; application exiting
Apr 05 12:28:39 rq-workstation steam.desktop[184149]: src/common/pipes.cpp (837) : Fatal assert; application exiting
Apr 05 12:28:40 rq-workstation steam.desktop[241572]: ERROR: ld.so: object '/home/raqai/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
Apr 05 12:28:40 rq-workstation steam.desktop[241573]: assert_20210405122839_2.dmp[241573]: Uploading dump (out-of-process)
Apr 05 12:28:40 rq-workstation steam.desktop[241573]: /tmp/dumps/assert_20210405122839_2.dmp
Apr 05 12:28:40 rq-workstation steam.desktop[184149]: _ExitOnFatalAssert
Apr 05 12:28:42 rq-workstation steam.desktop[241573]: assert_20210405122839_2.dmp[241573]: Finished uploading minidump (out-of-process): success = yes
Apr 05 12:28:42 rq-workstation steam.desktop[241573]: assert_20210405122839_2.dmp[241573]: response: Discarded=1
Apr 05 12:28:42 rq-workstation steam.desktop[241573]: assert_20210405122839_2.dmp[241573]: file ''/tmp/dumps/assert_20210405122839_2.dmp'', upload yes: ''Discarded=1''
Apr 05 12:28:42 rq-workstation steam.desktop[241573]: pid 241573 != 241572, skipping destruction (fork without exec?)

@kakra do you think the dmp file would provide additional information? Also do you think this relates to your issue?

kakra commented 3 years ago

You could run the dump file through the strings command to extract all human-readable data from the dump file.
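For anyone without the `strings` tool at hand, the same idea can be approximated in a few lines of Python. This is a rough sketch rather than a reimplementation of `strings`: it only looks for runs of printable ASCII, and the function names are made up for illustration.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Return runs of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def find_asserts(data: bytes):
    """Keep only the extracted strings that contain an Assert( marker."""
    return [text for text in printable_strings(data) if "Assert(" in text]


# Tiny stand-in for a dump file: binary noise around an assert line
# like the ones quoted in this thread.
sample = (
    b"\x00\x01\x02"
    b"Assert( Fatal Assertion Failed: OUT OF MEMORY ):/data/src/tier0/memstd.cpp:2487"
    b"\x00ab\x00"
)
print(find_asserts(sample))
```

For a real dump, pass the file contents instead of `sample`, for example `pathlib.Path("/tmp/dumps/assert_20210405125100_1.dmp").read_bytes()`.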

raQai commented 3 years ago

https://gist.github.com/raQai/2e6caa5c932ba42046866379f9d8474b

kakra commented 3 years ago

Did Steam crash, or did a game crash? It looks like this is a Proton 6.5 GE crash...

raQai commented 3 years ago

Steam crashed. This crash was during a gaming session, though, and the game crashed with it. Also, I get the same log output when using Proton 6.3-1.

kakra commented 3 years ago

Yeah, I think when Steam crashes, the game will just be killed with it - probably due to a "broken pipe". So maybe you picked the wrong dump and there should be another one which is actually from the Steam process?

raQai commented 3 years ago

Hmmm, do you see anything of interest for this or any other issue in this journalctl extract? https://gist.github.com/raQai/2653bf1a6bf2207952eaf97e50cd9841

I do not see anything that might be directly linked to the Steam crash.

Fogapod commented 3 years ago

I've been getting these too for the past few months, I think, and recently the crashes have become more frequent. I'm using Arch Linux. Here's a typical log after a crash. It happens randomly: Steam either dies while I'm playing, killing the game too (any game), or silently dies in the background:

[0405/195204.609450:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
[0405/195204.635024:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
[0405/195204.653993:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
Installing breakpad exception handler for appid(steam)/version(1616532526)
Failed to init SteamVR because it isn't installed
ExecCommandLine: "'/home/eugene/.local/share/Steam/ubuntu12_32/steam'"
System startup time: 1.06 seconds
[0405/195204.949379:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
[0405/195204.951054:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
[0405/195204.954806:INFO:crash_reporting.cc(270)] Crash reporting enabled for process: renderer
BuildCompleteAppOverviewChange: 136
RegisterForAppOverview 1: 4ms
RegisterForAppOverview 2: 4ms
Installing breakpad exception handler for appid(steam)/version(1616532526)
Installing breakpad exception handler for appid(steam)/version(1616532526)
Installing breakpad exception handler for appid(steam)/version(1616532526)
***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Installing breakpad exception handler for appid(steam)/version(1616532526)
assert_20210405202126_26.dmp[458641]: Uploading dump (out-of-process)
/tmp/dumps/assert_20210405202126_26.dmp
_ExitOnFatalAssert
assert_20210405202126_26.dmp[458641]: Finished uploading minidump (out-of-process): success = yes
assert_20210405202126_26.dmp[458641]: response: CrashID=bp-f2878fe1-0353-4b1f-a454-ef0e22210405
assert_20210405202126_26.dmp[458641]: file ''/tmp/dumps/assert_20210405202126_26.dmp'', upload yes: ''CrashID=bp-f2878fe1-0353-4b1f-a454-ef0e22210405''

There is plenty of free memory (32G in total, only around 6-8 is used). I'm using btrfs but I had similar problems on ext4 too. Any info I can provide to help with this issue?

swedneck commented 3 years ago

Pretty certain I'm affected by this as well on Fedora Workstation 32. Happy to provide any information that will help diagnose the issue.

cyberpunkrocker-zero commented 3 years ago

I've also had those random OOM crashes on Arch Linux (steam-native) during the last 1-2 months. The Steam client is just silently idling in the background when it suddenly crashes, today after about 5 hours. So far it hasn't crashed during a game.

Installing breakpad exception handler for appid(steam)/version(1618256785)
[2021-04-24 12:25:17] Background update loop checking for update. . .
[2021-04-24 12:25:17] Downloading manifest: https://cdn.akamai.steamstatic.com/client/steam_client_ubuntu12
[2021-04-24 12:25:17] Download skipped by HTTP 304 Not Modified
[2021-04-24 12:25:17] Nothing to do
Installing breakpad exception handler for appid(steam)/version(1618256785)
Installing breakpad exception handler for appid(steam)/version(1618256785)
***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Installing breakpad exception handler for appid(steam)/version(1618256785)
assert_20210424143922_50.dmp[78120]: Uploading dump (out-of-process)
/tmp/dumps/assert_20210424143922_50.dmp
_ExitOnFatalAssert
assert_20210424143922_50.dmp[78120]: Finished uploading minidump (out-of-process): success = yes
assert_20210424143922_50.dmp[78120]: response: CrashID=bp-d7f95581-c5a7-4ddb-bc71-f339b2210424
assert_20210424143922_50.dmp[78120]: file ''/tmp/dumps/assert_20210424143922_50.dmp'', upload yes: ''CrashID=bp-d7f95581-c5a7-4ddb-bc71-f339b2210424''

kakra commented 3 years ago

For me, it's getting really annoying lately: Steam crashes in the middle of a game, and it does it often. I have maybe 30-60 minutes before it crashes. Usually, FPS suddenly drop, and games see short freezes multiple times within a few seconds, then everything crashes back to desktop with Steam gone:

Apr 24 14:57:50 jupiter plasmashell[275010]: Looks like steam didn't shutdown cleanly, scheduling immediate update check
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] Loading cached metrics from disk (/home/kakra/.local/share/Steam/package/steam_client_metrics.bin)
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] Using the following download hosts for Public, Realm steamglobal
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] 1. https://cdn.cloudflare.steamstatic.com, /client/, Realm 'steamglobal', weight was 100, source = 'update_hosts_cached.vdf'
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] 2. https://cdn.akamai.steamstatic.com, /client/, Realm 'steamglobal', weight was 100, source = 'update_hosts_cached.vdf'
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] 3. http://media.steampowered.com, /client/, Realm 'steamglobal', weight was 1, source = 'baked in'
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] Checking for update on startup
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] Suche nach verfügbaren Updates...
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:01] Downloading manifest: https://cdn.cloudflare.steamstatic.com/client/steam_client_publicbeta_ubuntu12
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:02] Download skipped by HTTP 304 Not Modified
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:02] Nothing to do
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:02] Installation wird überprüft...
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:02] Performing checksum verification of executable files
Apr 24 14:57:50 jupiter plasmashell[275010]: [2021-04-24 14:22:05] Verification complete
Apr 24 14:57:50 jupiter plasmashell[275010]: ***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
Apr 24 14:57:50 jupiter plasmashell[275010]: src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
Apr 24 14:57:50 jupiter plasmashell[275010]: src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
Apr 24 14:57:50 jupiter plasmashell[275010]: src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Apr 24 14:57:50 jupiter plasmashell[275010]: src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Apr 24 14:57:50 jupiter plasmashell[275010]: Installing breakpad exception handler for appid(steam)/version(1619231536)
Apr 24 14:57:51 jupiter plasmashell[277663]: assert_20210424145750_54.dmp[277663]: Uploading dump (out-of-process)
Apr 24 14:57:51 jupiter plasmashell[277663]: /tmp/dumps/assert_20210424145750_54.dmp
Apr 24 14:57:51 jupiter plasmashell[275010]: _ExitOnFatalAssert
Apr 24 14:57:52 jupiter plasmashell[275019]: Invalid browser dimensions: 0 x 0
Apr 24 14:57:52 jupiter plasmashell[275019]: Invalid browser dimensions: 0 x 4
Apr 24 14:57:52 jupiter plasmashell[275019]: Invalid browser dimensions: 0 x 0
Apr 24 14:57:52 jupiter plasmashell[277663]: assert_20210424145750_54.dmp[277663]: Finished uploading minidump (out-of-process): success = yes
Apr 24 14:57:52 jupiter plasmashell[277663]: assert_20210424145750_54.dmp[277663]: response: CrashID=bp-e47502eb-6c1a-4ecf-8332-fb3ba2210424
Apr 24 14:57:52 jupiter plasmashell[277663]: assert_20210424145750_54.dmp[277663]: file ''/tmp/dumps/assert_20210424145750_54.dmp'', upload yes: ''CrashID=bp-e47502eb-6c1a-4ecf-8332-fb3ba2210424''
Apr 24 14:57:52 jupiter plasmashell[276285]: src/common/pipes.cpp (837) : fatal stalled cross-thread pipe (pipe is disconnected).
Apr 24 14:57:52 jupiter plasmashell[276285]: src/common/pipes.cpp (837) : fatal stalled cross-thread pipe (pipe is disconnected).
Apr 24 14:57:52 jupiter plasmashell[276285]: src/common/pipes.cpp (837) : Fatal assert; application exiting
Apr 24 14:57:52 jupiter plasmashell[276285]: src/common/pipes.cpp (837) : Fatal assert; application exiting
Apr 24 14:57:55 jupiter plasmashell[277676]: ERROR: ld.so: object '/home/kakra/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
Apr 24 14:57:55 jupiter plasmashell[277677]: crash_20210424145752_2.dmp[277677]: Uploading dump (out-of-process)
Apr 24 14:57:55 jupiter plasmashell[277677]: /tmp/dumps/crash_20210424145752_2.dmp
Apr 24 14:57:55 jupiter plasmashell[276285]: _ExitOnFatalAssert

Why did it try to allocate 1.9 GB of memory anyway? This is highly likely to fail because Steam is a 32-bit app:

# file /home/kakra/.local/share/Steam/ubuntu12_32/steam
/home/kakra/.local/share/Steam/ubuntu12_32/steam: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=da97d79e349497519d62cf33b7cfaa6aff1857eb, not stripped
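
For scale, here is a quick back-of-the-envelope check of that number. It is only a sketch: the 4 GiB figure is the theoretical size of a 32-bit address space, and how much of it the Steam process can really use in one contiguous piece is an assumption that nothing in the logs confirms.

```python
# Size taken from the "attempted allocation size" line in the log above.
REQUESTED = 1970089747

GIB = 2**30
INT32_MAX = 2**31 - 1          # largest signed 32-bit value
ADDRESS_SPACE_32BIT = 2**32    # theoretical upper bound for a 32-bit process


def describe(requested: int) -> dict:
    """Put an allocation size into perspective for a 32-bit process."""
    return {
        "gib": round(requested / GIB, 2),
        "fits_in_int32": requested <= INT32_MAX,
        "share_of_address_space": round(requested / ADDRESS_SPACE_32BIT, 2),
    }


print(describe(REQUESTED))
```

So a single request of this size asks for nearly half of everything a 32-bit process could ever address, in one contiguous block, regardless of how much RAM the machine has.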

Something is going crazy inside Steam. It looks like the update check is running right in the middle of a game session.

kakra commented 3 years ago

It's always the exact same size:

Apr 25 05:27:43 jupiter plasmashell[60187]: ***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
Apr 25 05:27:43 jupiter plasmashell[60187]: src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
Apr 25 05:27:43 jupiter plasmashell[60187]: src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
Apr 25 05:27:43 jupiter plasmashell[60187]: src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Apr 25 05:27:43 jupiter plasmashell[60187]: src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
cyberpunkrocker-zero commented 3 years ago

Right, it is always that 747-sized chunk of memory

kakra commented 3 years ago

@kisak-valve Please adjust the labels, this is not Gentoo-specific, it seems to affect any distribution. Thus, it probably also affects officially supported distributions.

brunodrugowick commented 3 years ago

Thank you guys for all the detailed discussion in this thread. I'm having a similar issue and can help with information/logs if necessary.

However, someone mentioned Docker here at some point and I had two containers running. After removing the containers (and Docker itself since I actually don't need it in this machine) I didn't experience the issue anymore! I still have occasional stuttering but Steam won't crash anymore.

I'll test later if by just installing Docker and having it running again Steam will crash ¯_(ツ)_/¯

brunodrugowick commented 3 years ago

Even before I test Docker... I left the game running for a while and there's the same error again:

Assert( Assertion Failed: m_pInternalPipe->BRead failed ):/data/src/common/pipes.cpp:623
raQai commented 3 years ago

I was the one who mentioned the Docker issues. Not only running containers but the plain service running led to crashes. As a temporary workaround I disabled the service on startup and have not experienced most of the issues since.

The OOM issue seems unrelated though since I still occasionally get one of those (mostly while the pre-cache process is running as @kakra mentioned)

kakra commented 3 years ago

I think we are seeing two different kinds of crashes: One is related to having Docker running and may be an issue with the network interfaces it creates (there are multiple reports of games having network issues with Docker running), the other is the OOM issue. I'll rename this report to mention the OOM issue. For Docker-related issues, you should open another report.

kakra commented 3 years ago

But I can confirm that the frequency of crashes is much lower after the background fossilize process (shader pre-caching) is done (you can watch this in the settings dialog, which shows a live status). After the pre-caching finished, the Steam process has now run for 10 hours straight, while it previously would crash every half an hour or so. But that's only one trigger of the problem: the Steam process may still crash while playing a game, and the frequency has vastly increased since my initial report. Most of the time, the client immediately updates itself when restarting it after a crash, so it may be related to the update search, too.

I've set it to look for updates only at night now. Let's see if this improves things and narrows down the issue.

kisak-valve commented 3 years ago

We have issue reports like #6751 / #7740 to track the docker-related crash. This issue might be related to #7280.

kakra commented 3 years ago

@kisak-valve Yes, #7280 looks like a valid duplicate: I can actually see a lot of the same behavior described there. But this also seems to happen when no background fossilize is running at all. So I'll leave this open for a few more days to observe if this is a real duplicate, or if this is related purely to the background update process: After all, I observed logs showing that an update was found often just before the crash, and after restart, it would actually self-update the client.

Does the Steam client link to the fossilize lib, or does it just call it as an external binary? To me it looks like Steam is just running fossilize in external mode, with some pipe connected to control the external process. Also, fossilize should not run at all while a game is running, except maybe if it is also recording the shader state of running games: in that case it could make sense to split the fossilize work into a child process/fork so it doesn't crash the whole gaming session. After all, fossilize seems to be designed around the possibility of shutting down unexpectedly (aka crashing) because some shader processing is known to segfault the process, or at least the database is designed around recovering properly. So I'd expect fossilize to crash occasionally and to handle that gracefully. The Steam client should be able to handle similar problems gracefully, too.

I wonder why different people see the exact same allocated byte count in the error message (#7280 has a different size in one report, in most others, it's missing). Is this connected to the size of some shader file? Is Steam maybe trying to allocate this memory from shared memory? Because that is usually limited in Linux.

OTOH, maybe Steam should switch to being a 64-bit app. Many distributions have already removed support for 32-bit-only systems, and playing modern games on 32-bit is not even an option. In the Steam hardware survey, 32-bit systems are below the 0.1% threshold.

cyberpunkrocker-zero commented 3 years ago

@kakra you may be onto something with that update thing. I restricted the update time to outside the period I normally run/use Steam, and sure enough, yesterday the Steam client idled happily in the background the whole time I was using the computer (about 14 hours straight) without crashing. Usually it crashes with the '1970089747' OOM error in much less than 6 hours.

I don't think #7280 has much relevance in my case. At least 'Allow background processing of Vulkan shaders' is, and has always been, disabled.

kakra commented 3 years ago

To summarize, we probably see three distinct crash situations:

@kisak-valve So I think we have another distinct crash situation here. What do you think?

kakra commented 3 years ago

@cyberpunkrocker-zero I think I can confirm this: The client has been churning along nicely for multiple days now after I restricted updates to nightly time intervals.

cyberpunkrocker-zero commented 3 years ago

@kakra I restricted the update time to 6:00AM-9:00AM. If I start Steam before 9:00AM, it checks for updates, installs any it finds, and continues to work for the rest of the day without crashing. So it seems that update checks / updates do not crash Steam if they run at client startup, but eventually OOM-crash the client if they run while it is already up. Also, game updates work normally without crashing Steam if started manually outside the defined update time.

Fogapod commented 3 years ago

I don't have any update restrictions and I allow downloads during gameplay. I've only had 1 OOM during the past 3 days. Before this it could crash once an hour or even once every few minutes. So it happens in periods. One week it's good and the other it crashes every day. Do you think it's related to updates?

I also had OOMs in offline mode. The last time I remember was 16th February 2021:

***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
src/tier0/memstd.cpp (2487) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2487) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2487) : Fatal assert; application exiting
src/tier0/memstd.cpp (2487) : Fatal assert; application exiting

I also noticed that these crashes happen during specific actions in games (there are more crashes, including crashes when no games are running, but these happened multiple times):

Both of these are native games, steam overlay is on. These happened multiple times in exact timing of mentioned events, so I don't think it's update related.

kakra commented 3 years ago

Please look at the console output of the crash (depending on your desktop environment / login manager, it may be logged to xsession.log, or to journalctl, or somewhere else). Then look at https://github.com/ValveSoftware/steam-for-linux/issues/7630#issuecomment-827555620 to decide whether it's this crash situation or one of the others I identified.

In this case it looks like it's the crash that comes from the update check. Do you have the console log at hand that comes just before the crash?

Steam overlay is off here because it interfered badly with the (Vulkan) render surface in the past, leading to game freezes. That's probably fixed by now but I still have it turned off.

kakra commented 3 years ago

Restricting updates to a specific time period is no longer a work-around for me, the Steam client started crashing again, usually more often when new updates of Proton appear, or after downloading some bigger game updates.

hikarutilmitt commented 3 years ago

Finally got around to checking the terminal output to see what's been causing my crashes, and came across this issue. On Arch, I get:

***** OUT OF MEMORY! attempted allocation size: 4120 ****
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Installing breakpad exception handler for appid(steam)/version(1621394999)
_ExitOnFatalAssert

This has been happening for me since when this issue was first reported, and has persisted across new hardware. I keep all of my Steam games on a separate SSD from my root and home partitions. I have had pre-caching in the background disabled for a good while since it was causing Tekken 7 to constantly rebuild its cache any time I ran it, so leaving it to simply keep its existing cache from merely running the game has been fine and stable. It's entirely possible the shadercache was stale but I'm not entirely certain this is the cause. I've just recently (just before posting this) deleted the existing cache and re-enabled pre-caching to see if anything changes.

I know it's been 20 days since the last report, but I figured reporting a number different from the 2GB of memory might raise some flags.
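
Since different sizes are now showing up in this thread, a small script can help compare them across a saved console log. This is just a sketch: the regular expression is modeled on the "attempted allocation size" lines pasted above, and the sample text is made up from those lines rather than taken from a real capture.

```python
import re
from collections import Counter

# Pattern modeled on the log lines quoted in this thread.
OOM_RE = re.compile(r"OUT OF MEMORY! attempted allocation size: (\d+)")


def allocation_sizes(log_text: str) -> Counter:
    """Count how often each failed allocation size appears in the log text."""
    return Counter(int(m.group(1)) for m in OOM_RE.finditer(log_text))


# Illustrative sample assembled from lines quoted in this thread.
sample = """\
***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
***** OUT OF MEMORY! attempted allocation size: 4120 ****
***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
"""

for size, count in allocation_sizes(sample).most_common():
    print(f"{size:>12} bytes  x{count}")
```

Feeding it the saved output of your terminal or journal instead of `sample` would show at a glance whether you keep hitting the same size or several different ones.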

swedneck commented 3 years ago

Same experience as kakra, restricting updates used to help but lately it has ceased doing so.

ipr commented 3 years ago

Happens also with the beta (2021-01-30) just released:

But:
23: Assert( Assertion Failed: vecDBs.Count() > 0 ):/data/src/clientdll/shadercachemanager/shadercompilejob.cpp:116
227: Assert( Fatal Assertion Failed: OUT OF MEMORY ):/data/src/tier0/memstd.cpp:2487

I'm not sure why it shows "out of memory": According to my atop history, there were 20+ GB of memory available at the time of the crash:

Is there 32-bit code with malloc() of around 2GB (signed 32-bit int) somewhere? Edit: looks like this was already found earlier

Another possibility is that there is no contiguous memory area large enough for the allocation (depending on flags, system configuration and what type of memory it is trying to allocate).

Often, memory mapping of files would be a better choice than reading into large buffers, but without knowing what the memory is used for it is hard to tell whether that would apply here too.
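
To illustrate the memory-mapping idea in general terms (this says nothing about how the Steam client actually handles its files), here is a minimal sketch. The helper name `read_slice_mapped` and the demo file are made up for the example.

```python
import mmap
import os
import tempfile


def read_slice_mapped(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes at `offset` without reading the whole file into a buffer."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are only loaded when touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Demo on a throwaway file standing in for a large cache file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1_000_000 + b"shader-blob")
    demo_path = tmp.name

print(read_slice_mapped(demo_path, 1_000_000, 11))
os.remove(demo_path)
```

In a 32-bit process the mapping itself still consumes address space, so a real implementation would probably map smaller windows of the file rather than the whole thing at once.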

sysms commented 3 years ago

I'm observing this crash after Steam (current stable) downloads large Shader caches (i.e. for Horizon Zero Dawn):

***** OUT OF MEMORY! attempted allocation size: 402653184 ****
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal Assertion Failed: OUT OF MEMORY
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
src/tier0/memstd.cpp (2489) : Fatal assert; application exiting
Installing breakpad exception handler for appid(steam)/version(1621394999)
assert_20210606095532_168.dmp[2421]: Uploading dump (out-of-process)
/tmp/dumps/assert_20210606095532_168.dmp
_ExitOnFatalAssert
mash@ryzen:~$ assert_20210606095532_168.dmp[2421]: Finished uploading minidump (out-of-process): success = yes
assert_20210606095532_168.dmp[2421]: response: CrashID=bp-0a1d8c3b-e93f-447f-9244-43d8f2210606
assert_20210606095532_168.dmp[2421]: file ''/tmp/dumps/assert_20210606095532_168.dmp'', upload yes: ''CrashID=bp-0a1d8c3b-e93f-447f-9244-43d8f2210606''
kakra commented 3 years ago

Retrospectively, it makes sense that this seems to be related to shader cache updates...

BillFleming commented 3 years ago

I am also getting this sort of crash literally once or twice a day now when I start Steam. If Steam hasn't been running for a few hours when I start it, it runs for a few seconds/minutes and then crashes during some shader updates. But then I can relaunch Steam after a few seconds and, since the problematic update is done, it runs OK and I can start a game. But if I close Steam, leave my PC and come back the next day, Steam will crash again within a minute or two of starting it up.

This even happens when I have more than 32GB of free RAM on a 64GB system. (Vega + RADV) For me this only started in the last ~1.5 weeks, probably after the video foz related updates.

Saroumane commented 3 years ago

Ubuntu 21.04 here, no Docker used, 16GB of RAM, no Beta : Steam Client Jun 8 2021. About shaders : Enable Shader Pre-caching : checked, Allow background processing : not checked. About steam games auto-updates : restricted to 1hour/day from 11 AM to 12 PM. I have daily unattended steam client crashes, always during update time :

journalctl -S "2021-06-19 00:00"|grep steam|grep Seg
juin 19 11:03:17 host steam.desktop[3641]: /home/user/.steam/debian-installation/steam.sh: line 772:  4113 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
juin 20 11:14:30 host steam.desktop[3571]: /home/user/.steam/debian-installation/steam.sh: line 772:  4069 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
juin 22 11:00:41 host steam.desktop[6459]: /home/user/.steam/debian-installation/steam.sh: line 772:  7164 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
juin 23 11:00:33 host steam.desktop[152925]: /home/user/.steam/debian-installation/steam.sh: line 772: 153375 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
juin 23 11:46:40 host steam.desktop[172591]: /home/user/.steam/debian-installation/steam.sh: line 772: 172735 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
juin 24 11:01:50 host steam.desktop[200936]: /home/user/.steam/debian-installation/steam.sh: line 772: 201080 Segmentation fault      $STEAM_DEBUGGER $DEBUGGER_ARGS "$STEAMROOT/$STEAMEXEPATH" "$@"
kakra commented 3 years ago

Latest Steam beta update mentions "Initial fixes for excess memory usage downloading shader depots". Is that related?

hikarutilmitt commented 3 years ago

Been several months and nothing has changed. Is there anything else we can do to provide more information on getting this solved? I can't imagine we'd want this still happening with SteamOS 3 and the Deck coming around in a few months.

Arch kernel 5.14.8-zen1-1-zen Ryzen 5 3600 RX 5600XT 16GB RAM Mesa 21.2.2

Really, nothing has changed other than kernel and mesa versions since I last posted about this.

cyberpunkrocker-zero commented 3 years ago

@hikarutilmitt I have almost exactly the same specs as you (on Arch, of course) and I haven't had any problems for several months. This seems to be fixed for anyone else, too. Are you sure you have the latest steam client? (latest client update 2 days ago)

hikarutilmitt commented 3 years ago

@hikarutilmitt I have almost exactly the same specs as you (on Arch, of course) and I haven't had any problems for several months. This seems to be fixed for anyone else, too. Are you sure you have the latest steam client? (latest client update 2 days ago)

I'm on the beta branch and it updates pretty frequently. Right after posting my last comment I cleared the shader cache completely, restarted Steam, and then re-enabled background processing to see if the recent changes just hadn't taken somehow. I'll post the error again if it happens again.

EDIT: it's still happening, beta or no beta.


[2021-10-22 14:24:37] Nothing to do
Installing breakpad exception handler for appid(steam)/version(1634158817)
Installing breakpad exception handler for appid(steam)/version(1634158817)
***** OUT OF MEMORY! attempted allocation size: 4120 ****
src/tier0/memstd.cpp (2838) : OUT OF MEMORY
src/tier0/memstd.cpp (2838) : OUT OF MEMORY
src/tier0/memstd.cpp (2838) : Fatal assert; application exiting
src/tier0/memstd.cpp (2838) : Fatal assert; application exiting
Installing breakpad exception handler for appid(steam)/version(1634158817)
DBG DBG ffffffff DUMP_REQUESTED Assert( OUT OF MEMORY ):/data/src/tier0/memstd.cpp:2838

_ExitOnFatalAssert

If there's any other log to tell me what exactly is doing this so either I can fix some potential misconfiguration on my end or help maybe nail down the bug itself for anyone else, I'd be all for it.

kakra commented 3 years ago

That infamous bug is back for me again.

Last night, it crashed out of the blue, the system was idle, and Steam checked for updates:

Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:30] Opted in to client beta 'publicbeta' via beta file
Okt 23 05:38:09 jupiter steam[3047]: You are in the 'publicbeta' client beta.
Okt 23 05:38:09 jupiter steam[3047]: Looks like steam didn't shutdown cleanly, scheduling immediate update check
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] Loading cached metrics from disk (/home/kakra/.local/share/Steam/package/steam_client_metrics.bin)
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] Using the following download hosts for Public, Realm steamglobal
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] 1. https://cdn.cloudflare.steamstatic.com, /client/, Realm 'steamglobal', weight was 100, source = 'update_hosts_cached.vdf'
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] 2. https://cdn.akamai.steamstatic.com, /client/, Realm 'steamglobal', weight was 100, source = 'update_hosts_cached.vdf'
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] 3. http://media.steampowered.com, /client/, Realm 'steamglobal', weight was 1, source = 'baked in'
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] Checking for update on startup
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] Suche nach verfügbaren Updates...
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:31] Downloading manifest: https://cdn.cloudflare.steamstatic.com/client/steam_client_publicbeta_ubuntu12
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:32] Download skipped by HTTP 304 Not Modified
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:32] Nothing to do
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:32] Installation wird überprüft...
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:32] Performing checksum verification of executable files
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:20:47] Verification complete
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:42:51] Background update loop checking for update. . .
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:42:51] Suche nach verfügbaren Updates...
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:42:51] Downloading manifest: https://cdn.cloudflare.steamstatic.com/client/steam_client_publicbeta_ubuntu12
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:42:52] Download skipped by HTTP 304 Not Modified
Okt 23 05:38:09 jupiter steam[3047]: [2021-10-23 00:42:52] Nothing to do
Okt 23 05:38:09 jupiter steam[3047]: ***** OUT OF MEMORY! attempted allocation size: 1970089747 ****
Okt 23 05:38:09 jupiter steam[3047]: src/tier0/memstd.cpp (2838) : OUT OF MEMORY
Okt 23 05:38:09 jupiter steam[3047]: src/tier0/memstd.cpp (2838) : OUT OF MEMORY
Okt 23 05:38:09 jupiter steam[3047]: src/tier0/memstd.cpp (2838) : Fatal assert; application exiting
Okt 23 05:38:09 jupiter steam[3047]: src/tier0/memstd.cpp (2838) : Fatal assert; application exiting
Okt 23 05:38:09 jupiter steam[3047]: Installing breakpad exception handler for appid(steam)/version(1634882040)
Okt 23 05:38:09 jupiter steam[133871]: assert_20211023053809_113.dmp[133871]: Uploading dump (out-of-process)
Okt 23 05:38:09 jupiter steam[133871]: /tmp/dumps/assert_20211023053809_113.dmp
Okt 23 05:38:09 jupiter steam[3047]: _ExitOnFatalAssert

When a crash happens in the middle of a game, the game is simply insta-killed together with Steam. This is very annoying. But I seem to have no log of that occurrence (although that happened maybe just a few hours before) so it might be unrelated to update checks.

The Steam client - as an essential "background" service - really needs to become ultra-stable. Maybe concerns need to be split out from the monolithic client: the Steam API for games becomes a simple single-concern background service; the GUI can be closed separately (so a game could keep running without the relatively heavy GUI in the background, and the GUI could even crash without taking down the API and the game); and the client updater becomes an out-of-process component that is able to properly shut down the individual components and boot them up again.
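
As a rough illustration of the out-of-process idea (purely hypothetical, not how the Steam client is built), a supervisor can run a component as a child process and restart it when it dies, without going down itself. The `supervise` function and the crashing stand-in command are invented for this sketch.

```python
import subprocess
import sys


def supervise(cmd: list[str], max_restarts: int = 3) -> int:
    """Run cmd and restart it after an abnormal exit. Returns the number of restarts."""
    restarts = 0
    while True:
        result = subprocess.run(cmd)
        # Stop on a clean exit, or once the restart budget is used up.
        if result.returncode == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1


# Hypothetical stand-in for a component that always crashes.
crashing_component = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(supervise(crashing_component, max_restarts=2))
```

The point is only that the crash stays contained in the child: the supervising process survives, and anything else it manages (such as a running game) would not be taken down with it.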