ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

split lock detection spamming dmesg #8003

Open popey opened 3 years ago

popey commented 3 years ago

Your system information

CPU: model name : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GPU: 52:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)
Driver: 460.91.03

Please describe your issue in as much detail as possible:

While Steam is open, my syslog is spammed with x86/split lock detection thus:

[Wed Aug 25 21:18:32 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383782 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:18:33 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383767 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:18:35 2021] x86/split lock detection: #AC: CHTTPClientThre/1383923 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:20:34 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383766 took a split_lock trap at address: 0xf23d6263

If I close Steam, it stops.

Steps for reproducing this issue:

  1. Install Kubuntu 21.04 on a ThinkPad X1C9 (Carbon)
  2. Add an NVIDIA GPU in a Thunderbolt enclosure
  3. Run Steam.

JakeMoe commented 2 years ago

I'm getting the same on my Dell Latitude 5520 laptop (i7 iGPU + Nvidia MX450 dGPU) running Gentoo.

facundoq commented 2 years ago

Same thing on Asus Vivobook (i5 1035G1 + nvidia mx350) on kubuntu 20.04, uname -r relevant parts:

5.11.0-37-generic #41~20.04.2-Ubuntu

whp199 commented 2 years ago

I'm getting the same on my Thinkpad X1 Extreme Gen 4 laptop (i7-11850 + RTX 3070 max-q) running Gentoo. Thousands of these per minute. dmesg log sample:

[36016.234614] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36062.856553] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.053937] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.250999] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.447947] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.646125] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.843318] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.040236] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.237212] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.434420] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.632189] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36071.445435] split_lock_warn: 1 callbacks suppressed
[36071.445437] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36109.158387] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.207848] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.306497] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36109.355958] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.405409] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.504461] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36109.553897] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.603366] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.702930] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36160.615124] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36160.812676] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.009782] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.207078] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.404459] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.602186] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.800162] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.998170] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36177.196524] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36182.346968] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36207.067028] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36207.315238] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.512238] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.710316] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.908317] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.105540] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.303372] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.500928] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.699036] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.896665] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163

uname -a: Linux x1e 5.14.10-gentoo-ligma #1 SMP Fri Oct 8 20:38:34 CDT 2021 x86_64 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz GenuineIntel GNU/Linux

Latest version of steam client:

Steam client version (build number or date): Oct 6th, 2021 at 18:51:29
Steam API: v020
Steam Package Versions: 1633666232
Distribution (e.g. Ubuntu): Gentoo
Opted into Steam client beta?: [Yes/No] No.
Have you checked for system updates?: [Yes/No] Yes.

CPU: model name : 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
GPU: VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1)
Driver: 470.63.01

Lepr0 commented 2 years ago

Same logs

[ 5656.175277] x86/split lock detection: #AC: CHTTPClientThre/21899 took a split_lock trap at address: 0x566a9e23
[ 5659.223628] x86/split lock detection: #AC: CHTTPClientThre/22014 took a split_lock trap at address: 0xea44b163
[ 5659.245241] x86/split lock detection: #AC: CJobMgr::m_Work/22000 took a split_lock trap at address: 0xea44b163

on my

Host: ZenBook UX325EA
OS: Ubuntu 21.10 x86_64
Kernel: 5.13.0-20-generic
CPU: 11th Gen Intel i7-1165G7 (8) @ 4.700GHz
GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics]
Memory: 5427MiB / 15699MiB

Once there was a hard freeze after that. Not good.

paboum commented 2 years ago

Happens for me too:

Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32402 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163

Explained here: https://lwn.net/Articles/786239/

Once more, Valve programmers caught not knowing how to code.

To stop the kernel from killing those suboptimal processes, I added clearcpuid=split_lock_detect to my grub config.

kakra commented 2 years ago

I'm seeing this, too. Steam Beta as of 2021-12-08:

[84130.796009] x86/split lock detection: #AC: CHTTPClientThre/3752 took a split_lock trap at address: 0xf20bb273
[84130.944128] x86/split lock detection: #AC: CJobMgr::m_Work/3261 took a split_lock trap at address: 0xf20bb273
[84133.751362] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84135.721442] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84135.967341] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84136.460236] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84136.706968] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84137.693282] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84138.186812] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84139.765611] x86/split lock detection: #AC: CJobMgr::m_Work/3352 took a split_lock trap at address: 0xf20bb273
[84141.196171] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84142.527785] x86/split lock detection: #AC: CHTTPClientThre/3270 took a split_lock trap at address: 0xf20bb273
[84144.648846] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84148.350382] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84149.483809] x86/split lock detection: #AC: CJobMgr::m_Work/3262 took a split_lock trap at address: 0xf20bb273

This problem wasn't present with my old Ivy Bridge system, but since I upgraded to Alder Lake, I'm seeing this.

Split lock detection is thus probably a feature of modern CPUs, but the underlying problem does NOT hit ONLY modern CPUs; it is also present in older ones. According to the kernel commits, this detection was added to find bad process behavior which negatively affects the performance of the whole system (even unrelated processes). I thus believe Valve should fix this, especially since Steam is about gaming, and gaming is about performance. The linked LWN article (https://github.com/ValveSoftware/steam-for-linux/issues/8003#issuecomment-965956455) indicates that fixing this may be as easy as recompiling with properly adjusted alignment.

However, I don't think processes are killed by the kernel as suggested in the previous comment: in my logs, I see repeating PID patterns, which indicates that the same threads take the trap over and over again; the kernel would not recycle PIDs in a way that would explain this. The LWN article also says that killing offending processes can be one way to address the problem. Maybe it becomes the default in the future, so it should be fixed sooner rather than later.

         -/oyddmdhs+:.                kakra@jupiter
     -odNMMMMMMMMNNmhy+-`             -------------
   -yNMMMMMMMMMMMNNNmmdhy+-           OS: Gentoo Base System release 2.7 x86_64
 `omMMMMMMMMMMMMNmdmmmmddhhy/`        Host: Z690 Pro RS
 omMMMMMMMMMMMNhhyyyohmdddhhhdo`      Kernel: 5.15.6-gentoo
.ydMMMMMMMMMMdhs++so/smdddhhhhdm+`    Uptime: 23 hours, 43 mins
 oyhdmNMMMMMMMNdyooydmddddhhhhyhNd.   Packages: 2046 (emerge), 13 (flatpak)
  :oyhhdNNMMMMMMMNNNmmdddhhhhhyymMh   Shell: fish 3.1.2
    .:+sydNMMMMMNNNmmmdddhhhhhhmMmy   Resolution: 1920x1080, 3840x2160, 3840x2160
       /mMMMMMMNNNmmmdddhhhhhmMNhs:   DE: Plasma 5.23.4
    `oNMMMMMMMNNNmmmddddhhdmMNhs+`    WM: KWin
  `sNMMMMMMMMNNNmmmdddddmNMmhs/.      Theme: Breeze Light [Plasma], Breeze [GTK2/3]
 /NMMMMMMMMNNNNmmmdddmNMNdso:`        Icons: [Plasma], breeze [GTK2/3]
+MMMMMMMNNNNNmmmmdmNMNdso/-           Terminal: konsole
yMMNNNNNNNmmmmmNNMmhs+/-`             Terminal Font: Fantasque Sans Mono 14
/hMMNNNNNNNNMNdhs++/-`                CPU: 12th Gen Intel i7-12700K (20) @ 6.300GHz
`/ohdmmddhys+++/:.`                   GPU: NVIDIA GeForce GTX 1660 Ti
  `-//////:--.                        Memory: 8330MiB / 31885MiB

reanimus commented 2 years ago

I'm seeing this on my Framework laptop as well, using Arch Linux.

                   -`                    animus@Xenon 
                  .o+`                   ------------ 
                 `ooo/                   OS: Arch Linux x86_64 
                `+oooo:                  Host: Framework FRANBMCP0C 
               `+oooooo:                 Kernel: 5.16.0-rc4-next-20211210-1-next-git-06579-gea922272cbe5 
               -+oooooo+:                Uptime: 6 days, 13 hours, 11 mins 
             `/:-:++oooo+:               Packages: 1065 (pacman) 
            `/++++/+++++++:              Shell: bash 5.1.12 
           `/++++++++++++++:             Resolution: 2256x1504 
          `/+++ooooooooooooo/`           DE: GNOME 41.2 (Wayland) 
         ./ooosssso++osssssso+`          WM: Mutter 
        .oossssso-````/ossssss+`         WM Theme: Arc-Dark 
       -osssssso.      :ssssssso.        Theme: WhiteSur-dark [GTK2/3] 
      :osssssss/        osssso+++.       Icons: Papirus-Dark [GTK2/3] 
     /ossssssss/        +ssssooo/-       Terminal: gnome-terminal 
   `/ossssso+/:-        -:/+osssso+-     CPU: 11th Gen Intel i7-1185G7 (8) @ 4.800GHz 
  `+sso+:-`                 `.-/+oso:    GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics] 
 `++:.                           `-/+/   Memory: 8931MiB / 64098MiB 
 .`                                 `/

dmesg:

...
[29941.583864] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29941.584627] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29941.586852] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29946.171847] split_lock_warn: 85 callbacks suppressed
[29946.171851] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
[29946.575989] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
[29946.628420] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
...

Shished commented 2 years ago

I'm getting this on Arch with Intel Alder Lake CPU. Using Steam beta.

bxkx commented 2 years ago

Also started happening for me quite recently... super annoying. Spams for almost the entire duration Steam is running.

kakra commented 2 years ago

@kisak-valve This affects all distros with a kernel that has split lock detection enabled, and with CPUs that can detect and report this situation to the kernel (although earlier CPUs might be affected as well). Future kernels will eventually kill such processes; currently it's a warning only. Most of these logs come from the Steam client processes themselves, please fix it. I'm also seeing this with some Uplay titles, but it totally disappears in the log noise generated by the Steam client.

I'm currently using split_lock_detect=off as a kernel parameter to stop the log spamming, but this isn't really helpful: the message is there to point to a situation degrading performance of the whole CPU.

t-8ch commented 2 years ago

Beginning with 5.19 the kernel will "make life miserable for split lockers".

endrift commented 2 years ago

Getting this on my machine too, and it seems to be periodically causing various other drivers to time out operations sometimes, including, but not limited to:

This is bad enough that it could cause data corruption in various cases, potentially on the Steam Deck too.

It's easy to trigger this repeatedly just by doing Steam Remote Play, which will sometimes even crash as a result of this.

In fact, it always crashes on exit (but it's silent outside of dmesg):

[197436.256887] streaming_clien[2629290]: segfault at 55b709e5c ip 000055b70825b744 sp 00007f92605f5d50 error 4 in streaming_client[55b707f38000+b86000]

Always in the same place too, modulo ASLR

Plagman commented 2 years ago

Can you re-test on the Steam Client Beta?

https://steamcommunity.com/groups/SteamClientBeta/announcements/detail/3387287522102609359

kakra commented 2 years ago

> Can you re-test on the Steam Client Beta?

I can still see it but it seems much less noisy:

[  112.572659] x86/split lock detection: #AC: CHTTPClientThre/5662 took a split_lock trap at address: 0xf21846d3

Only one occurrence so far. Steam Client Beta 2022-07-21

Plagman commented 2 years ago

Thanks, it's likely that some uses were missed. Will keep looking for them.

kakra commented 2 years ago

Some Assassin's Creed games also throw that message, but I'm not sure if Wine or the Steam client could do something about it. In the light of future kernels killing such processes, how could that be handled? I'm not sure if Ubisoft would be interested in fixing such things; it's probably a non-issue under Windows?

Plagman commented 2 years ago

The split lock can come from the game process even if it's caused by Steam (eg. overlay locking primitives), so I was hoping that reports of game instances would go away after this fix. If some pre-existing games indeed rely on split locks in their own code, I think we'll have to discuss the situation with upstream further and alert them to the fact there are pre-existing applications that are not under active maintenance that Linux desktop users still want to run. This might be a case of desktop-oriented distributions having to disable the mitigations by default.

kakra commented 2 years ago

Okay, so I'll retest the games I've seen logging this in the past - and report back here? Or per game?

kakra commented 2 years ago

Here are two other occurrences, one with different address, one with different thread name:

[272502.307107] x86/split lock detection: #AC: CIPCServer::Thr/5510 took a split_lock trap at address: 0xf218472d
[272502.330099] x86/split lock detection: #AC: CJobMgr::m_Work/5656 took a split_lock trap at address: 0xf21846d3

kakra commented 2 years ago

These seem to be the only messages left after a reboot, HTH:

[   26.485020] x86/split lock detection: #AC: CHTTPClientThre/3024 took a split_lock trap at address: 0x565fc3e3
[   49.462890] x86/split lock detection: #AC: CHTTPClientThre/3576 took a split_lock trap at address: 0xf1e496d3
[   49.462923] x86/split lock detection: #AC: CHTTPClientThre/3575 took a split_lock trap at address: 0xf1e496d3

No game was started, the client just booted (and probably did its thing with fossilize and maybe spawning some prefix updates or whatever spawns wine processes after reboot).

ljrk0 commented 1 year ago

In Linux 6.2 the kernel will actively punish split locks, and the sysctl kernel.split_lock_mitigate=0 would need to be set to disable this behavior. Steam fixing this would be highly appreciated.

https://lwn.net/Articles/911219/
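
For reference, the current value of that setting can be read from procfs on kernels that have it. Below is a minimal Python sketch, assuming kernel.split_lock_mitigate is exposed as /proc/sys/kernel/split_lock_mitigate (the usual mapping for kernel.* sysctls). The function names are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed procfs location of the kernel.split_lock_mitigate sysctl.
SYSCTL_PATH = Path("/proc/sys/kernel/split_lock_mitigate")


def parse_mitigate(text: str) -> Optional[int]:
    """Parse the sysctl file contents; return None if it is not a number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_split_lock_mitigate(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the sysctl value, or None if the file is missing or unreadable."""
    try:
        return parse_mitigate(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    value = read_split_lock_mitigate()
    if value is None:
        print("kernel.split_lock_mitigate is not available on this kernel")
    else:
        print(f"kernel.split_lock_mitigate = {value}")
```

If the file is missing, the running kernel simply does not provide this knob.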

HBRJZ commented 1 year ago

While the amount has gotten less, these still happen as of today:

x86/split lock detection: #AC: vulkandriverque/12320 took a split_lock trap at address: 0xf6cf9c47
x86/split lock detection: #AC: CHTTPClientThre/43404 took a split_lock trap at address: 0xe86df6d3
x86/split lock detection: #AC: CIPCServer::Thr/45705 took a split_lock trap at address: 0xe860b72d
x86/split lock detection: #AC: CJobMgr::m_Work/45708 took a split_lock trap at address: 0xe860b6d3
x86/split lock detection: #AC: ThreadedValidat/46251 took a split_lock trap at address: 0xe860b6d3
x86/split lock detection: #AC: CSystemManager:/11895 took a split_lock trap at address: 0xe872a19f

kakra commented 1 year ago

Actually, since the introduction of the new Big Picture mode into the client (although I don't use it), the spamming has increased again. But this is only one part of the problem: active punishing by the kernel will probably just throttle the Steam client itself, which shouldn't be such a big issue (unless the split locks are also in the API code games are using).

The bigger problem is the games themselves, which Steam cannot do much about. Unless Microsoft introduces some similar way of punishing split locks in Windows, game devs won't fix it. And even if it did, what about the older/legacy games? Gamers and single-user desktops should probably just use kernel.split_lock_mitigate=0.

The punishing is about preventing one user's process from slowing down the processes of other users. This is a non-issue on single-user systems, such as when you are gaming on a desktop: the game mostly slows itself down by using split locks, and it's built around this performance characteristic. We should just ensure that other processes running on the system don't introduce additional performance costs, and that's why the Steam client should avoid these as much as possible.

HBRJZ commented 1 year ago

And one more:

x86/split lock detection: #AC: CNet Encrypt:0/13152 took a split_lock trap at address: 0xe867319f

luisalvarado commented 1 year ago

Got this one today. When Witcher 3 crashed it showed this:

[ 1771.344713] x86/split lock detection: #AC: CSystemManager:/13367 took a split_lock trap at address: 0xf15c619f

And when I was playing Bendy 1 it showed this (but it did not crash, I just pressed ALT+F4):

[ 2309.763555] x86/split lock detection: #AC: Bendy and the I/16798 took a split_lock trap at address: 0x3f40f64

HBRJZ commented 1 year ago

And some more right after starting Steam, it downloaded some shaders I guess:

x86/split lock detection: #AC: CContentUpdateC/11219 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11220 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11258 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11259 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11261 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11262 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11263 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11264 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11266 took a split_lock trap at address: 0xee65019f

kakra commented 1 year ago

I'm currently seeing mostly this address (but a lot of them):

[499682.137546] x86/split lock detection: #AC: CJobMgr::m_Work/1429853 took a split_lock trap at address: 0xf1794f0f
[499682.186676] x86/split lock detection: #AC: CJobMgr::m_Work/1442500 took a split_lock trap at address: 0xf1794f0f
[499692.555101] x86/split lock detection: #AC: CJobMgr::m_Work/1442500 took a split_lock trap at address: 0xf1794f0f
[499692.703426] x86/split lock detection: #AC: CJobMgr::m_Work/1429612 took a split_lock trap at address: 0xf1794f0f

Steam package version: 1671133406

It looks like these are the only ones left currently, at least for an idle client:

# dmesg -t|grep "split lock"|sed 's#/[0-9]\+#/PID#'|sort -u
x86/split lock detection: #AC: CJobMgr::m_Work/PID took a split_lock trap at address: 0xf1794f0f
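
The same deduplication can be sketched in Python when a bit more detail is wanted, for example a count per thread name and address. This is only an illustration: the regular expression is modelled on the log lines quoted in this thread, and the function name summarize_split_locks is made up.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern modelled on the messages quoted in this thread, e.g.
# "x86/split lock detection: #AC: CJobMgr::m_Work/1429853 took a
#  split_lock trap at address: 0xf1794f0f"
TRAP_RE = re.compile(
    r"x86/split lock detection: #AC: (?P<comm>.+)/(?P<pid>\d+) "
    r"took a split_lock trap at address: (?P<addr>0x[0-9a-fA-F]+)"
)


def summarize_split_locks(lines: Iterable[str]) -> Counter:
    """Count split-lock warnings per (thread name, address), ignoring PIDs."""
    counts: Counter = Counter()
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            counts[(match["comm"], match["addr"])] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from this thread; in practice the input would be
    # the output of `dmesg -t`, read line by line.
    sample = [
        "x86/split lock detection: #AC: CJobMgr::m_Work/1429853 took a split_lock trap at address: 0xf1794f0f",
        "x86/split lock detection: #AC: CJobMgr::m_Work/1442500 took a split_lock trap at address: 0xf1794f0f",
        "x86/split lock detection: #AC: CNet Encrypt:0/13152 took a split_lock trap at address: 0xe867319f",
    ]
    for (comm, addr), n in summarize_split_locks(sample).most_common():
        print(f"{n:6d}  {comm}  {addr}")
```

Lines that do not match, such as the "callbacks suppressed" notices, are skipped, so the counts are a lower bound on the real number of traps.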

luisalvarado commented 1 year ago

What worked for me (At least for now) was adding to /etc/default/grub the following:

GRUB_CMDLINE_LINUX_DEFAULT="split_lock_detect=off"

Setting split_lock_detect=off stopped the app from crashing or closing rather abruptly. I also read the following about the 6.2 kernel: https://www.phoronix.com/news/Linux-Splitlock-Hurts-Gaming
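
To confirm that the parameter actually reached the kernel (after regenerating the GRUB configuration, e.g. with update-grub on Ubuntu, and rebooting), the boot command line can be inspected. Below is a minimal Python sketch, assuming the standard /proc/cmdline file. The helper name split_lock_detect_setting is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def split_lock_detect_setting(cmdline: str) -> Optional[str]:
    """Return the value of split_lock_detect= from a kernel command line.

    Returns None when the parameter is absent. If it appears more than
    once, the last occurrence is returned (an assumption about precedence).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "split_lock_detect" and sep:
            value = val
    return value


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""  # not on Linux, or procfs is unavailable
    setting = split_lock_detect_setting(cmdline)
    print(f"split_lock_detect = {setting if setting is not None else '(not set)'}")
```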

kakra commented 1 year ago

> What worked for me (At least for now) was adding to /etc/default/grub the following:

Yes and no: that successfully prevents the kernel from complaining, adding any additional performance penalties, or even killing a process. But it also silently ignores the hardware-based performance hit that comes with that incident. This is not about silencing a kernel message, it's about removing the CPU-wide performance hit by avoiding the situation in the code causing it (Steam in this case).

Silencing the message does not prevent the performance hit that the message actually tries to point at. Kernel 6.2 does add an artificial performance penalty for the process causing the situation, in favor of not slowing down other processes in the system, which is bad for games, since they usually cause this situation. You can avoid that additional artificial cost by adding the parameter, but you cannot avoid the hardware performance hit that comes with a split lock in the first place.

You actually want Steam to not cause bus locks, which happen when a lock operation crosses a cache line: this will slow down your game or cause micro stutters if Steam does it in the background. It actually affects all processes running in parallel. If a game does it, this is acceptable (because the game is designed around that performance characteristic and it is the only foreground process you care about), but if background processes do that, it will hurt the performance of processes potentially important to you.
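
To make the "crosses a cache line" condition concrete: an access of a given size is split when its first and last byte fall into different cache lines. Below is a small Python sketch, assuming 64-byte cache lines (typical for current x86 CPUs, but an assumption here) and using made-up operand addresses. As far as I understand, the addresses in the dmesg lines are those of the trapping instruction, not of the data operand, so they are not meaningful input for this check.

```python
CACHE_LINE_SIZE = 64  # bytes; assumed, typical for current x86 CPUs


def crosses_cache_line(addr: int, size: int, line: int = CACHE_LINE_SIZE) -> bool:
    """True if the bytes [addr, addr + size) span more than one cache line."""
    if size <= 0:
        raise ValueError("size must be positive")
    return addr // line != (addr + size - 1) // line


if __name__ == "__main__":
    # Hypothetical operand addresses for a 4-byte locked operation.
    for addr in (0x1000, 0x103C, 0x103E):
        print(hex(addr), crosses_cache_line(addr, 4))
```

A naturally aligned operand (address a multiple of its size, for sizes that divide the line size) can never cross a line, which is why fixing alignment is enough to avoid the split lock.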

luisalvarado commented 1 year ago

I understand the silent part (not showing when checking dmesg), but how is it explained that only when I have that, I can play for example CS:GO, Witcher 3, or Cyberpunk for hours, and without it I don't even last 2 to 5 minutes before a crash happens? The only thing I changed was that. Regarding the penalty I would not know; all I was able to check was that if I have it, it does not crash, or at least not for several hours of continuous play. If I remove it, you can be sure I will never get past 5 minutes.

Could there be something else related to kernel 5.19 on Ubuntu 22.10?

luisalvarado commented 1 year ago

Also, for the Steam part, I am 200% with you that either the game or Steam should handle it and fix it.

JulianGro commented 9 months ago

There is one more lock that seems to not be mentioned in here:

x86/split lock detection: #AC: CNet Encrypt:0/17936 took a split_lock trap at address: 0xe732616f

This is right when starting Steam. Not sure if this is related to Steam temporarily freezing my computer when starting.

Steam version:  1702079146
Steam client build date:  Fri, Dec 8, 1:33 UTC -08:00
Steam web build date:  Sat, Dec 9, 0:30 UTC -08:00

EDIT: Actually I ran across more

x86/split lock detection: #AC: IPC:CSteamEngin/17834 took a split_lock trap at address: 0xe73261aa

vitacell commented 9 months ago

Hi guys. I bought a new laptop with an i5-1135 and tested it with Windoze10; all ran fine. I installed ArchLinux and Steam. Some of Valve's Source engine games I play were working very, very badly on ArchLinux. We are talking about more than 200fps on Windoze10 vs 30fps on GNU/Linux, for the same game and same config. This crap happens because some geniustard Intel un-engineer Tony Luck decided to break userspace software, the kind of people who love to break other people's finished and working software just for fun. I cannot even understand how things like this can get accepted into the kernel. What I found is that it affects at least Valve's Source engine based games, like Day of Defeat: Source, Counter-Strike: Source, Portal 1. Those run at 30 fps. I wasted the whole day trying to fix already working things. Other users may just go back to Windows (a faster, painless fix, sadly).

JulianGro commented 9 months ago

@vitacell I fail to see how this fits here. This is an issue tracker for Steam, and split-locks should not be used since they slow the system down significantly. Valve seems to understand this and has removed almost all their uses for split-locks in Steam as far as I can tell.

vitacell commented 9 months ago

> @vitacell I fail to see how this fits here. This is an issue tracker for Steam, and split-locks should not be used since they slow the system down significantly. Valve seems to understand this and has removed almost all their uses for split-locks in Steam as far as I can tell.

"as far as I can tell"? Valve's Source engine based games are still suffering from that, and they don't care very much about it. And users are left on their own. It's not so hard to add "split_lock_detect=off", yeah. But it's not funny buying new hardware and playing old games at 30fps, then wasting your whole day trying a ton of fixes and workarounds, trying to figure out what is happening. The quick fix is to install Windows, and this is what usually happens.

kakra commented 9 months ago

This throttling has actually been introduced because split locks are expensive for performance. This can be a real problem on cloud machines when someone accidentally or on purpose floods the system with split locks. The "fix" by the kernel devs was to throttle the processes causing the split locks, and it was also introduced so that developers fix their badly behaving software. So this is actually a valid and proper fix and does not break user-space. But the problem is that Windows games in particular show this exact bad behavior. So I suggest desktop- or game-focused distributions should really ship with a kernel that turns the throttling off by default.

You shouldn't point at the kernel devs. The kernel never tries to break user-space; commits that do are usually considered bugs or regressions. But in this case, it prevents a real problem and penalizes the processes causing it, so it is a fix for a performance regression caused by badly behaving processes.

You should rather ask distributions to turn off split-lock detection by default, at least for game-focused kernels, because old games won't be fixed, and current games won't be fixed either because Windows does no penalization. Such a kernel stops penalizing the games, although the CPU still performs badly in split-lock situations. Only the extra penalty to the causing process would be prevented.

And yes, Steam has removed most if not all split-lock uses in the Steam client itself but that doesn't magically fix games that use split-locks.

We can only hope that game devs consider Linux performance (through the Steam Deck probably) and fix their games to not use split-locks. But for this to happen, it's probably better to not ship a kernel with split-lock detection disabled by default. So this is a double-edged sword.

RobusTetus commented 5 days ago

Is this being worked on? It takes just opening Steam and letting games update. There is no need to start any game, and dmesg already reports the following:

[ 1353.685982] x86/split lock detection: #AC: CHTTPClientThre/7356 took a split_lock trap at address: 0xe988b1ef
[ 1359.668839] warning: `ThreadPoolForeg' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
[ 1375.266097] x86/split lock detection: #AC: CHTTPClientThre/7673 took a split_lock trap at address: 0xe988b1ef
[ 1379.850865] x86/split lock detection: #AC: CHTTPClientThre/7772 took a split_lock trap at address: 0xe988b1ef
[ 1473.812157] x86/split lock detection: #AC: CHTTPClientThre/8044 took a split_lock trap at address: 0x56646d1f
[ 1621.113060] x86/split lock detection: #AC: IPC:CSteamEngin/7341 took a split_lock trap at address: 0xe988b22a

This is on Steam in Flatpak, but it does not matter how Steam is installed (deb, rpm, flatpak).