Closed ghost closed 1 year ago
That'd be really helpful. I have 12 cores / 24 threads and running fossilize on Path of Exile takes like forever (like 20-30 minutes) with fossilize using 6 threads only.
I've found a way to change the thread count, at your own risk:
Open up this url in a browser: steam://open/console
In the input field at the bottom, enter this (replace 12 by your desired thread count):
unShaderBackgroundProcessingThreads 12
Edit: I should add that this change is not persistent, unfortunately.
If you are using Linux, the command for opening the console is:
steam -console
BTW: I am using a Ryzen 2700X with 8 cores / 16 threads and I would like to use all of them, because that's why I bought a CPU with that core count.
Thanks for your work :)
Hi, this is really a problem.
Some games like the latest Metro Exodus will take up to 1 GB RAM per fossilize_replay thread:
This leads to dangerous amounts of memory being used, with the OOM killer getting involved and killing user applications left and right because Fossilize monopolizes all the machine's resources:
I'm on an 8/16 core/thread platform, and although resources are there to be used, this is definitely too much.
Please do expose a way to limit that in the UI.
Can we have a steam startup parameter so this can be set at steam start so the first shaders use the wanted number of threads?
The high memory usage should have been fixed by now, and the IO pressure should auto-adjust nicely when you're running a kernel that has /proc/pressure, it also uses lower IO priority and cache/readahead hinting now, CPU uses batch processing with very low priority now thus having less impact on foreground interactive processes.
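To check whether your kernel exposes that pressure information, and what the current IO pressure looks like, something along these lines may help. It is only a sketch: the function names psi_some_avg10 and has_psi are made up for this example, and it reads the kernel's PSI file rather than anything Fossilize itself reports.

```shell
# Print the 10-second "some" average from a PSI file such as /proc/pressure/io.
# The path is an argument so the function can be pointed at any PSI file.
psi_some_avg10() {
    psi_file=${1:-/proc/pressure/io}
    # A PSI line looks like: some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    awk '$1 == "some" { sub(/^avg10=/, "", $2); print $2 }' "$psi_file"
}

# Succeed only if the pressure file exists and is readable on this kernel.
has_psi() {
    [ -r "${1:-/proc/pressure/io}" ]
}
```

Running `has_psi && psi_some_avg10` while shaders are processing gives a rough idea of how much IO pressure the system is under at that moment.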
I have 12 threads available and want Steam to use them all so... I want to define the number of threads used.
Possible UI modification to please us:
Thanks to whoever doesn't ignore this. Or make it so it doesn't use only 25% at best of our CPU. We're not playing games when it processes shaders in the background (or foreground), so there is no need to preserve CPU usage, especially nowadays when people have multi-core CPUs. Imagine some people have 32 threads; I bet they may not even be using 8 to process shaders, what a shame!
@kakra
Hi and thanks for your work.
it also uses lower IO priority and cache/readahead hinting now, CPU uses batch processing with very low priority now thus having less impact on foreground interactive processes.
In all circumstances or only when doing background shader processing? If in the former case, wouldn't it be desirable that the priority be left to default since user has directly expressed interest in launching the game as fast as possible?
As far as I know, the fossilize author added a protocol option to the control channel for the Steam client to bump up the number of threads to the maximum on request (e.g., when the user wants to launch that game). But it isn't used by the client.
In theory, the fossilize process has two launch modes: background and foreground. In background mode it is launched with low thread count to not disrupt foreground processes. But I think the latest updates work well enough that this artificial limitation should no longer be needed: It runs in ultra low priority mode and automatically throttles immediately when other processes need IO.
Foreground mode is launched when the Steam client actually prepares shaders in foreground because the user requested starting the game. It uses all threads in this mode by default.
The problem may be migrating from background to foreground mode, e.g. when background mode is already running for the game you're launching right now.
As stated above:
I've found a way to change the thread count, at your own risk: Open up this url in a browser:
steam://open/console
In the input field at the bottom, enter this (replace 12 by your desired thread count): unShaderBackgroundProcessingThreads 12
But that is not persistent, it is reset when restarting the client.
I think it makes sense to occupy only 80% of the total CPU capacity in background mode, and that should be the default when not running on battery. That is, 6 threads on an 8-thread system, or in pseudo code max(1, floor(nthreads * 4 / 5)). The implemented IO throttle will prevent it from overshooting, so it won't overwhelm your IO capacity and will run on fewer threads automatically when needed. It also schedules itself as a batch process, so interactive processes will be preferred by the scheduler.
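The 80% rule can be sketched in a few lines of shell. Note that the lower bound has to be a maximum, not a minimum, for the "6 threads on 8" example to come out right. The function name background_threads is made up for this sketch; it is not anything Steam or Fossilize provides.

```shell
# Suggested background thread count: 80% of the hardware threads,
# rounded down by integer division, but never fewer than one.
background_threads() {
    nthreads=$1
    count=$(( nthreads * 4 / 5 ))
    if [ "$count" -lt 1 ]; then
        count=1
    fi
    echo "$count"
}
```

For example, `background_threads "$(nproc)"` would print the value to put after unShaderBackgroundProcessingThreads on the current machine.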
This also means that even when you configure 12 threads, it may run on far fewer threads if IO latency goes up because of it. Tests and issue reports have previously shown that IO latency and RAM usage had a very high performance impact in background mode.
My own tests showed that after all these improvements, foreground mode can now use 99% of the CPU capacity without disrupting interactive response and latency of the system. So yeah, it's probably time to bump up the amount of background processes by default for systems not running on battery.
But keep in mind that foreground processing may be limited by other work the system is doing, and it is very sensitive to IO latency it introduces. It may be best to put shader caches on a separate partition/device or on SSD to keep IO pressure low. Personally, I've moved my shader caches to a dedicated XFS partition cached on SSD via bcache. Seems to work well, shader pre-caching in foreground is quite fast, background stays slow because it is artificially limited to two threads. But that limit is probably enforced by the Steam client itself, not fossilize.
If you're interested in technical details, the changes are here:
According to it, it should log IO stalls. You could look into the log to see if that's true for your system in foreground mode.
Especially https://github.com/ValveSoftware/Fossilize/commit/200b19c319e2872415d74b5d3479e1624d748bc6 means that fossilize will completely pause (except one thread probably) if you're doing other stuff in parallel, like copying big files, or recording video, because IO pressure will shoot up and fossilize compensates by stopping what it's doing. That's because we found the NVIDIA driver to do a lot of inefficient IO while caching shaders: letting fossilize throttle threads down immediately in that case gives the system some room to breathe, and in the end shader processing was faster. But it doesn't work well with other unrelated concurrent IO.
It would be nice if the Steam client supported these knobs (maybe hidden behind some "advanced/expert" option), in addition to a thread count setting:
else if (strcmp(command, "IO_STALL_AUTO_ADJUST ON") == 0)
    Global::target_running_processes_io_stall = true;
else if (strcmp(command, "IO_STALL_AUTO_ADJUST OFF") == 0)
    Global::target_running_processes_io_stall = false;
else if (strcmp(command, "DIRTY_PAGE_BLOAT_AUTO_ADJUST ON") == 0)
    Global::target_running_processes_dirty_pages = true;
else if (strcmp(command, "DIRTY_PAGE_BLOAT_AUTO_ADJUST OFF") == 0)
    Global::target_running_processes_dirty_pages = false;
@kakra Thanks a lot for your explanations and code dive. It's much clearer now!
@DistantThunder, referring to https://github.com/ValveSoftware/steam-for-linux/issues/7283#issuecomment-824037541:
The memory bloat should be fixed since back in April, so threads could be bumped up. Additionally, as stated above, countermeasures have been deployed against bloating dirty page cache and overwhelming the IO capacity of the system, simply by auto-adjusting thread count down on pressure, and bumping it up again when pressure reduces (using fuzzy logic) - so it will auto stabilize at a thread count that your system can handle under its current load. Thus, it should be safe if you bump up the thread count via Steam console.
Replying to https://github.com/ValveSoftware/steam-for-linux/issues/7283#issuecomment-924033138
Yes, I noticed this has been fixed for some months now. Fossilize doesn't pressure the machine beyond reasonable levels anymore, that behaviour is solved at least on my end.
Yes, this is old news now. @kisak-valve any possibility of adding a startup switch, if not a setting similar to what I suggested here? It could simply be a tickbox for "use the maximum threads to process shaders", or anything else that would allow us to make use of our CPU, especially since Steam processes shaders most of the time when we start it, because of another bug.
I don't give up, and I believe at some point we'll have a startup option to set the number of threads for Steam to use, instead of having to open the console and type commands.
This may receive another thought after wide deployment of the Steam Deck, I think... Currently, the focus is on battery efficiency, and then it makes sense to only bump up one or two cores. I don't think anything will change on this before a few months have passed.
And after all, I don't think bumping up the background threads provides any benefit: except for the re-processing bug, this is a one-time task done in the background, so processing time doesn't matter. Foreground mode will already use all threads, and then fossilize would be done for that game. You're asking for a feature that's not really that useful across a wider range of installations, especially given that people have Steam installed on battery-driven devices (laptops, Steam Deck). I'd rather have a running Steam client process shaders slowly in the background than deplete the battery just because I didn't close it. The choice would thus be: slow processing, or no processing at all.
I'd rather ask for thread adjustment based on whether the AC mode of the system changes. This could actually benefit battery mode of a system because more work has been processed on AC. This may turn out as a good afterthought for Steam Deck. But I really don't believe bumping up the threads for your desktop system would provide much benefit. But OTOH, my system is turned on most of the time, even when idle. So I may have a different perspective on that.
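As a rough illustration of what AC-aware thread selection could look like, here is a heuristic that reads the Linux power_supply sysfs class. Everything in it is an assumption made for this sketch: the function names on_battery and threads_for_power_state are invented, the thread counts are supplied by the caller, and nothing here reflects how Steam or Fossilize actually behave.

```shell
# Succeed if a system battery under the given sysfs directory is discharging.
# Peripheral batteries (scope "Device", e.g. a wireless mouse) are ignored.
on_battery() {
    base=${1:-/sys/class/power_supply}
    for supply in "$base"/*; do
        [ -r "$supply/type" ] && [ -r "$supply/status" ] || continue
        [ "$(cat "$supply/type")" = "Battery" ] || continue
        if [ -r "$supply/scope" ] && [ "$(cat "$supply/scope")" = "Device" ]; then
            continue
        fi
        if [ "$(cat "$supply/status")" = "Discharging" ]; then
            return 0
        fi
    done
    return 1
}

# Print the first count when not on battery, the second when on battery.
# Usage: threads_for_power_state <ac-threads> <battery-threads> [sysfs-dir]
threads_for_power_state() {
    if on_battery "$3"; then
        echo "$2"
    else
        echo "$1"
    fi
}
```

A desktop without any battery falls through to the AC value, which matches the idea that more work should get done whenever the machine is on mains power.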
You'd also have to think of the target audience of the Steam Client, and that is often people who just use the system and don't want to tinker with technical details. Many people probably don't even know what they should do with such a setting, and may use it in the wrong way just because some Google result said it makes things magically "faster". So Valve should consider how to make this a more dynamic and automatic setting, without providing any GUI for it.
I could imagine it makes sense to put fossilize in a cgroup which limits power usage of the CPU for its threads. I'm not sure how well this power limiter works already in the kernel, or how well it is integrated with cgroups, or if some processors even expose such details. Maybe progress on that feature just depends on that kernel feature?
Nobody spread the information, but the setting can be set permanently with the help of a config file. Create ~/.local/share/Steam/steam_dev.cfg and add the parameter inside that file: unShaderBackgroundProcessingThreads 12
Then start Steam and observe the parameter being set when Steam launches and starts processing shaders in the background.
What a relief to not have to do it manually every time.
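For anyone who wants to script this, a small helper along these lines should do. It is a sketch only: the function name set_shader_threads is made up, the default path is the one mentioned above, and any other lines already in the file are kept as they are.

```shell
# Set (or replace) the background shader thread count in a Steam dev config.
# Usage: set_shader_threads <threads> [config-file]
set_shader_threads() {
    threads=$1
    cfg=${2:-"$HOME/.local/share/Steam/steam_dev.cfg"}
    mkdir -p "$(dirname "$cfg")"
    touch "$cfg"
    tmp=$(mktemp)
    # Drop any earlier value so the setting appears only once.
    grep -v '^unShaderBackgroundProcessingThreads ' "$cfg" > "$tmp" || true
    echo "unShaderBackgroundProcessingThreads $threads" >> "$tmp"
    # Write back in place so the file keeps its original permissions.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

Calling `set_shader_threads 12` and then restarting Steam should have the same effect as editing the file by hand.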
Is there a way to assign it to specific cores? And more importantly separate from Steam itself. I.e. Steam is at cores 12-15 and fossilize_replay at cores 0-11? I have a script to do that when I notice it running, but that's far from ideal. Currently it takes the same affinity Steam has.
Having it in tools list with ability to adjust launch options would be great.
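A script of the kind described might look roughly like the following. This is a guess at the approach, not the poster's actual script: the function names pin_pids and pin_fossilize and the TASKSET override are invented here. It uses `pgrep -f` because the name fossilize_replay is longer than the 15 characters the kernel keeps for a process name, so a plain name match would find nothing.

```shell
# Pin every thread of the given PIDs to a CPU list, e.g. "0-11".
# TASKSET can be overridden with another command for a dry run.
pin_pids() {
    cpus=$1
    shift
    for pid in "$@"; do
        ${TASKSET:-taskset} -a -c -p "$cpus" "$pid"
    done
}

# Find fossilize_replay by full command line and pin whatever is running.
# Usage: pin_fossilize <cpu-list>
pin_fossilize() {
    pids=$(pgrep -f fossilize_replay) || return 0   # nothing running
    # Word splitting on $pids is intended: one PID per argument.
    pin_pids "$1" $pids
}
```

Since new fossilize_replay processes start with the same affinity Steam has, `pin_fossilize 0-11` would have to be re-run (for example from a timer) whenever shader processing restarts, which is why this remains a workaround.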
Maybe Steam could start those sub-processes in a systemd user slice, similar to how modern desktop environments start applications. This would allow assigning cores or limiting CPU impact via a systemd config drop-in.
@kakra I have affinity set in system.conf to the last 4 threads, so Steam and every app/game you run from it follows that, unless a launch option with "taskset -c n-n" is specified per app (which I have for every game at 0-11; running it separately from the system and especially Steam drastically improves 0.1% min fps). Can only hope for Steam-wide launch options.
@kassindornelles why this issue was closed?
@kassindornelles why this issue was closed?
3 years and no response, I want to manage actual issues that will lead somewhere when I open my GitHub to see the ones I reported myself, this one is abandoned and nothing will change in regard of the feature request.
Can only hope for Steam wide launch options.
@kndgs Well, this is quite easy to do using a launcher script. I've created /usr/local/bin/gamemode which sets a common set of options and ends with exec "$@", and then simply add gamemode %command% to each game. So I have the option to run with or without these defaults per game and can even add more options like DXVK_HUD=full gamemode %command%.
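The structure of such a launcher is roughly the following. This is not the actual script: EXAMPLE_GAME_OPTION is a hypothetical placeholder for whatever shared defaults you want, and the function wrapper gamemode_launch exists only so the sketch can be tried out in a shell.

```shell
# Sketch of a launcher such as /usr/local/bin/gamemode.
# EXAMPLE_GAME_OPTION is a hypothetical stand-in for the shared defaults;
# a value already set in the environment takes precedence over the default.
gamemode_launch() {
    export EXAMPLE_GAME_OPTION="${EXAMPLE_GAME_OPTION:-1}"
    # Replace this shell with the command Steam substitutes for %command%.
    exec "$@"
}

# When saved as a standalone script, the last line would be:
#   gamemode_launch "$@"
```

Because the default is only applied when the variable is unset, a per-game launch option that sets the variable before the launcher still overrides it.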
But I've found that isolating CPU cores for system/desktop environment and CPU cores for the game might actually worsen IO latency, so I only isolate some system and desktop services to the E cores of my CPU and put them in the same systemd slice (which has AllowedCPUs=16-19). I'll let the game run on all cores unconditionally and let the scheduler decide.
It would be interesting to be able to select how many CPU cores/threads fossilize can utilize to compile shaders in the background. I have a 4-core CPU, and 2 cores would work better for me since that doesn't lag my system while I am browsing with Chrome. Right now Steam uses only 1 thread by default on my system.