Open braddanford opened 3 months ago
@SteamDeckHomebrew/decky-testers I need folks to attempt to replicate this, the more people attempting to replicate the better.
Can confirm. The delay is due to Decky loader taking too long to unload itself. Confirmed by manually stopping it before shutdown and seeing that the Steam Deck restart delay is gone. Also, stopping Decky delays significantly.
(deck@stoneship ~)$ time sudo systemctl stop plugin_loader
real 0m45.166s
user 0m0.020s
sys 0m0.012s
(deck@stoneship ~)$ sudo systemctl status plugin_loader
× plugin_loader.service - SteamDeck Plugin Loader
Loaded: loaded (/etc/systemd/system/plugin_loader.service; enabled; preset: disabled)
Active: failed (Result: timeout) since Thu 2024-08-15 04:31:24 EEST; 6s ago
Duration: 1min 18.211s
Process: 773 ExecStart=/home/deck/homebrew/services/PluginLoader (code=killed, signal=KILL)
Main PID: 773 (code=killed, signal=KILL)
CPU: 31.536s
I have been experiencing this for days during my plugin testing, so it's been here for a while.
journalctl -b0 -u plugin_loader
reveals nothing (notice the 45sec delay between first and second line):
Aug 15 04:30:39 stoneship systemd[1]: Stopping SteamDeck Plugin Loader...
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: State 'stop-sigterm' timed out. Killing.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 773 (PluginLoader) with signal SIGKILL.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 931 (PluginLoader) with signal SIGKILL.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 1032 (PluginLoader) with signal SIGKILL.
// long list of killed processes omitted
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 2087 (PluginLoader) with signal SIGKILL.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 2384 (PluginLoader) with signal SIGKILL.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Killing process 6646 (PluginLoader) with signal SIGKILL.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Main process exited, code=killed, status=9/KILL
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Failed with result 'timeout'.
Aug 15 04:31:24 stoneship systemd[1]: Stopped SteamDeck Plugin Loader.
Aug 15 04:31:24 stoneship systemd[1]: plugin_loader.service: Consumed 31.536s CPU time.
Logs of previous run attached as screenshots.
[!NOTE] Everything below is deprecated, was written while I was investigating, ignore unless you are a Decky dev and think you might catch something I missed.
It also happens when you trigger an error screen for ANY plugin, Decky itself, or even the new Valve bug on screenshot deletion, and you click the "Restart Decky" button. It will say "Restarting Decky...", "Reloading UI..." and then stay there for several dozen seconds, while you can navigate away to other menus, until it eventually does restart and reloads the Steam UI.
It also happens when Decky reloads itself after you update it in Game mode (new prerelease, or by changing the channel). It reaches the end of the update process, says Restarting decky or similar and again you have about a minute to wander around until it actually restarts.
Generally, the long delay will happen even if you manually execute
systemctl restart plugin_loader
orsystemctl stop plugin_loader
, without any output in the Terminal, just a long wait.
During this dead time, the service isn't properly responding, evidence by plugins still displaying their frontend, but erroring out when you try to do anything (or as soon as you open them depending on the plugin function).This needs to be confirmed, I might have broken my Decky due to GitHub downtime.Note: I am not sure if it happens when you trigger a Decky restart by quickly toggling "Enable React DevTools" on and off, in that case the UI does go away at the 5 second countdown, but you still have to wait some. Oh, on 2.12.3 it stays black forever. /shrug
discovered this recently as a workaround, the timeout is set to 45 seconds by default. Changing it to 2s resolved the issue for me:
sudo sed -i 's~TimeoutStopSec=.*$~TimeoutStopSec=2~g' /etc/systemd/system/plugin_loader.service
sudo systemctl daemon-reload
Ok, so we know why it's a 45 seconds delay specifically. The question still remains on why it needs to timeout in order to stop and not do it faster than that.
Is there something specific I can look for to help troubleshoot this?
Decky as far as i know doesn't capture any signals from the os, we probably should capture the sigint signal and gracefully stop decky when that goes off
@milroneth looking at the codebase, this was already implemented, huh.
Which plugins do you have installed? It's possible one is stalling the max wait time (40s) until shutdown. Logs for shutdown start at 'Shutting down...', and 'Stopping plugin {x}' (Note: This fix only exists on latest pre)
(Note: i'm currently in the train, i'll check later on deck but i don't remember decky shutdown taking too long)
Edit: On my laptop, this works fine for at least cssloader. Maybe we should be more agressive in stopping plugins?
I have A LOT of plugins installed to be honest, but OP said they even completely reset everything and still got this issue with any plugin installed. Though I guess we don't know what any plugin they chose, it might have been something specific that causes this.
Here's a log on v3.0.0-pre10 after running sudo systemctl stop plugin_loader
Stopping SteamDeck Plugin Loader LOG.txt
Is the log of any use? Do I understand correctly that it's trying to re-load certain plugins instead of unloading them?
That indeed looks very weird, and not like the behaviour i posted. I'll see if i can repro it later.
OP may be using stable decky, which does not have this fix yet.
Either way, this seems broken right now, will look into it
I don't know if my problem is related, but when shutting down I also have to wait for the plugins to unload. Now I saw when I manually stop it, that two processes are still running:
19:02:39 ~ $ sudo systemctl status plugin_loader.service
× plugin_loader.service - SteamDeck Plugin Loader
Loaded: loaded (/etc/systemd/system/plugin_loader.service; enabled; preset: disabled)
Active: failed (Result: timeout) since Tue 2024-10-22 19:02:39 CEST; 5s ago
Duration: 37min 21.165s
Invocation: 2b1d4a005ede43f8868fc298a7e3f020
Process: 973 ExecStart=/home/steph/homebrew/services/PluginLoader (code=killed, signal=KILL)
Main PID: 973 (code=killed, signal=KILL)
Tasks: 8 (limit: 30938)
Memory: 142.3M (peak: 167.4M)
CPU: 2min 52.658s
CGroup: /system.slice/plugin_loader.service
├─1009 "Decky Loader v3.0.1 (/home/steph/homebrew/services/PluginLoader)"
└─1075 "SimpleDeckyTDP (/home/steph/homebrew/plugins/SimpleDeckyTDP/main.py)"
Oct 22 18:25:33 X1 PluginLoader[1009]: [wsrouter][WARNING]: Dropping message as there is no connected socket: {'type': 3, 'event': 'loader/notify_updates', 'args': ()}
Oct 22 19:02:24 X1 systemd[1]: Stopping SteamDeck Plugin Loader...
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: State 'stop-sigterm' timed out. Killing.
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Killing process 973 (PluginLoader) with signal SIGKILL.
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Main process exited, code=killed, status=9/KILL
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Failed with result 'timeout'.
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Unit process 1009 (Decky Loader) remains running after unit stopped.
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Unit process 1075 (SimpleDeckyTDP) remains running after unit stopped.
Oct 22 19:02:39 X1 systemd[1]: Stopped SteamDeck Plugin Loader.
Oct 22 19:02:39 X1 systemd[1]: plugin_loader.service: Consumed 2min 47.046s CPU time, 167.4M memory peak.
I am on a Onexplayer X1 though and not on the steam deck....
Please confirm
Bug Report Description
Expected Behaviour
When either shutting down or rebooting the Steamdeck, the hardware should shutdown within 0 to 5 seconds
SteamOS version
SteamOS 3.5.19
Selected Update Channel
Stable
Have you modified the read-only filesystem at any point?
No
Backend Logs
deckylog.txt
Frontend Logs
cef_log.txt cef_log.previous.txt