MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0

Softlock CPU Stuck Renders My Device Unresponsive #7183

Open louisefindlay23 opened 1 month ago

louisefindlay23 commented 1 month ago

Creating a bug report/issue

Required Information

Additional Information (if applicable)

Steps to reproduce

  1. Leave the PC running with the display off and eventually the error occurs.

Expected behaviour

Actual behaviour

IMG_1198

MichaIng commented 1 month ago

Please enable persistent system logs to check where those soft locks start:

dietpi-software uninstall 103
mkdir /var/log/journal
reboot

When it happens again, browse the logs from previous boot session:

journalctl

louisefindlay23 commented 1 day ago

Sure, @MichaIng. Here is the log https://paste.debian.net/hidden/262b91b4/

Also included a slightly different picture in case that helps.

image

MichaIng commented 1 day ago

@louisefindlay23 The journalctl command should show a long list of lines, including those from the last boot session before the lockup, and also those very same watchdog and rcu errors from your screenshot. Could you check again and paste the journalctl lines from before those kernel errors, so we can see what might have caused them?

Or did the stall happen right at the time when you started that Docker container with sudo docker compose up -d?

louisefindlay23 commented 1 day ago

@MichaIng, sure. I tried journalctl and also journalctl --list-boots, but they only show one entry, even after the issue occurs again, and there is very little detail for some strange reason. I even tried setting storage to persistent.

The error doesn't occur immediately after the docker start command but some time after the containers start, and never after the same amount of time. Sometimes it's an hour or two, sometimes several days.

louisefindlay23 commented 16 hours ago

Turns out it was user error. I didn't realise I needed to use sudo. 🤦‍♀️

Hopefully this should help, @MichaIng

https://pastebin.com/7UywwLCA
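
For anyone following along: without root, journalctl may only show the calling user's own journal, which would explain the single boot entry seen earlier. The sketch below is one possible way to pull the lines leading up to the stall out of the previous boot's log. The helper name find_stalls, the default of 20 context lines, and the match pattern are assumptions for illustration, not something taken from the thread; the pattern is based on the usual wording of kernel soft lockup and RCU stall messages and may need adjusting.

```shell
# Hypothetical helper (name and defaults are assumptions): read log lines on
# stdin and print every soft lockup / RCU stall line together with the N
# lines that precede it, so the events leading up to the stall are visible.
find_stalls() {
    # $1 = number of context lines before each match (default: 20)
    grep -i -E -B "${1:-20}" 'soft lockup|rcu.*stall'
}

# Intended usage on the affected system (not executed here):
#   sudo journalctl --list-boots              # confirm older boots are stored
#   sudo journalctl -b -1 | find_stalls 50    # previous boot, 50 lines of context
```

If --list-boots still shows a single entry when run with sudo, the journal is most likely not being stored persistently yet.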