pop-os / pop

A project for managing all Pop!_OS sources
https://system76.com/pop

desktop becomes unresponsive using NVIDIA 410 driver and soft lockups after inactivity for extended periods #422

Open bjb1959 opened 5 years ago

bjb1959 commented 5 years ago

Distribution (run cat /etc/os-release):

Related Application and/or Package Version (run apt policy $PACKAGE NAME):

Issue/Bug Description:

Steps to reproduce (if you know):

Expected behavior:

Other Notes:

bjb1959 commented 5 years ago

I am on a desktop PC, not a laptop; just thought you should know. I love the distro so far, but there is one thing I had to do. Since I leave my desktop running all the time, I noticed that overnight, or after a long period of inactivity, the desktop would become unresponsive. The mouse pointer would move, but after clicking on an icon I would have to wait several minutes for the system to wake up, so to speak, and it was still sluggish when opening and closing apps. I finally had to disable the System76 repos, add the NVIDIA repos, and install the latest 415 drivers to solve the issue. I needed those drivers anyway, since this is a pretty stout gaming system: I run Steam with Linux and Windows games like Overwatch, and needed the latest NVIDIA driver and wine-staging. Just wanted you to be aware.

bjb1959 commented 5 years ago

An update: in my research I found that AMD Ryzen-based CPUs have a bug in the C6 power state that causes soft lockups when you move the mouse or hit a key after a long period of inactivity. Since following the instructions below, I have not had any soft lockups. Latency is still an issue on the NVIDIA 410 driver for some reason, but the 415 driver solves that. The soft lockup was the worse issue, and these instructions have solved it so far. They come from: https://forum.manjaro.org/t/fix-ryzen-lockups-related-to-low-system-usage/39723

@mioc has described a method for disabling the C6 power state reliably in another topic. Here I want to add my own experiences, simplify the process altogether, and create an editable wiki so other people can add theirs.

There are a couple of tips floating around on the internet for fixing Ryzen lockups related to low system usage. Typically, your system is not doing much at all (showing a movie, playing music, displaying the same simple website for an extended period of time), and when you want more CPU power, e.g. by moving the mouse, it freezes. The cause of this freeze/crash is a bug in the C6 power/sleep state of first-generation Ryzen CPUs.

I have collected and tested these tips in the past:

Setting rcu_nocbs=0-11 (for a 12-thread CPU) as a boot parameter in /etc/default/grub. This setting offloads RCU callback processing from the listed CPUs, which is supposed to decrease the number of times Ryzen CPUs enter the C6 sleep state. My system still kept crashing!

Setting processor.max_cstate=5 as a boot parameter in /etc/default/grub. This setting is supposed to disable the C6 sleep state altogether, but my system kept crashing. Probably the setting was overridden by another process.
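Before concluding that a boot parameter had no effect, it may be worth checking that it actually reached the kernel, since an edit to /etc/default/grub only applies once the GRUB configuration has been regenerated and the machine rebooted. The following is a minimal sketch of such a check; the helper name `has_kernel_param` is made up for this example and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# has_kernel_param PARAM CMDLINE  (hypothetical helper for this example)
# Succeeds if PARAM appears on the given kernel command line, either bare
# (e.g. "quiet") or with a value (e.g. "processor.max_cstate=5").
# The body runs in a subshell so that "set -f" does not leak to the caller.
has_kernel_param() (
    set -f   # no glob expansion while splitting the command line into words
    for word in $2; do
        case "$word" in
            "$1"|"$1"=*) return 0 ;;
        esac
    done
    return 1
)

# /proc/cmdline holds the command line the running kernel was booted with.
cmdline="$(cat /proc/cmdline 2>/dev/null || true)"
for p in rcu_nocbs processor.max_cstate; do
    if has_kernel_param "$p" "$cmdline"; then
        echo "$p is set on the running kernel"
    else
        echo "$p is NOT set on the running kernel"
    fi
done
```

If a parameter is reported as not set, the lockups that followed say nothing about whether that parameter helps.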

The only method that works for me (I have already gone almost a month without crashes) is described in the following tutorial. I have only tested it on kernels 4.14 and 4.15.

load MSR kernel module during boot:
sudo nano /etc/modules-load.d/modules.conf
add the following line and save the file:

    msr
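Editing the file by hand works, but the same step can be scripted so that running it twice does not add a duplicate line. This is only a sketch: `ensure_module_line` is a hypothetical helper, and the demonstration writes to a temporary file instead of the real configuration.

```shell
#!/bin/sh
# ensure_module_line FILE MODULE  (hypothetical helper for this example)
# Appends MODULE to FILE only if no line already consists of exactly MODULE.
ensure_module_line() {
    file="$1"
    module="$2"
    # -q: quiet, -x: match the whole line, -F: treat the name as a fixed string
    if ! grep -qxF "$module" "$file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$file"
    fi
}

# Demonstration on a temporary file rather than the real configuration.
demo_file="$(mktemp)"
ensure_module_line "$demo_file" msr
ensure_module_line "$demo_file" msr   # second call changes nothing
cat "$demo_file"
```

To apply it for real, the function would be called from a root shell with /etc/modules-load.d/modules.conf as the file. Running `sudo modprobe msr` loads the module immediately, without waiting for the next boot.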

get zenstates from GitHub:
cd ~
git clone https://github.com/r4m0n/ZenStates-Linux.git
move zenstates.py to a place you can leave it and forget about it:
sudo cp ZenStates-Linux/zenstates.py /usr/local/bin/

create systemd service:
sudo nano /usr/lib/systemd/system/ryzen-disable-c6.service
enter the following code and save it:

    [Unit]
    Description=Disable C6 power state on Ryzen CPUs
    DefaultDependencies=no
    After=sysinit.target local-fs.target
    Before=basic.target

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/zenstates.py --c6-disable

    [Install]
    WantedBy=basic.target
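The unit file can also be written without an editor by using a here-document, which avoids the risk of the lines being joined together when pasted. A minimal sketch, assuming a hypothetical helper `write_c6_unit` that takes the target directory as its argument; the demonstration writes into a temporary directory.

```shell
#!/bin/sh
# write_c6_unit DIR  (hypothetical helper for this example)
# Writes the unit from this tutorial into DIR/ryzen-disable-c6.service.
# The quoted 'EOF' keeps the shell from expanding anything inside the text.
write_c6_unit() {
    cat > "$1/ryzen-disable-c6.service" <<'EOF'
[Unit]
Description=Disable C6 power state on Ryzen CPUs
DefaultDependencies=no
After=sysinit.target local-fs.target
Before=basic.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/zenstates.py --c6-disable

[Install]
WantedBy=basic.target
EOF
}

# Demonstration in a temporary directory rather than a real unit directory.
unit_dir="$(mktemp -d)"
write_c6_unit "$unit_dir"
cat "$unit_dir/ryzen-disable-c6.service"
```

The systemd documentation lists /etc/systemd/system as the directory for units created by the administrator, while /usr/lib/systemd/system is meant for units installed by packages, so passing /etc/systemd/system from a root shell may be the safer choice. After writing a unit, `sudo systemctl daemon-reload` makes systemd pick it up.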

enable systemd service:
sudo systemctl enable ryzen-disable-c6

delete downloaded folder from github:
cd ~
sudo rm -r ZenStates-Linux

reboot your system

make sure everything has worked:

    check whether the msr kernel module is loaded (the following command should produce output):
    lsmod | grep msr

    check whether the C6 power state is disabled:
    sudo /usr/local/bin/zenstates.py -l
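Note that `lsmod | grep msr` matches any module whose name merely contains "msr", and prints nothing if the MSR driver is built into the kernel instead of being loaded as a module. A slightly stricter check can be sketched as follows; `module_listed` is a hypothetical helper that compares the first field of /proc/modules (the file lsmod reads) exactly.

```shell
#!/bin/sh
# module_listed MODULE [MODULES_FILE]  (hypothetical helper for this example)
# Succeeds if MODULE is exactly the first field of some line in MODULES_FILE,
# which defaults to /proc/modules.
module_listed() {
    module="$1"
    modules_file="${2:-/proc/modules}"
    awk -v m="$module" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

if module_listed msr 2>/dev/null; then
    echo "msr is loaded as a module"
elif [ -e /dev/cpu/0/msr ]; then
    # The MSR driver exposes /dev/cpu/N/msr even when it is built in.
    echo "msr is not listed as a module, but /dev/cpu/0/msr exists"
else
    echo "msr does not appear to be available"
fi
```

Either of the first two results should be enough for zenstates.py to read and write the registers.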
alphastrata commented 5 years ago

I was updating some BIOSes at work today for Ryzen 3000 compatibility, and the C6 power states seem to have been removed, so perhaps a BIOS update is a more surefire way of solving this.

bjb1959 commented 5 years ago

Thanks for the reply. Since my initial post I have created a systemd service file that disables the C6 power state at boot, which does solve this issue.
