pop-os / pop

A project for managing all Pop!_OS sources
https://system76.com/pop

Complete system freeze #367

Closed yonilerner closed 5 years ago

yonilerner commented 6 years ago

Distribution (run cat /etc/os-release):

NAME="Pop!_OS"
VERSION="18.04 LTS"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Pop!_OS 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://system76.com/pop"
SUPPORT_URL="http://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Related Application and/or Package Version (run apt policy $PACKAGE NAME): N/A

Issue/Bug Description: The computer randomly freezes, and when it does, the freeze is complete, not just a GNOME crash: the keyboard is completely inoperable (even the caps lock light doesn't come on), no interaction with the computer is possible (e.g. switching to a tty), and no amount of waiting causes it to unfreeze.

Steps to reproduce (if you know): The issue occurs randomly. However, it also often happens several times in a row at the login screen when I try to recover from a freeze by holding down the power button: the computer boots normally, I enter the disk decryption key, the login screen appears, and the moment it does, the entire system freezes. After some number of tries, it usually lets me in eventually and works fine from there. But every now and then it freezes randomly. It happened 4 times today (the first two about 20 minutes apart), and each time it took over 10 tries (reach login screen, freeze, hold power button, reach login screen, freeze, etc.) to finally make it in.

Unfortunately, I cannot reliably reproduce this issue. Near the end of the work day today it froze, so I held the power button and just left. When I came home and turned it back on, it booted up and logged in without issue.

Expected behavior: Not freezing

Other Notes: uname -r: 4.15.0-36-generic

Misc info:

Here's the output from syslog immediately before the latest freeze occurred:

Oct  8 19:21:00 pop-os kernel: [17541.323846] [UFW BLOCK] IN=wlp2s0 OUT= MAC=01:00:5e:00:00:fb:9c:e3:3f:6a:52:1e:08:00 SRC=10.0.1.100 DST=224.0.0.251 LEN=32 TOS=0x00 PREC=0x00 TTL=1 ID=65494 PROTO=2 
Oct  8 19:21:24 pop-os gvfsd[1859]: PTP: reading event an error 0x01 occurred
Oct  8 19:21:24 pop-os kernel: [17565.317139] usb 1-2.4: USB disconnect, device number 10
Oct  8 19:21:24 pop-os gvfsd[1859]: message repeated 20 times: [ PTP: reading event an error 0x01 occurred]
Oct  8 19:21:24 pop-os gvfsd[1859]: PTP: reading event an error 0x05 occurred
Oct  8 19:21:24 pop-os upowerd[1293]: unhandled action 'unbind' on /sys/devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.4/1-2.4:1.0
Oct  8 19:21:24 pop-os upowerd[1293]: unhandled action 'unbind' on /sys/devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.4
Oct  8 19:21:24 pop-os gnome-shell[1853]: Object Gio.Settings (0x5618f09a2480), has been already deallocated - impossible to access to it. This might be caused by the fact that the object has been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs
Oct  8 19:21:24 pop-os gnome-shell[1853]: Object Gio.Settings (0x5618da7ac9c0), has been already deallocated - impossible to access to it. This might be caused by the fact that the object has been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: == Stack trace for context 0x5618d7d2a340 ==
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #0 0x5618d81f1bf8 i   resource:///org/gnome/shell/ui/messageTray.js:235 (0x7f241ac9d098 @ 22)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #1 0x7fff21bf0be0 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #2 0x5618d81f1b58 i   resource:///org/gnome/shell/ui/messageTray.js:812 (0x7f241aca0098 @ 28)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #3 0x7fff21bf17c0 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #4 0x5618d81f1ab8 i   /home/yonil/.local/share/gnome-shell/extensions/notifications-alert-on-user-menu@hackedbellini.gmail.com/extension.js:236 (0x7f24180c7230 @ 22)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #5 0x5618d81f1a28 i   resource:///org/gnome/shell/ui/components/autorunManager.js:291 (0x7f241807b4d8 @ 61)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #6 0x7fff21bf2720 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #7 0x5618d81f19a0 i   resource:///org/gnome/shell/ui/components/autorunManager.js:194 (0x7f241807b120 @ 25)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #8 0x7fff21bf3310 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #9 0x7fff21bf33e0 b   self-hosted:918 (0x7f241aef12b8 @ 394)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: == Stack trace for context 0x5618d7d2a340 ==
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #0 0x5618d81f1bf8 i   resource:///org/gnome/shell/ui/messageTray.js:236 (0x7f241ac9d098 @ 42)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #1 0x7fff21bf0be0 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #2 0x5618d81f1b58 i   resource:///org/gnome/shell/ui/messageTray.js:812 (0x7f241aca0098 @ 28)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #3 0x7fff21bf17c0 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #4 0x5618d81f1ab8 i   /home/yonil/.local/share/gnome-shell/extensions/notifications-alert-on-user-menu@hackedbellini.gmail.com/extension.js:236 (0x7f24180c7230 @ 22)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #5 0x5618d81f1a28 i   resource:///org/gnome/shell/ui/components/autorunManager.js:291 (0x7f241807b4d8 @ 61)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #6 0x7fff21bf2720 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #7 0x5618d81f19a0 i   resource:///org/gnome/shell/ui/components/autorunManager.js:194 (0x7f241807b120 @ 25)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #8 0x7fff21bf3310 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7f241aeb5de0 @ 71)
Oct  8 19:21:24 pop-os org.gnome.Shell.desktop[1853]: #9 0x7fff21bf33e0 b   self-hosted:918 (0x7f241aef12b8 @ 394)
Oct  8 19:21:24 pop-os kernel: [17565.454403] [UFW BLOCK] IN=wlp2s0 OUT= MAC= SRC=fe80:0000:0000:0000:b20a:e318:239e:8809 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=58203 PROTO=UDP SPT=8612 DPT=8612 LEN=24 
Oct  8 19:21:24 pop-os kernel: [17565.454413] [UFW BLOCK] IN=wlp2s0 OUT= MAC= SRC=fe80:0000:0000:0000:b20a:e318:239e:8809 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=627639 PROTO=UDP SPT=8612 DPT=8610 LEN=24 
Oct  8 19:21:24 pop-os kernel: [17565.464497] [UFW BLOCK] IN=wlp2s0 OUT= MAC= SRC=fe80:0000:0000:0000:b20a:e318:239e:8809 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=58203 PROTO=UDP SPT=8612 DPT=8612 LEN=24 
Oct  8 19:21:24 pop-os kernel: [17565.464505] [UFW BLOCK] IN=wlp2s0 OUT= MAC= SRC=fe80:0000:0000:0000:b20a:e318:239e:8809 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=627639 PROTO=UDP SPT=8612 DPT=8610 LEN=24 
Oct  8 19:21:25 pop-os gvfsd[1859]: Android device detected, assigning default bug flags

EDIT: (I had my phone plugged in at the time of this log, hence the last line about the Android device. My phone was not plugged in during any other freezes)

If there's any other info I can provide that would be useful, let me know.

Thanks!

jackpot51 commented 6 years ago

This sounds like a hardware issue

yonilerner commented 6 years ago

@jackpot51 Thanks for the input! Does something in the logs indicate that? Is there any way to confirm that hypothesis? It will be hard to convince Dell of a hardware issue when it's running an unsupported OS.

jackpot51 commented 6 years ago

Yes, the fact that this issue occurs randomly and leads to a complete system freeze indicates that it is a hardware issue. The logs also do not indicate anything in software that would cause a complete system freeze. My recommendation is to contact your hardware supplier and hopefully they can repair your system.

yonilerner commented 6 years ago

Alright. I'm going to keep this open for a bit just in case someone else has something to say. If it happens again today, I'll grab some more logs as well.

mathewjpotter commented 5 years ago

I've noticed the same: random freezing on a Dell 9570.

Pop OS 18.10, Dell, i5 8300H, 32GB RAM, 256 NVM

yonilerner commented 5 years ago

@mathewjpotter This was actually caused by a hardware failure. Specifically, my SSD would sometimes fail to write, so many situations that required writing to disk caused a total system failure. I cloned the SSD that shipped with the laptop to a new Samsung 960 PRO, ran fsck, and the laptop has been fine since.
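
A minimal sketch, in Python, of one way to look for traces of the kind of disk failure described above: it filters syslog-style lines for kernel messages that mention storage errors. The pattern list and the function name `find_io_errors` are assumptions made for illustration, not something from this thread. A disk that cannot be written may never get such messages into the on-disk log, so an empty result does not rule out a failing drive.

```python
import re

# Substrings that often accompany failing block devices in kernel logs.
# Assumption: illustrative only, not an exhaustive list.
IO_ERROR_PATTERN = re.compile(r"I/O error|EXT4-fs error", re.IGNORECASE)


def find_io_errors(lines):
    """Return kernel log lines that match a storage-error pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if " kernel: " in line and IO_ERROR_PATTERN.search(line)
    ]
```

Passing it an open log file, for example `find_io_errors(open("/var/log/syslog", errors="replace"))` (the path is an assumption and varies by distribution), returns the matching lines, if any.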

pravinba9495 commented 5 years ago

I have the same issue. CPU: AMD Ryzen 1200, RAM: 16GB, Pop OS 18.04.1 LTS

Random freeze, cannot reproduce the issue by myself.

mathewjpotter commented 5 years ago

Thanks for this info, you were correct

Mathew

mmstick commented 5 years ago

It's always a good idea to perform some hardware tests when issues like this occur.

@pravinba9495 I've heard of the Ryzen 1xxx series having this sort of issue. You may want to look for firmware updates and try running some CPU and memory stress tests. Phoronix covered the issue quite a bit when the Ryzens were new, since the Phoronix Test Suite was encountering regular lockups on them. I believe it happened most frequently when compiling software with GCC, such as the Linux kernel.

mdp18 commented 5 years ago

Having the same issue but with virtualbox. Not sure how that works out.

Thawness commented 5 years ago

I've faced this too on an HP 15 Ryzen 3 laptop. Complete freeze. The only way out is turning it off with the power button.

pravinba9495 commented 5 years ago

One fix that I found working, especially on Ryzen systems, is to toggle the Power Supply Idle Control to 'Typical' or disable that setting completely. This has nothing to do with Pop OS. This issue occurs in Ubuntu too for me. This is the solution that works for me.

yorjaggy commented 4 years ago

Same here. My /var/log/syslog is below. I also have AMD (Ryzen 1600), an AMD RX 480 GPU, and Pop!_OS 20.04 LTS.

Jun  9 18:27:53 moneymachine gnome-shell[3530]: == Stack trace for context 0x5645808bd720 ==
Jun  9 18:27:53 moneymachine gnome-shell[3530]: #0   7fff3fdc3200 b   resource:///org/gnome/shell/ui/workspace.js:727 (34e4ffc7ea60 @ 15)
Jun  9 18:27:53 moneymachine gnome-shell[3530]: #1   7fff3fdc32c0 b   self-hosted:1007 (2e0a1c99bf10 @ 398)
Jun  9 19:06:48 moneymachine systemd-modules-load[547]: Inserted module 'lp'
Jun  9 19:06:48 moneymachine kernel: [    0.000000] Linux version 5.4.0-7629-generic (buildd@lcy01-amd64-013) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #33 1589834512 20.04 ff6e79e-Ubuntu SMP Mon May 18 23:29:32 UTC  (Ubuntu 5.4.0-7629.33~1589834512 20.04 ff6e79e-generic 5.4.30)
Jun  9 19:06:48 moneymachine mtp-probe: checking bus 1, device 2: "/sys/devices/pci0000:00/0000:00:01.3/0000:03:00.0/usb1/1-8"
Jun  9 19:06:48 moneymachine kernel: [    0.000000] Command line: \\boot\vmlinuz-5.4.0-7629-generic root=UUID=5c5078ec-0d65-4e90-ba84-2f9f80ea970e ro quiet loglevel=0 systemd.show_status=false splash initrd=boot\initrd.img-5.4.0-7629-generic

christophebe commented 4 years ago

Same issue here. It worked fine with Ubuntu 19 for a long time. What do I have to check in order to find the origin of the problem?

Thanks

DoZh commented 4 years ago

Same issue for me. It's running on ESXi, with 4 GB of RAM and 8 CPU cores allocated on a 4790hq. As in your description, the issue occurs when compiling the Linux kernel with arm-gcc. In the ESXi monitor, I can see the OS reading the disk at maximum speed for a long time. Until it stops, the whole system stays hung, including tty and ssh.

Yuri6037 commented 4 years ago

This just happened to me today at 6PM. I tried to commit to git, heard my HDD for a second, and then the entire system locked up; nothing worked except the motherboard reset switch...

BlueSlimee commented 4 years ago

This keeps happening to me and it's completely random. It's bad, the system becomes fully unresponsive (you need to force shutdown your computer) and if there's audio playing, the last second keeps playing over and over again (sounds like a broken CD-ROM). I don't think this is a hardware issue, since a lot of people are having the same problem and this doesn't happen with other operating systems, such as Windows or Manjaro.

I'm using Pop!_OS 20.04 with Linux 5.4.0-7642-generic. My computer is a laptop with an i5-7500u, 8GB RAM and integrated graphics. Any help would be appreciated!

It seems to happen when the system is under pressure. Every time the system died, I was doing some CPU-intensive task, such as compiling something or using Wine.

TekGadgt commented 4 years ago

Same issue here. Pop!_OS 20.04 with 5.4.0-7642-generic. MSI GE66 i7-10750H, 32GB RAM, 2070 Super on driver version 455.23.04.

BlueSlimee commented 4 years ago

By the way, I tried updating to Linux 5.8.12 and it didn't solve anything, so it's probably not a kernel issue.

NicholasMamo commented 4 years ago

The described issues seem very similar to this: https://github.com/pop-os/pop/issues/1172

josh231101 commented 3 years ago

This happens to me too, is there a page that holds different ways to solve this problem?

ipaqmaster commented 3 years ago

Experiencing this on my new darp7.

Happy with the specs but have run into some issues thus far. I gave it a fresh install, but sometimes the hardware just hangs: the keyboard is completely unresponsive and the device is no longer pingable on the LAN when it was seconds earlier. It seems to eventually reboot itself if left alone for some time.

The fan seems to go to 100% speed during the hang, which consistently helps me catch it hanging if I hear it from the other room when the laptop should be idle. I have to hold the power button to force a restart if I recover it myself. It may be useful to know this hard hang hasn't happened with the machine in my hands yet. It's always when I'm not looking at it or while it's idle.

Just in case it matters: Arch with kernel 5.10.11, lightdm+cinnamon for a graphical environment. intel-media-driver package so the GPU doesn't crash and I have system76-power plus system76-dkms as I've been troubleshooting a different issue with the keyboard. The machine boots into a ZFS root on its SSD.

josh231101 commented 3 years ago

Hi there, what helped a lot was disabling all of the GNOME extensions; I just left my beautiful dock and the Ubuntu AppIndicators. That really helped me. I still have some links that help, give me 5

josh231101 commented 3 years ago

Possible solutions:
https://www.reddit.com/r/pop_os/comments/ivgeyd/pop_os_keeps_freezing/
https://www.reddit.com/r/pop_os/comments/ixqjel/pop_os_freezes/

ipaqmaster commented 3 years ago

I'm not sure the issue I'm experiencing is as simple as changing the userspace environment. Some days it'll even crash on the lightdm login screen (before any window manager gets a chance to wake up), whether you're using the machine when it hangs or it's idle.

During these crashes the entire machine hangs on the last frame it drew to the display and is completely unresponsive, or sometimes it'll be a green corrupted salad of video memory and in both cases it'll eventually reboot itself.

In some cases it has played audio out the speaker, not from a recent buffer or anything you've heard recently, just static noise or quick higher-pitched screeching noise.

I cannot force it to happen with the stress --cpu 12 command against the cores, and doing unfair Blender render tests doesn't upset the GPU either. My laptop is fine doing intensive work. The crashes just seem random, with terminals open and maybe a Google search tab.

System76 has offered to RMA my darp7 after hearing my experience, so I'm hoping the replacement doesn't do this.

josh231101 commented 3 years ago

OK, so maybe you'll have to delete something related to NVIDIA graphics(?). In my case the file was not on my Pop!_OS system, but search Google for: Pop os crashes remove nvidia drivers

ipaqmaster commented 3 years ago

Running Arch Linux with kernel 5.10.9 (now) as a brand new fresh install with no trace of NVIDIA drivers. I tried playing around with xf86-video-intel, intel-graphics-driver and other solutions to no avail, which makes me think it's a power issue rather than anything to do with the Intel graphics processor. It doesn't seem to change between Pop!_OS, Ubuntu or Arch. I might have to sit tight for a replacement.

I tried adding intel_idle.max_cstate=1 to my kernel boot parameters, which is far from ideal. The laptop has been up for an hour without crashing, but that could just be luck. Will continue monitoring.

E: Yeah, it didn't crash all day with cstates disabled (=1), but naturally battery life is awful without cstates. I set it back to '9' on the next boot and it crashed in 15 minutes. Not sure what to do with that information.
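
When comparing runs like the two above, it helps to confirm which value the booted kernel actually received. The following is a rough Python sketch under stated assumptions: the helper names are hypothetical, and the parser is deliberately simple (it does not handle quoted values containing spaces). It reads the kernel command line from /proc/cmdline and reports the intel_idle.max_cstate setting, which caps the deepest C-state the intel_idle driver will use.

```python
def parse_kernel_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def max_cstate(cmdline):
    """Return intel_idle.max_cstate as an int, or None if it is not set."""
    value = parse_kernel_cmdline(cmdline).get("intel_idle.max_cstate")
    return int(value) if value else None


def running_max_cstate(path="/proc/cmdline"):
    """Check the parameters of the currently booted kernel (Linux only)."""
    with open(path) as handle:
        return max_cstate(handle.read())
```

A result of None from `running_max_cstate()` means the parameter was not passed at boot, so the driver default applies.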

josh231101 commented 3 years ago

Guys, after a lot of research I've found that having only 4GB of RAM was my problem. At some point swap kicks in, and after about 1GB of it was in use my HDD was super slow, which somehow caused that weird freezing where it takes 5-10 minutes before the OS keeps running. Then I upgraded to an SSD and the freezing stopped! It only happened once more, and lasted about 5 seconds. Finally I upgraded my laptop to 8GB of RAM, and everything is perfect now!
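
To check whether swap pressure like this is in play on a given machine, one option is to read /proc/meminfo. This is a small Python sketch, not a diagnostic tool: the function names are made up for illustration, while SwapTotal and SwapFree are standard /proc/meminfo fields reported in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: value}, sizes in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(fields):
    """Swap currently in use, in kB (SwapTotal minus SwapFree)."""
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Sampling `swap_used_kb(read_meminfo())` every few seconds while the system slows down would show whether swap use is growing at the same time.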

josh231101 commented 3 years ago

List of different solutions:
https://www.reddit.com/r/pop_os/comments/ivgeyd/pop_os_keeps_freezing/
https://www.reddit.com/r/pop_os/comments/ixqjel/pop_os_freezes/
Disable GNOME extensions too.
Disable some apps when the OS starts.

Yuri6037 commented 3 years ago

In my case that's clearly not the problem: 24GB of RAM here... It would be weird for Pop!_OS to find a way to eat through over 24GB of RAM considering the few applications I have installed. The heaviest apps I have are Wine with games I experiment with, trying to run them on unsupported systems (those games want Windows, I give them Linux instead, and as a result some of them blow up in interesting ways), and the programming stuff (mainly 3D rendering, machine learning and application development)...

tati-frog commented 2 years ago

Hi, I'm also having this problem. I recently installed Pop!_OS, and whenever I put a heavy load on the filesystem, for example when I install big packages with the package manager or copy or read a lot of files, the system hangs for a few seconds, and then it happens again randomly. Hardware is not the problem for me: this is a fresh install from a few hours ago, I came from Fedora where I didn't have any problems, I have 16GB of RAM, and there is no swap usage.
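
One way to put numbers on stalls under filesystem load is to time synchronous writes. The Python sketch below is an assumption-laden probe rather than a benchmark: the function names, block size and write count are arbitrary choices for illustration. It writes blocks to a scratch file, forces each to disk with fsync, and reports the median and worst latency.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory, writes=20, block_size=1024 * 1024):
    """Write `writes` blocks to a scratch file in `directory`, forcing each
    to disk with fsync, and return the per-write latency in seconds."""
    payload = os.urandom(block_size)
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        for _ in range(writes):
            start = time.perf_counter()
            scratch.write(payload)
            scratch.flush()             # push Python's buffer to the OS
            os.fsync(scratch.fileno())  # ask the OS to push it to the device
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return (median, worst) latency in seconds."""
    return statistics.median(latencies), max(latencies)
```

Running `summarize(fsync_latencies("/home"))` (directory assumed; pick one on the disk under suspicion) during a large package install and again while idle gives two sets of figures to compare. A worst case far above the median would be consistent with the multi-second hangs described, though it would not by itself identify the cause.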