Open Dark-Matter7232 opened 2 years ago
Faced another system freeze; audio playback was still working and the cursor was movable.
Trimmed log:
Jan 03 11:21:04 pop-os ModemManager[982]: <info> [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.0/usb1/1-2': not supported by any plugin
Jan 03 11:21:05 pop-os gvfsd[133497]: Error 1: Get Storage information failed.
Jan 03 11:21:05 pop-os dbus-daemon[1100]: [session uid=1000 pid=1100] Activating service name='org.gnome.Shell.HotplugSniffer' requested by ':1.33' (uid=1000 pid=1280 comm="/usr/bin/gnome-shell ")
Jan 03 11:21:05 pop-os dbus-daemon[1100]: [session uid=1000 pid=1100] Successfully activated service 'org.gnome.Shell.HotplugSniffer'
Jan 03 11:21:17 pop-os gnome-shell[1280]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined
_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
Jan 03 11:21:17 pop-os gnome-shell[1280]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined
_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
Jan 03 11:21:20 pop-os gnome-shell[7093]: [7094:7094:0103/112120.755463:ERROR:brave_new_tab_message_handler.cc(195)] Ads service is not initialized!
Jan 03 11:21:20 pop-os gnome-shell[1280]: Can't update stage views actor MetaWindowGroup is on because it needs an allocation.
Jan 03 11:21:20 pop-os gnome-shell[1280]: Can't update stage views actor MetaWindowActorX11 is on because it needs an allocation.
Jan 03 11:21:20 pop-os gnome-shell[1280]: Can't update stage views actor MetaSurfaceActorX11 is on because it needs an allocation.
Jan 03 11:21:20 pop-os gnome-shell[1280]: Can't update stage views actor MetaWindowActorX11 is on because it needs an allocation.
Jan 03 11:21:20 pop-os gnome-shell[1280]: Can't update stage views actor MetaSurfaceActorX11 is on because it needs an allocation.
Jan 03 11:21:20 pop-os gnome-shell[7093]: [7094:7094:0103/112120.965013:ERROR:CONSOLE(0)] "Unchecked runtime.lastError: Not available in Tor/incognito/guest profile", source: chrome://newtab/ (0)
Jan 03 11:21:20 pop-os gnome-shell[7093]: [7094:7094:0103/112120.965965:ERROR:CONSOLE(0)] "Unchecked runtime.lastError: Not available in Tor/incognito/guest profile", source: chrome://newtab/ (0)
Jan 03 11:21:56 pop-os kernel: BUG: unable to handle page fault for address: fffffffffffffff8
Jan 03 11:21:56 pop-os kernel: #PF: supervisor read access in kernel mode
Jan 03 11:21:56 pop-os kernel: #PF: error_code(0x0000) - not-present page
Jan 03 11:21:56 pop-os kernel: PGD 1ce415067 P4D 1ce415067 PUD 1ce417067 PMD 0
Jan 03 11:21:56 pop-os kernel: Oops: 0000 [#1] SMP NOPTI
Jan 03 11:21:56 pop-os kernel: CPU: 1 PID: 250 Comm: uvd Tainted: G C OE 5.15.12-xanmod1 #0~git20211229.8293471
Jan 03 11:21:56 pop-os kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./A320M-HDV R4.0, BIOS P2.30 06/26/2019
Jan 03 11:21:56 pop-os kernel: RIP: 0010:swake_up_locked+0x12/0x40
Jan 03 11:21:56 pop-os kernel: Code: 10 48 89 02 eb 83 f6 80 f9 07 00 00 01 0f 84 5a ff ff ff eb ad 0f 1f 00 48 8b 57 08 48 8d 47 08 48 39 c2 74 25 53 48 8b 5f 08 <48> 8b 7b f8 e8 05 3e fe ff 48 8b 13 48 8b 43 08 48 89 42 08 48 89
Jan 03 11:21:56 pop-os kernel: RSP: 0018:ffffae3a8111fe80 EFLAGS: 00010007
Jan 03 11:21:56 pop-os kernel: RAX: ffff948bad7018b0 RBX: 0000000000000000 RCX: 00000001004fea90
Jan 03 11:21:56 pop-os kernel: RDX: 0000000000000000 RSI: ffff9489f25eaa30 RDI: ffff948bad7018a8
Jan 03 11:21:56 pop-os kernel: RBP: ffff948bad7018a8 R08: 0000000000000001 R09: 0000000000000052
Jan 03 11:21:56 pop-os kernel: R10: ffff948ad6d97000 R11: ffff948ad6d97000 R12: 0000000000000286
Jan 03 11:21:56 pop-os kernel: R13: ffff948ad7c4ecc8 R14: ffff948bad7018a0 R15: ffff948acf3f4e40
Jan 03 11:21:56 pop-os kernel: FS: 0000000000000000(0000) GS:ffff948dcea40000(0000) knlGS:0000000000000000
Jan 03 11:21:56 pop-os kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 03 11:21:56 pop-os kernel: CR2: fffffffffffffff8 CR3: 000000010e23c000 CR4: 00000000003506e0
Jan 03 11:21:56 pop-os kernel: Call Trace:
Jan 03 11:21:56 pop-os kernel: <TASK>
Jan 03 11:21:56 pop-os kernel: complete+0x2a/0x40
Jan 03 11:21:56 pop-os kernel: drm_sched_main+0x1ab/0x3d0 [gpu_sched]
Jan 03 11:21:56 pop-os kernel: ? __wake_up_pollfree+0x30/0x30
Jan 03 11:21:56 pop-os kernel: ? drm_sched_select_entity+0xc0/0xc0 [gpu_sched]
Jan 03 11:21:56 pop-os kernel: kthread+0x11f/0x140
Jan 03 11:21:56 pop-os kernel: ? set_kthread_struct+0x30/0x30
Jan 03 11:21:56 pop-os kernel: ret_from_fork+0x1f/0x30
Jan 03 11:21:56 pop-os kernel: </TASK>
Jan 03 11:21:56 pop-os kernel: Modules linked in: ses enclosure scsi_transport_sas uas usb_storage cdc_acm ntfs3 cfg80211 ntfs snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio nls_iso8859_1 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi intel_rapl_msr snd_hda_codec intel_rapl_common snd_hda_core snd_hwdep snd_pcm edac_mce_amd snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_amd kvm r8188eu(C) snd_seq rapl joydev input_leds snd_seq_device snd_timer efi_pstore wmi_bmof snd ccp k10temp soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport binfmt_misc ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear system76_io(OE) system76_acpi(OE) hid_generic usbhid hid amdgpu r8169 realtek mdio_devres crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd iommu_v2 gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect
Jan 03 11:21:56 pop-os kernel: sysimgblt fb_sys_fops cec rc_core libphy ahci xhci_pci drm libahci xhci_pci_renesas wmi i2c_piix4 video gpio_amdpt gpio_generic
Jan 03 11:21:56 pop-os kernel: CR2: fffffffffffffff8
Jan 03 11:21:56 pop-os kernel: ---[ end trace a8b2cf5824589357 ]---
Jan 03 11:21:56 pop-os kernel: [drm] Fence fallback timer expired on ring gfx
Jan 03 11:21:56 pop-os kernel: RIP: 0010:swake_up_locked+0x12/0x40
Jan 03 11:21:56 pop-os kernel: Code: 10 48 89 02 eb 83 f6 80 f9 07 00 00 01 0f 84 5a ff ff ff eb ad 0f 1f 00 48 8b 57 08 48 8d 47 08 48 39 c2 74 25 53 48 8b 5f 08 <48> 8b 7b f8 e8 05 3e fe ff 48 8b 13 48 8b 43 08 48 89 42 08 48 89
Jan 03 11:21:56 pop-os kernel: RSP: 0018:ffffae3a8111fe80 EFLAGS: 00010007
Jan 03 11:21:56 pop-os kernel: RAX: ffff948bad7018b0 RBX: 0000000000000000 RCX: 00000001004fea90
Jan 03 11:21:56 pop-os kernel: RDX: 0000000000000000 RSI: ffff9489f25eaa30 RDI: ffff948bad7018a8
Jan 03 11:21:56 pop-os kernel: RBP: ffff948bad7018a8 R08: 0000000000000001 R09: 0000000000000052
Jan 03 11:21:56 pop-os kernel: R10: ffff948ad6d97000 R11: ffff948ad6d97000 R12: 0000000000000286
Jan 03 11:21:56 pop-os kernel: R13: ffff948ad7c4ecc8 R14: ffff948bad7018a0 R15: ffff948acf3f4e40
Jan 03 11:21:56 pop-os kernel: FS: 0000000000000000(0000) GS:ffff948dcea40000(0000) knlGS:0000000000000000
Jan 03 11:21:56 pop-os kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 03 11:21:56 pop-os kernel: CR2: fffffffffffffff8 CR3: 000000010e23c000 CR4: 00000000003506e0
Jan 03 11:21:56 pop-os kernel: sched: RT throttling activated
Jan 03 11:21:56 pop-os kernel: BUG: kernel NULL pointer dereference, address: 0000000000000259
Jan 03 11:21:56 pop-os kernel: #PF: supervisor read access in kernel mode
Jan 03 11:21:56 pop-os kernel: #PF: error_code(0x0000) - not-present page
Jan 03 11:21:56 pop-os kernel: PGD 0 P4D 0
Jan 03 11:21:56 pop-os kernel: Oops: 0000 [#2] SMP NOPTI
Jan 03 11:21:56 pop-os kernel: CPU: 1 PID: 80095 Comm: kworker/u64:1 Tainted: G D C OE 5.15.12-xanmod1 #0~git20211229.8293471
Jan 03 11:21:56 pop-os kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./A320M-HDV R4.0, BIOS P2.30 06/26/2019
Jan 03 11:21:56 pop-os kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
Jan 03 11:21:56 pop-os kernel: RIP: 0010:drm_atomic_helper_cleanup_planes+0x2c/0x60 [drm_kms_helper]
Jan 03 11:21:56 pop-os kernel: Code: 56 08 8b 82 b8 02 00 00 85 c0 7e 51 55 48 89 f5 53 31 db 48 63 c3 48 c1 e0 05 48 03 45 18 48 8b 38 48 85 ff 74 29 48 8b 70 10 <48> 39 b7 58 02 00 00 48 0f 44 70 18 48 8b 87 50 02 00 00 48 8b 40
Jan 03 11:21:56 pop-os kernel: RSP: 0018:ffffae3a8e7bfb38 EFLAGS: 00010202
Jan 03 11:21:56 pop-os kernel: RAX: ffff948bad7018a0 RBX: 0000000000000005 RCX: ffff948ad7c45b01
Jan 03 11:21:56 pop-os kernel: RDX: ffff948ad7c40010 RSI: 0000000000000000 RDI: 0000000000000001
Jan 03 11:21:56 pop-os kernel: RBP: ffff948acc6df500 R08: ffff948accf8fdb8 R09: 0000000000000001
Jan 03 11:21:56 pop-os kernel: R10: 000000000000000c R11: 000000000000022c R12: 0000000000000246
Jan 03 11:21:56 pop-os kernel: R13: ffff948ad7c40170 R14: ffff948ad7c40010 R15: ffff948acc6df500
Jan 03 11:21:56 pop-os kernel: FS: 0000000000000000(0000) GS:ffff948dcea40000(0000) knlGS:0000000000000000
Jan 03 11:21:56 pop-os kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 03 11:21:56 pop-os kernel: CR2: 0000000000000259 CR3: 000000010d05a000 CR4: 00000000003506e0
Jan 03 11:21:56 pop-os kernel: Call Trace:
Jan 03 11:21:56 pop-os kernel: <TASK>
Jan 03 11:21:56 pop-os kernel: amdgpu_dm_atomic_commit_tail+0x19f8/0x25c0 [amdgpu]
Jan 03 11:21:56 pop-os kernel: ? try_to_wake_up+0x1a7/0x430
Jan 03 11:21:56 pop-os kernel: ? __ext4_handle_dirty_metadata+0x58/0x1a0
Jan 03 11:21:56 pop-os kernel: ? lock_timer_base+0x5c/0x80
Jan 03 11:21:56 pop-os kernel: ? __mod_timer+0x20f/0x3b0
Jan 03 11:21:56 pop-os kernel: ? update_load_avg+0x7a/0x530
Jan 03 11:21:56 pop-os kernel: ? newidle_balance+0x11b/0x3f0
Jan 03 11:21:56 pop-os kernel: ? __cond_resched+0x11/0x40
Jan 03 11:21:56 pop-os kernel: ? __wait_for_common+0x3b/0x160
Jan 03 11:21:56 pop-os kernel: ? finish_task_switch.isra.0+0xa2/0x280
Jan 03 11:21:56 pop-os kernel: commit_tail+0x8c/0x120 [drm_kms_helper]
Jan 03 11:21:56 pop-os kernel: process_one_work+0x1f7/0x360
Jan 03 11:21:56 pop-os kernel: worker_thread+0x4b/0x400
Jan 03 11:21:56 pop-os kernel: ? process_one_work+0x360/0x360
Jan 03 11:21:56 pop-os kernel: kthread+0x11f/0x140
Jan 03 11:21:56 pop-os kernel: ? set_kthread_struct+0x30/0x30
Jan 03 11:21:56 pop-os kernel: ret_from_fork+0x1f/0x30
Jan 03 11:21:56 pop-os kernel: </TASK>
Jan 03 11:21:56 pop-os kernel: Modules linked in: ses enclosure scsi_transport_sas uas usb_storage cdc_acm ntfs3 cfg80211 ntfs snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio nls_iso8859_1 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi intel_rapl_msr snd_hda_codec intel_rapl_common snd_hda_core snd_hwdep snd_pcm edac_mce_amd snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_amd kvm r8188eu(C) snd_seq rapl joydev input_leds snd_seq_device snd_timer efi_pstore wmi_bmof snd ccp k10temp soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport binfmt_misc ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear system76_io(OE) system76_acpi(OE) hid_generic usbhid hid amdgpu r8169 realtek mdio_devres crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd iommu_v2 gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect
Jan 03 11:21:56 pop-os kernel: sysimgblt fb_sys_fops cec rc_core libphy ahci xhci_pci drm libahci xhci_pci_renesas wmi i2c_piix4 video gpio_amdpt gpio_generic
Jan 03 11:21:56 pop-os kernel: CR2: 0000000000000259
Jan 03 11:21:56 pop-os kernel: ---[ end trace a8b2cf5824589358 ]---
Jan 03 11:21:56 pop-os kernel: RIP: 0010:swake_up_locked+0x12/0x40
Jan 03 11:21:56 pop-os kernel: Code: 10 48 89 02 eb 83 f6 80 f9 07 00 00 01 0f 84 5a ff ff ff eb ad 0f 1f 00 48 8b 57 08 48 8d 47 08 48 39 c2 74 25 53 48 8b 5f 08 <48> 8b 7b f8 e8 05 3e fe ff 48 8b 13 48 8b 43 08 48 89 42 08 48 89
Jan 03 11:21:56 pop-os kernel: RSP: 0018:ffffae3a8111fe80 EFLAGS: 00010007
Jan 03 11:21:56 pop-os kernel: RAX: ffff948bad7018b0 RBX: 0000000000000000 RCX: 00000001004fea90
Jan 03 11:21:56 pop-os kernel: RDX: 0000000000000000 RSI: ffff9489f25eaa30 RDI: ffff948bad7018a8
Jan 03 11:21:56 pop-os kernel: RBP: ffff948bad7018a8 R08: 0000000000000001 R09: 0000000000000052
Jan 03 11:21:56 pop-os kernel: R10: ffff948ad6d97000 R11: ffff948ad6d97000 R12: 0000000000000286
Jan 03 11:21:56 pop-os kernel: R13: ffff948ad7c4ecc8 R14: ffff948bad7018a0 R15: ffff948acf3f4e40
Jan 03 11:21:56 pop-os kernel: FS: 0000000000000000(0000) GS:ffff948dcea40000(0000) knlGS:0000000000000000
Jan 03 11:21:56 pop-os kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 03 11:21:56 pop-os kernel: CR2: 0000000000000259 CR3: 000000010d05a000 CR4: 00000000003506e0
Jan 03 11:21:57 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/connect-state/v1/devices/7b26dfe4a089446f295af30a40e760de4d531544
Jan 03 11:21:57 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/connect-state/v1/devices/7b26dfe4a089446f295af30a40e760de4d531544
Jan 03 11:22:04 pop-os kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Jan 03 11:22:04 pop-os kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring uvd timeout, signaled seq=216, emitted seq=216
Jan 03 11:22:04 pop-os kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process brave pid 7135 thread brave:cs0 pid 7140
Jan 03 11:22:04 pop-os kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Jan 03 11:22:04 pop-os kernel: ------------[ cut here ]------------
Jan 03 11:22:04 pop-os kernel: WARNING: CPU: 1 PID: 100306 at kthread_park+0x68/0x80
Jan 03 11:22:04 pop-os kernel: Modules linked in: ses enclosure scsi_transport_sas uas usb_storage cdc_acm ntfs3 cfg80211 ntfs snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio nls_iso8859_1 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi intel_rapl_msr snd_hda_codec intel_rapl_common snd_hda_core snd_hwdep snd_pcm edac_mce_amd snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_amd kvm r8188eu(C) snd_seq rapl joydev input_leds snd_seq_device snd_timer efi_pstore wmi_bmof snd ccp k10temp soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport binfmt_misc ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear system76_io(OE) system76_acpi(OE) hid_generic usbhid hid amdgpu r8169 realtek mdio_devres crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd iommu_v2 gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect
Jan 03 11:22:04 pop-os kernel: sysimgblt fb_sys_fops cec rc_core libphy ahci xhci_pci drm libahci xhci_pci_renesas wmi i2c_piix4 video gpio_amdpt gpio_generic
Jan 03 11:22:04 pop-os kernel: CPU: 1 PID: 100306 Comm: kworker/1:1 Tainted: G D C OE 5.15.12-xanmod1 #0~git20211229.8293471
Jan 03 11:22:04 pop-os kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./A320M-HDV R4.0, BIOS P2.30 06/26/2019
Jan 03 11:22:04 pop-os kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Jan 03 11:22:04 pop-os kernel: RIP: 0010:kthread_park+0x68/0x80
Jan 03 11:22:04 pop-os kernel: Code: 20 e8 1c 00 ab 00 be 40 00 00 00 48 89 ef e8 df 06 01 00 48 85 c0 74 25 31 c0 5b 5d c3 0f 0b 48 8b 9d 60 06 00 00 a8 04 74 b2 <0f> 0b b8 da ff ff ff 5b 5d c3 0f 0b b8 f0 ff ff ff eb dd 0f 0b eb
Jan 03 11:22:04 pop-os kernel: RSP: 0018:ffffae3a881cfce8 EFLAGS: 00010202
Jan 03 11:22:04 pop-os kernel: RAX: 0000000000208044 RBX: ffff948ad7204900 RCX: 0000000000000000
Jan 03 11:22:04 pop-os kernel: RDX: 0000000000000000 RSI: ffff9489f25ea800 RDI: ffff948ad7c78000
Jan 03 11:22:04 pop-os kernel: RBP: ffff948ad7c78000 R08: 0000000000000000 R09: 0000000000000001
Jan 03 11:22:04 pop-os kernel: R10: 0000000000000001 R11: 00000000000002e9 R12: ffff948ad7c4eb50
Jan 03 11:22:04 pop-os kernel: R13: 0000000000000000 R14: ffff948ad7c4ecb8 R15: 0000000000000060
Jan 03 11:22:04 pop-os kernel: FS: 0000000000000000(0000) GS:ffff948dcea40000(0000) knlGS:0000000000000000
Jan 03 11:22:04 pop-os kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 03 11:22:04 pop-os kernel: CR2: 000021e80055f000 CR3: 0000000115d4a000 CR4: 00000000003506e0
Jan 03 11:22:04 pop-os kernel: Call Trace:
Jan 03 11:22:04 pop-os kernel: <TASK>
Jan 03 11:22:04 pop-os kernel: drm_sched_stop+0x2d/0x160 [gpu_sched]
Jan 03 11:22:04 pop-os kernel: ? down+0x15/0x50
Jan 03 11:22:04 pop-os kernel: amdgpu_device_gpu_recover.cold+0xa18/0xa50 [amdgpu]
Jan 03 11:22:04 pop-os kernel: amdgpu_job_timedout+0x14a/0x170 [amdgpu]
Jan 03 11:22:04 pop-os kernel: drm_sched_job_timedout+0x60/0xf0 [gpu_sched]
Jan 03 11:22:04 pop-os kernel: process_one_work+0x1f7/0x360
Jan 03 11:22:04 pop-os kernel: worker_thread+0x4b/0x400
Jan 03 11:22:04 pop-os kernel: ? process_one_work+0x360/0x360
Jan 03 11:22:04 pop-os kernel: kthread+0x11f/0x140
Jan 03 11:22:04 pop-os kernel: ? set_kthread_struct+0x30/0x30
Jan 03 11:22:04 pop-os kernel: ret_from_fork+0x1f/0x30
Jan 03 11:22:04 pop-os kernel: </TASK>
Jan 03 11:22:04 pop-os kernel: ---[ end trace a8b2cf5824589359 ]---
Jan 03 11:22:16 pop-os gnome-shell[7093]: [0103/112216.050373:WARNING:exception_snapshot_linux.cc(427)] Unhandled signal -1
Jan 03 11:22:53 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://i.scdn.co/image/ab67616d00001e02bca27e89b082e7fa21a6b1e9
Jan 03 11:22:53 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/extended-metadata/v0/extended-metadata
Jan 03 11:22:53 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/connect-state/v1/devices/7b26dfe4a089446f295af30a40e760de4d531544
Jan 03 11:22:53 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/net-fortune/v2/fortune
Jan 03 11:22:58 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/storage-resolve/v2/files/audio/interactive_prefetch/1/e4e56fdfed08e82a66c205ca3f056975bb16bad3?product=0
Jan 03 11:22:58 pop-os spotify.desktop[98989]: [+] cef_urlrequest_create: https://spclient.wg.spotify.com/playplay/v1/key/58de44c133b247800e0f7b15c408a05e8a5a5f35
Jan 03 11:23:18 pop-os /usr/libexec/gdm-x-session[1069]: (EE) event6 - Gaming Keyboard: client bug: event processing lagging behind by 25ms, your system is too slow
Jan 03 11:23:18 pop-os /usr/libexec/gdm-x-session[1069]: (EE) event6 - Gaming Keyboard: WARNING: log rate limit exceeded (5 msgs per 60min). Discarding future messages.
Jan 03 11:23:56 pop-os kernel: usb 1-2: USB disconnect, device number 13
Jan 03 11:23:56 pop-os gvfsd[133497]: PTP: reading event an error 0x05 occurred
Jan 03 11:23:57 pop-os gvfsd[133497]: Device 0 (VID=04e8 and PID=6860) is a Samsung Galaxy models (MTP).
Jan 03 11:23:57 pop-os gvfsd[133497]: Android device detected, assigning default bug flags
Jan 03 11:23:57 pop-os gvfsd[133497]: Received event PTP_EC_DevicePropChanged in session 1
Jan 03 11:23:57 pop-os gvfsd[133497]: Received event PTP_EC_StoreAdded in session 1
Jan 03 11:23:57 pop-os gvfsd[133497]: Received event PTP_EC_StoreAdded in session 1
Jan 03 11:23:57 pop-os gvfsd[133497]: Received event PTP_EC_StoreAdded in session 1
Jan 03 11:23:57 pop-os gvfsd[133497]: Received event PTP_EC_DevicePropChanged in session 1
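For anyone trimming a similar log, here is a minimal sketch of a filter that keeps only the kernel fault headers and the amdgpu/drm error lines. The function name `gpu_fault_lines` and the pattern list are assumptions based on the messages in the log above, not an official tool, so extend the patterns as needed.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and keep only
# kernel fault headers and amdgpu/drm error lines.
# The patterns come from messages seen in this thread's log.
gpu_fault_lines() {
    grep -E 'kernel: (BUG:|Oops:|WARNING:|RIP:)|\*ERROR\*|amdgpu: GPU reset'
}
```

Usage would look like `journalctl -k -b -1 | gpu_fault_lines`, where `-k` limits the output to kernel messages and `-b -1` selects the previous boot.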
Finally found an issue similar to the one I'm running into. I'll add some info if it helps speed things along with a resolution.
I've run into this problem many times, and it's always been with my Radeon RX 6700XT. It appears in Xubuntu as well, and I switched to Pop! OS in the hope that the drivers and firmware would already be present (I didn't have to install anything). Crashes ended up occurring frequently enough that I had to switch to an older NVIDIA GT 1030 card, which runs without issues. The 6700XT isn't a dud either: it works fine in Win 10 with the same hardware setup, running reasonably graphics-intensive games.
To summarize, the symptoms are the same as what @Dark-Matter7232 described:
[drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
Doing a little bit of research, my guess is that we're waiting on a buffer in RAM to free up, but something (GPU, driver, Mesa, etc.) takes a while.
Reproduction and Mitigation
It's certainly an intermittent issue, and I can't yet figure out how to trigger it consistently.
amdgpu. Nvidia card unaffected. Processor is Intel without integrated graphics.
I've tried mitigating as best I could with the following actions to no avail:
/etc/profile.d might make it worse? Was intended to fix things... Uncommenting the top export line definitely didn't help.
#!/bin/sh
#export AMD_DEBUG="nongg,nodma"
export AMD_DEBUG="nongg"
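To rule out the script simply not being picked up, one quick check is whether the variable is actually exported in the session that launches the application. Below is a minimal sketch, assuming a POSIX shell; the function name `amd_debug_status` is made up for illustration. On Ubuntu-based systems `/etc/profile` only sources files in `/etc/profile.d` whose names end in `.sh`, so the file name matters as well.

```shell
#!/bin/sh
# Hypothetical helper: show AMD_DEBUG as child processes would see it.
# env lists only exported variables, which is what applications inherit.
amd_debug_status() {
    env | grep '^AMD_DEBUG=' || echo 'AMD_DEBUG is not exported'
}
```

Running `amd_debug_status` from a terminal inside the desktop session should print the value from the script if it was sourced at login.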
I've not yet tried the following:
Pop! OS Distribution
cat /etc/os-release
NAME="Pop!_OS"
VERSION="21.10"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 21.10"
VERSION_ID="21.10"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=impish
UBUNTU_CODENAME=impish
LOGO=distributor-logo-pop-os
Hardware Info
inxi -Fxxxrz
System:
Kernel: 5.15.15-76051515-generic x86_64 bits: 64 compiler: gcc v: 11.2.0
Desktop: GNOME 40.5 tk: GTK 3.24.30 wm: gnome-shell dm: GDM3 41.rc
Distro: Pop!_OS 21.10 base: Ubuntu Impish
Machine:
Type: Desktop System: ASUS product: N/A v: N/A serial: <filter>
Mobo: ASUSTeK model: ROG STRIX Z490-E GAMING v: Rev 1.xx serial: <filter>
UEFI: American Megatrends v: 2004 date: 01/13/2021
CPU:
Info: 8-Core model: Intel Core i7-10700F bits: 64 type: MT MCP
arch: Comet Lake rev: 5 cache: L2: 16 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3
bogomips: 92796
Speed: 800 MHz min/max: 800/4800 MHz Core speeds (MHz): 1: 800 2: 4417
3: 867 4: 3222 5: 1897 6: 1114 7: 800 8: 800 9: 800 10: 800 11: 800
12: 800 13: 800 14: 800 15: 800 16: 800
Graphics:
Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT / 6800M] vendor: ASUSTeK
driver: amdgpu v: kernel bus-ID: 03:00.0 chip-ID: 1002:73df class-ID: 0300
Display: x11 server: X.Org 1.20.13 compositor: gnome-shell driver:
loaded: amdgpu,ati unloaded: fbdev,modesetting,radeon,vesa
resolution: 1680x1050~60Hz s-dpi: 96
OpenGL: renderer: AMD Radeon RX 6700 XT (NAVY_FLOUNDER DRM 3.42.0
5.15.15-76051515-generic LLVM 12.0.1)
v: 4.6 Mesa 21.2.2 direct render: Yes
Audio:
Device-1: Intel Comet Lake PCH cAVS vendor: ASUSTeK driver: snd_hda_intel
v: kernel bus-ID: 00:1f.3 chip-ID: 8086:06c8 class-ID: 0403
Device-2: AMD Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
driver: snd_hda_intel v: kernel bus-ID: 03:00.1 chip-ID: 1002:ab28
class-ID: 0403
Sound Server-1: ALSA v: k5.15.15-76051515-generic running: yes
Sound Server-2: PulseAudio v: 15.0 running: yes
Sound Server-3: PipeWire v: 0.3.32 running: yes
Network:
Device-1: Intel Comet Lake PCH CNVi WiFi driver: iwlwifi v: kernel
bus-ID: 00:14.3 chip-ID: 8086:06f0 class-ID: 0280
IF: wlo1 state: down mac: <filter>
Device-2: Intel Ethernet I225-V vendor: ASUSTeK driver: igc v: kernel
port: 3000 bus-ID: 06:00.0 chip-ID: 8086:15f3 class-ID: 0200
IF: enp6s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
Device-1: Intel type: USB driver: btusb v: 0.8 bus-ID: 1-14:7
chip-ID: 8087:0026 class-ID: e001
Report: hciconfig ID: hci0 rfk-id: 0 state: up address: <filter> bt-v: 3.0
lmp-v: 5.2 sub-v: 27a4 hci-v: 5.2 rev: 27a4
Drives:
Local Storage: total: 2.73 TiB used: 490.47 GiB (17.6%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 2TB
size: 1.82 TiB speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 2B2QEXM7 temp: 35.9 C scheme: GPT
ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 970 EVO Plus 1TB
size: 931.51 GiB speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 2B2QEXM7 temp: 38.9 C scheme: GPT
Partition:
ID-1: / size: 63.67 GiB used: 14.33 GiB (22.5%) fs: ext4
dev: /dev/nvme1n1p6
ID-2: /boot/efi size: 487 MiB used: 184.7 MiB (37.9%) fs: vfat
dev: /dev/nvme1n1p5
ID-3: /home size: 562.29 GiB used: 39.66 GiB (7.1%) fs: ext4
dev: /dev/nvme1n1p7
Swap:
Alert: No swap data was found.
Sensors:
System Temperatures: cpu: 27.8 C mobo: N/A gpu: amdgpu temp: 34.0 C
mem: 30.0 C
Fan Speeds (RPM): N/A gpu: amdgpu fan: 0
Repos:
Packages: 1977 apt: 1934 flatpak: 40 snap: 3
No active apt repos in: /etc/apt/sources.list
Active apt repos in: /etc/apt/sources.list.d/pop-os-apps.sources
1: deb http://apt.pop-os.org/proprietary impish main
Active apt repos in: /etc/apt/sources.list.d/pop-os-ppa.sources
1: deb deb-src http://apt.pop-os.org/release impish main
Active apt repos in: /etc/apt/sources.list.d/system.sources
1: deb deb-src http://us.archive.ubuntu.com/ubuntu/ impish impish-security impish-updates impish-backports main restricted universe multiverse
2: deb deb-src X-Repolib-Default-Mirror: http://us.archive.ubuntu.com/ubuntu/ impish impish-security impish-updates impish-backports main restricted universe multiverse
Info:
Processes: 427 Uptime: 33m wakeups: 0 Memory: 31.25 GiB
used: 4.5 GiB (14.4%) Init: systemd v: 248 runlevel: 5 Compilers:
gcc: 11.2.0 alt: 10/11 Shell: Bash v: 5.1.8 running-in: gnome-terminal
inxi: 3.3.06
Mesa Info
ii libegl-mesa0:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 free implementation of the EGL API -- Mesa vendor library
ii libgl1-mesa-dri:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 free implementation of the OpenGL API -- DRI modules
ii libglapi-mesa:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 free implementation of the GL API -- shared library
ii libglu1-mesa:amd64 9.0.1-1build1 amd64 Mesa OpenGL utility library (GLU)
ii libglx-mesa0:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 free implementation of the OpenGL API -- GLX vendor library
ii mesa-utils 8.4.0-1build1 amd64 Miscellaneous Mesa GL utilities
ii mesa-va-drivers:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 Mesa VA-API video acceleration drivers
ii mesa-vdpau-drivers:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 Mesa VDPAU video acceleration drivers
ii mesa-vulkan-drivers:amd64 21.2.2-1ubuntu1pop0~1634226723~21.10~b715ae2 amd64 Mesa Vulkan graphics drivers
AMDGPU Firmware Info
sudo cat /sys/kernel/debug/dri/0/amdgpu_firmware_info
VCE feature version: 0, firmware version: 0x00000000
UVD feature version: 0, firmware version: 0x00000000
MC feature version: 0, firmware version: 0x00000000
ME feature version: 38, firmware version: 0x0000003e
PFP feature version: 38, firmware version: 0x00000056
CE feature version: 38, firmware version: 0x00000024
RLC feature version: 1, firmware version: 0x00000042
RLC SRLC feature version: 0, firmware version: 0x00000000
RLC SRLG feature version: 0, firmware version: 0x00000000
RLC SRLS feature version: 0, firmware version: 0x00000000
MEC feature version: 38, firmware version: 0x00000058
MEC2 feature version: 38, firmware version: 0x00000058
SOS feature version: 0, firmware version: 0x0022020a
ASD feature version: 553648218, firmware version: 0x2100005a
TA XGMI feature version: 0x00000000, firmware version: 0x00000000
TA RAS feature version: 0x00000000, firmware version: 0x00000000
TA HDCP feature version: 0x1700001f, firmware version: 0x00000000
TA DTM feature version: 0x12000009, firmware version: 0x00000000
TA RAP feature version: 0x0700000e, firmware version: 0x00000000
TA SECUREDISPLAY feature version: 0x00000000, firmware version: 0x00000000
SMC feature version: 0, firmware version: 0x00412a00
SDMA0 feature version: 52, firmware version: 0x00000045
SDMA1 feature version: 52, firmware version: 0x00000045
VCN feature version: 0, firmware version: 0x02110001
DMCU feature version: 0, firmware version: 0x00000000
DMCUB feature version: 0, firmware version: 0x02020003
TOC feature version: 0, firmware version: 0x00000000
VBIOS version: 115-D512BS0-100
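When comparing firmware between an affected and an unaffected setup, it may help to reduce this dump to the components that report a non-zero firmware version. Here is a minimal sketch, assuming the line format shown above; the function name `loaded_firmware` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read amdgpu_firmware_info text on stdin and print
# "<component><TAB><firmware version>" for entries whose firmware version
# is not all zeros. Lines without a "firmware version:" field are skipped.
loaded_firmware() {
    awk -F' feature version: ' '
        / firmware version: / {
            fw = $0
            sub(/.*firmware version: /, "", fw)   # keep only the hex value
            if (fw !~ /^0x0+$/) print $1 "\t" fw
        }'
}
```

For example, `sudo cat /sys/kernel/debug/dri/0/amdgpu_firmware_info | loaded_firmware > fw.txt` on two machines gives two short files that can be compared with `diff`.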
Last Failure (2/2/2022, 08:03:20)
Trimmed from /var/log/syslog
Feb 2 08:03:20 kernel: [37932.038809] [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
Feb 2 08:03:20 kernel: [37937.168911] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=228661, emitted seq=228662
Feb 2 08:03:20 kernel: [37937.169061] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 2184 thread Xorg:cs0 pid 2284
Feb 2 08:03:20 kernel: [37937.169164] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Feb 2 08:03:20 kernel: [37937.396684] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 2 08:03:20 kernel: [37937.396771] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Feb 2 08:03:20 kernel: [37937.591477] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 2 08:03:20 kernel: [37937.591549] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Feb 2 08:03:20 kernel: [37937.786506] [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
Feb 2 08:03:20 kernel: [37937.800525] [drm] free PSP TMR buffer
Feb 2 08:03:20 kernel: [37937.845178] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Feb 2 08:03:20 kernel: [37937.845180] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Feb 2 08:03:20 kernel: [37937.845228] amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Feb 2 08:03:21 kernel: [37938.375075] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Feb 2 08:03:21 kernel: [37938.375271] [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Feb 2 08:03:21 kernel: [37938.375308] [drm] VRAM is lost due to GPU reset!
Feb 2 08:03:21 kernel: [37938.376232] [drm] PSP is resuming...
Feb 2 08:03:21 kernel: [37938.567818] [drm] reserve 0xa00000 from 0x82fe000000 for PSP TMR
Feb 2 08:03:21 kernel: [37938.646867] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
Feb 2 08:03:21 kernel: [37938.657253] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Feb 2 08:03:21 kernel: [37938.657254] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
Feb 2 08:03:21 kernel: [37938.712547] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
Feb 2 08:03:21 kernel: [37938.713886] [drm] DMUB hardware initialized: version=0x02020003
Feb 2 08:03:22 kernel: [37939.025119] [drm] kiq ring mec 2 pipe 1 q 0
Feb 2 08:03:22 kernel: [37939.028387] [drm] VCN decode and encode initialized successfully(under DPG Mode).
Feb 2 08:03:22 kernel: [37939.028729] [drm] JPEG decode initialized successfully.
Feb 2 08:03:22 kernel: [37939.028754] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Feb 2 08:03:22 kernel: [37939.028756] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Feb 2 08:03:22 kernel: [37939.028756] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Feb 2 08:03:22 kernel: [37939.028757] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Feb 2 08:03:22 kernel: [37939.028758] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Feb 2 08:03:22 kernel: [37939.028758] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Feb 2 08:03:22 kernel: [37939.028759] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Feb 2 08:03:22 kernel: [37939.028759] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Feb 2 08:03:22 kernel: [37939.028760] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Feb 2 08:03:22 kernel: [37939.028760] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Feb 2 08:03:22 kernel: [37939.028761] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Feb 2 08:03:22 kernel: [37939.028762] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Feb 2 08:03:22 kernel: [37939.028762] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Feb 2 08:03:22 kernel: [37939.028763] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
Feb 2 08:03:22 kernel: [37939.028764] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
Feb 2 08:03:22 kernel: [37939.028764] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Feb 2 08:03:22 kernel: [37939.033836] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
Feb 2 08:03:22 kernel: [37939.033849] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
Feb 2 08:03:22 kernel: [37939.033850] [drm] Skip scheduling IBs!
Feb 2 08:03:22 kernel: [37939.033884] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
Feb 2 08:03:22 kernel: [37939.033893] [drm] Skip scheduling IBs!
Feb 2 08:03:22 gnome-shell[2521]: amdgpu: amdgpu_cs_query_fence_status failed.
Feb 2 08:03:22 kernel: [37939.033901] [drm] Skip scheduling IBs!
Feb 2 08:03:22 kernel: [37939.033924] [drm] Skip scheduling IBs!
Feb 2 08:03:22 kernel: [37939.034818] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Feb 2 08:03:22 /usr/libexec/gdm-x-session[2184]: amdgpu: The CS has been cancelled because the context is lost.
Feb 2 08:03:22 gnome-shell[2521]: amdgpu: The CS has been cancelled because the context is lost.
Feb 2 08:03:22 gnome-shell[2521]: amdgpu: amdgpu_cs_query_fence_status failed.
I also started having this problem on Linux a few days ago. I installed a Sapphire RX 6600 Pulse and run stock Ubuntu 22.04. This may already have been fixed in the kernel; see this discussion:
https://lore.kernel.org/all/dbadfe41-24bf-5811-cf38-74973df45214@badpenguin.co.uk/
As for my $.02, $ uname -a yields:
Linux buttran 5.15.0-33-generic #34-Ubuntu SMP Wed May 18 13:34:26 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Relevant output from the command $ journalctl -o short-precise -k -b -1 (slightly edited due to snap AppArmor vomit):
maj 29 14:30:45.472809 buttran kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
maj 29 14:30:47.780810 buttran kernel: snd_hda_intel 0000:03:00.1: refused to change power state from D3hot to D0
maj 29 14:30:47.881686 buttran kernel: snd_hda_intel 0000:03:00.1: CORB reset timeout#2, CORBRP = 65535
maj 29 14:30:48.164848 buttran kernel: snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
maj 29 14:30:53.040809 buttran kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
maj 29 14:30:53.040950 buttran kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
maj 29 14:30:55.592803 buttran kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=8985, emitted seq=8987
maj 29 14:30:55.592917 buttran kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
maj 29 14:30:55.592945 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
maj 29 14:31:00.248815 buttran kernel: amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
maj 29 14:31:05.396825 buttran kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
maj 29 14:31:05.676846 buttran kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
maj 29 14:31:10.572822 buttran kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
maj 29 14:31:10.573170 buttran kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
maj 29 14:31:10.844809 buttran kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
maj 29 14:31:10.845113 buttran kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
maj 29 14:31:15.504813 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command!
maj 29 14:31:15.505278 buttran kernel: amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
maj 29 14:31:15.505567 buttran kernel: amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
maj 29 14:31:15.505834 buttran kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
maj 29 14:31:15.528826 buttran kernel: [drm] free PSP TMR buffer
maj 29 14:31:16.624812 buttran kernel: [drm] psp gfx command DESTROY_TMR(0x7) failed and response status is (0x80000306)
maj 29 14:31:16.644825 buttran kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
maj 29 14:31:16.645321 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
maj 29 14:31:16.645556 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
maj 29 14:31:21.324822 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command!
maj 29 14:31:21.325175 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset failed
maj 29 14:31:21.325412 buttran kernel: amdgpu 0000:03:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:03:00.0
maj 29 14:31:32.313042 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
maj 29 14:31:32.313591 buttran kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
maj 29 14:31:32.313647 buttran kernel: [drm] VRAM is lost due to GPU reset!
maj 29 14:31:32.313696 buttran kernel: [drm] PSP is resuming...
maj 29 14:31:33.432819 buttran kernel: [drm] failed to load ucode SMC(0x18)
maj 29 14:31:33.433024 buttran kernel: [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x80000306)
maj 29 14:31:33.433094 buttran kernel: [drm] reserve 0xa00000 from 0x81fe000000 for PSP TMR
maj 29 14:31:33.704818 buttran kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
maj 29 14:31:33.724824 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
maj 29 14:31:33.725379 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
maj 29 14:31:33.725741 buttran kernel: amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw version = 0x003b2800 (59.40.0)
maj 29 14:31:33.726087 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
maj 29 14:31:38.420820 buttran kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command!
maj 29 14:31:38.421166 buttran kernel: amdgpu 0000:03:00.0: amdgpu: Failed to SetDriverDramAddr!
maj 29 14:31:38.421336 buttran kernel: amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
maj 29 14:31:38.421499 buttran kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
maj 29 14:31:38.421526 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) failed
maj 29 14:31:38.440804 buttran kernel: snd_hda_intel 0000:03:00.1: refused to change power state from D3hot to D0
maj 29 14:31:38.541097 buttran kernel: snd_hda_intel 0000:03:00.1: CORB reset timeout#2, CORBRP = 65535
maj 29 14:31:38.541573 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset end with ret = -62
maj 29 14:31:48.584822 buttran kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=8987, emitted seq=8987
maj 29 14:31:48.585020 buttran kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
maj 29 14:31:48.585079 buttran kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
maj 29 14:35:31.556820 buttran kernel: INFO: task kworker/1:0:41074 blocked for more than 120 seconds.
maj 29 14:35:31.557014 buttran kernel: Not tainted 5.15.0-33-generic #34-Ubuntu
maj 29 14:35:31.557062 buttran kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
maj 29 14:35:31.557102 buttran kernel: task:kworker/1:0 state:D stack: 0 pid:41074 ppid: 2 flags:0x00004000
maj 29 14:35:31.557152 buttran kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
maj 29 14:35:31.557185 buttran kernel: Call Trace:
maj 29 14:35:31.557218 buttran kernel: <TASK>
maj 29 14:35:31.557248 buttran kernel: __schedule+0x23d/0x590
maj 29 14:35:31.557278 buttran kernel: schedule+0x4e/0xb0
maj 29 14:35:31.557308 buttran kernel: schedule_timeout+0xfb/0x140
maj 29 14:35:31.557344 buttran kernel: ? task_rq_lock+0x5f/0x150
maj 29 14:35:31.557377 buttran kernel: dma_fence_default_wait+0x1c4/0x1f0
maj 29 14:35:31.557417 buttran kernel: ? dma_fence_free+0x20/0x20
maj 29 14:35:31.557447 buttran kernel: dma_fence_wait_timeout+0xb7/0xd0
maj 29 14:35:31.557483 buttran kernel: drm_sched_stop+0xfc/0x170 [gpu_sched]
maj 29 14:35:31.557520 buttran kernel: amdgpu_device_gpu_recover.cold+0x85a/0x8f8 [amdgpu]
maj 29 14:35:31.557555 buttran kernel: amdgpu_job_timedout+0x14f/0x170 [amdgpu]
maj 29 14:35:31.557584 buttran kernel: drm_sched_job_timedout+0x6f/0x110 [gpu_sched]
maj 29 14:35:31.557614 buttran kernel: process_one_work+0x22b/0x3d0
maj 29 14:35:31.557712 buttran kernel: worker_thread+0x53/0x410
maj 29 14:35:31.557747 buttran kernel: ? process_one_work+0x3d0/0x3d0
maj 29 14:35:31.557783 buttran kernel: kthread+0x12a/0x150
maj 29 14:35:31.557823 buttran kernel: ? set_kthread_struct+0x50/0x50
maj 29 14:35:31.557857 buttran kernel: ret_from_fork+0x22/0x30
maj 29 14:35:31.557890 buttran kernel: </TASK>
I have a similar situation: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
I'm using a 6700 XT.
I think the source of the problem is probably here:
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L9292
It looks like deadlock-prevention code with a hardcoded timeout of 5000 ms.
So I guess that setting the kernel parameter amdgpu.lockup_timeout=4990 (less than 5000 ms) might get around this problem.
I've been testing this for 3 days and the freeze hasn't happened again, so I hope it works for you :) @cjfgraff
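After adding the parameter and rebooting, it is worth confirming that it actually reached the kernel. Below is a minimal sketch, assuming the parameter was passed on the kernel command line and therefore shows up in /proc/cmdline. The helper names and the example command line are hypothetical and only for illustration. As far as I know, lockup_timeout can also be given as a comma-separated list of per-queue values, so the sketch returns the raw string rather than an integer.

```python
import shlex


def get_cmdline_param(cmdline, name):
    """Return the value of `name` from a kernel command line string, or None."""
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, used here so the example runs anywhere.
example = "BOOT_IMAGE=/vmlinuz root=UUID=abcd ro quiet splash amdgpu.lockup_timeout=4990"
print(get_cmdline_param(example, "amdgpu.lockup_timeout"))  # prints 4990
```

On the affected machine, get_cmdline_param(read_cmdline(), "amdgpu.lockup_timeout") returning None would mean the parameter was not applied at boot.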
I also had a similar problem. My laptop froze and even REISUB did not help.
I have a similar report in the logs:
14:16:27.894416 pop-os kernel: INFO: task kworker/u32:11:8569 blocked for more than 120 seconds.
14:16:27.902946 pop-os kernel: Tainted: P OE 6.5.6-76060506-generic #202310061235~1697396945~22.04~9283e32
14:16:27.903001 pop-os kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:16:27.903034 pop-os kernel: task:kworker/u32:11 state:D stack:0 pid:8569 ppid:2 flags:0x00004000
14:16:27.903069 pop-os kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
14:16:27.903098 pop-os kernel: Call Trace:
14:16:27.903124 pop-os kernel: <TASK>
14:16:27.903153 pop-os kernel: __schedule+0x2cc/0x750
14:16:27.903181 pop-os kernel: schedule+0x63/0x110
14:16:27.903209 pop-os kernel: schedule_timeout+0x157/0x170
14:16:27.903238 pop-os kernel: dma_fence_default_wait+0x13d/0x210
14:16:27.903272 pop-os kernel: ? __pfx_dma_fence_default_wait_cb+0x10/0x10
14:16:27.903301 pop-os kernel: dma_fence_wait_timeout+0x116/0x140
14:16:27.903330 pop-os kernel: drm_atomic_helper_wait_for_fences+0x172/0x200 [drm_kms_helper]
14:16:27.903359 pop-os kernel: ? srso_alias_return_thunk+0x5/0x7f
14:16:27.903387 pop-os kernel: commit_tail+0x3c/0x190 [drm_kms_helper]
14:16:27.903415 pop-os kernel: ? __schedule+0x2d4/0x750
14:16:27.903443 pop-os kernel: commit_work+0x12/0x20 [drm_kms_helper]
14:16:27.903471 pop-os kernel: process_one_work+0x240/0x450
14:16:27.903499 pop-os kernel: worker_thread+0x50/0x3f0
14:16:27.903528 pop-os kernel: ? __pfx_worker_thread+0x10/0x10
14:16:27.903561 pop-os kernel: kthread+0xf2/0x120
14:16:27.903590 pop-os kernel: ? __pfx_kthread+0x10/0x10
14:16:27.903618 pop-os kernel: ret_from_fork+0x47/0x70
14:16:27.903646 pop-os kernel: ? __pfx_kthread+0x10/0x10
14:16:27.903669 pop-os kernel: ret_from_fork_asm+0x1b/0x30
14:16:27.903698 pop-os kernel: </TASK>
14:16:27.903723 pop-os kernel: INFO: task kworker/u32:13:8570 blocked for more than 120 seconds.
14:16:27.903754 pop-os kernel: Tainted: P OE 6.5.6-76060506-generic #202310061235~1697396945~22.04~9283e32
14:16:27.903778 pop-os kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:16:27.903803 pop-os kernel: task:kworker/u32:13 state:D stack:0 pid:8570 ppid:2 flags:0x00004000
14:16:27.903832 pop-os kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
14:16:27.903859 pop-os kernel: Call Trace:
14:16:27.903883 pop-os kernel: <TASK>
14:16:27.903908 pop-os kernel: __schedule+0x2cc/0x750
14:16:27.903931 pop-os kernel: schedule+0x63/0x110
14:16:27.903951 pop-os kernel: schedule_timeout+0x157/0x170
14:16:27.903975 pop-os kernel: dma_fence_default_wait+0x13d/0x210
14:16:27.904000 pop-os kernel: ? __pfx_dma_fence_default_wait_cb+0x10/0x10
14:16:27.904028 pop-os kernel: dma_fence_wait_timeout+0x116/0x140
14:16:27.904051 pop-os kernel: drm_atomic_helper_wait_for_fences+0x172/0x200 [drm_kms_helper]
14:16:27.904074 pop-os kernel: ? srso_alias_return_thunk+0x5/0x7f
14:16:27.904103 pop-os kernel: commit_tail+0x3c/0x190 [drm_kms_helper]
14:16:27.904127 pop-os kernel: ? __schedule+0x2d4/0x750
14:16:27.904152 pop-os kernel: commit_work+0x12/0x20 [drm_kms_helper]
14:16:27.904174 pop-os kernel: process_one_work+0x240/0x450
14:16:27.904197 pop-os kernel: worker_thread+0x50/0x3f0
14:16:27.904220 pop-os kernel: ? srso_alias_return_thunk+0x5/0x7f
14:16:27.904243 pop-os kernel: ? __pfx_worker_thread+0x10/0x10
14:16:27.904267 pop-os kernel: kthread+0xf2/0x120
14:16:27.904291 pop-os kernel: ? __pfx_kthread+0x10/0x10
14:16:27.904310 pop-os kernel: ret_from_fork+0x47/0x70
14:16:27.904333 pop-os kernel: ? __pfx_kthread+0x10/0x10
14:16:27.904362 pop-os kernel: ret_from_fork_asm+0x1b/0x30
14:16:27.904387 pop-os kernel: </TASK>
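When comparing reports like the one above across machines, it can help to reduce each hung-task block to the blocked task and the workqueue function it was running. The following is a rough sketch, not something from the original report: the function name and regular expressions are assumptions based only on the message wording shown in these logs, and other kernel versions may word them differently.

```python
import re

# Patterns follow the wording of the hung-task messages in the logs above.
HUNG_TASK = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
WORKQUEUE = re.compile(r"Workqueue: (?P<wq>\S+) (?P<fn>\S+)")


def hung_tasks(lines):
    """Collect one dict per hung-task report found in an iterable of log lines."""
    reports = []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            reports.append({
                "task": m["task"],
                "pid": int(m["pid"]),
                "seconds": int(m["secs"]),
                "workqueue": None,
                "function": None,
            })
            continue
        m = WORKQUEUE.search(line)
        # Attach the first Workqueue line that follows a report to that report.
        if m and reports and reports[-1]["workqueue"] is None:
            reports[-1]["workqueue"] = m["wq"]
            reports[-1]["function"] = m["fn"]
    return reports


sample = [
    "14:16:27.894416 pop-os kernel: INFO: task kworker/u32:11:8569 blocked for more than 120 seconds.",
    "14:16:27.903069 pop-os kernel: Workqueue: events_unbound commit_work [drm_kms_helper]",
]
for report in hung_tasks(sample):
    print(report["task"], report["pid"], report["function"])
```

For the two reports above, both tasks are blocked in commit_work, while the earlier Ubuntu log is blocked in drm_sched_job_timedout, so the two freezes stall in different places even though both end up waiting on a fence.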
My machine: Asus ROG Flow X13 2022, AMD Ryzen™ 7 6800HS, RTX 3050Ti.
Should I report this problem anywhere else?
Distribution (run cat /etc/os-release):
NAME="Pop!_OS"
VERSION="21.10"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 21.10"
VERSION_ID="21.10"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=impish
UBUNTU_CODENAME=impish
LOGO=distributor-logo-pop-os
Related Application and/or Package Version (run apt policy $PACKAGE NAME): N/A
Issue/Bug Description: System randomly freezes (everything gets locked out including keyboard), cursor still moves but I can't click on anything, hard reset is the only way to get the system working again.
Steps to reproduce (if you know): It's very random
Expected behavior: System to work as intended
Other Notes: I have provided both trimmed and full version of the journalctl log
Trimmed log:
Full log