Closed oliverbestmann closed 1 month ago
Since updating from 6.9.5 to 6.9.6 (and 6.9.9), I get random GPU/DRM-related crashes after a few minutes of usage.
Jul 15 10:20:18 m1pro kernel: ------------[ cut here ]------------
Jul 15 10:20:18 m1pro kernel: asahi 406400000.gpu: Jobs may not exceed the credit limit, truncate.
Jul 15 10:20:18 m1pro kernel: WARNING: CPU: 0 PID: 15794 at drivers/gpu/drm/scheduler/sched_main.c:140 drm_sched_can_queue+0x110/0x168
Jul 15 10:20:18 m1pro kernel: Modules linked in: uinput xt_conntrack nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat nf_tables qrtr rfcomm snd_seq_dummy snd_hrtimer snd_seq usbhid cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet mii snd_usb_audio snd_h>
Jul 15 10:20:18 m1pro kernel: nvmem_spmi_mfd rtc_macsmc gpio_macsmc spi_hid_apple_of simple_mfd_spmi tps6598x spi_hid_apple regmap_spmi dwc3 pcie_apple udc_core pci_host_common nvme_apple i2c_pasemi_platform spi_apple i2c_pasemi_core apple_sart macsmc_rtkit nvmem_appl>
Jul 15 10:20:18 m1pro kernel: CPU: 0 PID: 15794 Comm: chromium Tainted: G S W 6.9.9-asahi #1-NixOS
Jul 15 10:20:18 m1pro kernel: Hardware name: Apple MacBook Pro (14-inch, M1 Pro, 2021) (DT)
Jul 15 10:20:18 m1pro kernel: pstate: 61401009 (nZCv daif +PAN -UAO -TCO +DIT +SSBS BTYPE=--)
Jul 15 10:20:18 m1pro kernel: pc : drm_sched_can_queue+0x110/0x168
Jul 15 10:20:18 m1pro kernel: lr : drm_sched_can_queue+0x110/0x168
Jul 15 10:20:18 m1pro kernel: sp : ffff800090397440
Jul 15 10:20:18 m1pro kernel: x29: ffff800090397440 x28: 0000000000000030 x27: ffff000014ad5000
Jul 15 10:20:18 m1pro kernel: x26: ffff80007a55d948 x25: 0000000000000000 x24: ffff000139b5dc00
Jul 15 10:20:18 m1pro kernel: x23: ffff800090397888 x22: ffff000139b5cb38 x21: ffff0005be57f5d8
Jul 15 10:20:18 m1pro kernel: x20: ffff00013bfb1c08 x19: ffff00013bfb1c08 x18: 0000000000000000
Jul 15 10:20:18 m1pro kernel: x17: 0000000000000000 x16: 0000000000000000 x15: 6572632065687420
Jul 15 10:20:18 m1pro kernel: x14: 6465656378652074 x13: 0000000000000000 x12: 0000000000000000
Jul 15 10:20:18 m1pro kernel: x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
Jul 15 10:20:18 m1pro kernel: x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
Jul 15 10:20:18 m1pro kernel: x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
Jul 15 10:20:18 m1pro kernel: x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
Jul 15 10:20:18 m1pro kernel: Call trace:
Jul 15 10:20:18 m1pro kernel: drm_sched_can_queue+0x110/0x168
Jul 15 10:20:18 m1pro kernel: drm_sched_wakeup+0x18/0x7c
Jul 15 10:20:18 m1pro kernel: drm_sched_entity_push_job+0x174/0x1e8
Jul 15 10:20:18 m1pro kernel: _RNvXsK_NtCsirMamryJlsQ_5asahi5queueNtB5_13QueueG13V13_5NtB5_5Queue6submit+0x12d8/0x1578 [asahi]
Jul 15 10:20:18 m1pro kernel: _RNvNvXs_NtCsirMamryJlsQ_5asahi6driverNtB6_11AsahiDriverNtNtNtCsc1LFWrxnNA7_6kernel3drm3drv6Driver6IOCTLS12ASAHI_SUBMIT+0x648/0x840 [asahi]
Jul 15 10:20:18 m1pro kernel: drm_ioctl_kernel+0xd4/0x13c
Jul 15 10:20:18 m1pro kernel: drm_ioctl+0x23c/0x4e4
Jul 15 10:20:18 m1pro kernel: __arm64_sys_ioctl+0xc0/0x118
Jul 15 10:20:18 m1pro kernel: invoke_syscall.constprop.0+0x50/0x124
Jul 15 10:20:18 m1pro kernel: do_el0_svc+0x40/0xf0
Jul 15 10:20:18 m1pro kernel: el0_svc+0x34/0x11c
Jul 15 10:20:18 m1pro kernel: el0t_64_sync_handler+0x140/0x14c
Jul 15 10:20:18 m1pro kernel: el0t_64_sync+0x190/0x194
Jul 15 10:20:18 m1pro kernel: ---[ end trace 0000000000000000 ]---
Jul 15 10:20:18 m1pro kernel: Unable to handle kernel paging request at virtual address 006120492079636d
Jul 15 10:20:18 m1pro kernel: Mem abort info:
Jul 15 10:20:18 m1pro kernel: ESR = 0x0000000096000004
Jul 15 10:20:18 m1pro kernel: EC = 0x25: DABT (current EL), IL = 32 bits
Jul 15 10:20:18 m1pro kernel: SET = 0, FnV = 0
Jul 15 10:20:18 m1pro kernel: EA = 0, S1PTW = 0
Jul 15 10:20:18 m1pro kernel: FSC = 0x04: level 0 translation fault
Jul 15 10:20:18 m1pro kernel: Data abort info:
Jul 15 10:20:18 m1pro kernel: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
Jul 15 10:20:18 m1pro kernel: CM = 0, WnR = 0, TnD = 0, TagAccess = 0
Jul 15 10:20:18 m1pro kernel: GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
Jul 15 10:20:18 m1pro kernel: [006120492079636d] address between user and kernel address ranges
Jul 15 10:20:18 m1pro kernel: Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
Jul 15 10:20:18 m1pro kernel: Modules linked in: uinput xt_conntrack nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat nf_tables qrtr rfcomm snd_seq_dummy snd_hrtimer snd_seq usbhid cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet mii snd_usb_audio snd_h>
Jul 15 10:20:18 m1pro kernel: nvmem_spmi_mfd rtc_macsmc gpio_macsmc spi_hid_apple_of simple_mfd_spmi tps6598x spi_hid_apple regmap_spmi dwc3 pcie_apple udc_core pci_host_common nvme_apple i2c_pasemi_platform spi_apple i2c_pasemi_core apple_sart macsmc_rtkit nvmem_appl>
Jul 15 10:20:18 m1pro kernel: CPU: 0 PID: 15794 Comm: chromium Tainted: G S W 6.9.9-asahi #1-NixOS
Jul 15 10:20:18 m1pro kernel: Hardware name: Apple MacBook Pro (14-inch, M1 Pro, 2021) (DT)
Jul 15 10:20:18 m1pro kernel: pstate: 21401009 (nzCv daif +PAN -UAO -TCO +DIT +SSBS BTYPE=--)
Jul 15 10:20:18 m1pro kernel: pc : __kmalloc_node_track_caller+0xec/0x2bc
Jul 15 10:20:18 m1pro kernel: lr : __kmalloc_node_track_caller+0x98/0x2bc
Jul 15 10:20:18 m1pro kernel: sp : ffff800090395d40
Jul 15 10:20:18 m1pro kernel: x29: ffff800090395d50 x28: 00000000ffffffa0 x27: ffff000639ee3280
Jul 15 10:20:18 m1pro kernel: x26: ffffffa00000c984 x25: 0000000000212a9c x24: 0000000000000000
Jul 15 10:20:18 m1pro kernel: x23: 736120492079616d x22: 00000000ffffffff x21: 0000000000000cc0
Jul 15 10:20:18 m1pro kernel: x20: ffff000001f2cb00 x19: 0000000000000318 x18: 00000000000000ff
Jul 15 10:20:18 m1pro kernel: x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
Jul 15 10:20:18 m1pro kernel: x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
Jul 15 10:20:18 m1pro kernel: x11: 00000000ffffffa0 x10: 0000000000000008 x9 : ffffffffffffffff
Jul 15 10:20:18 m1pro kernel: x8 : c98580007a45d9c4 x7 : 0000000000000cc0 x6 : 0000000000000318
Jul 15 10:20:18 m1pro kernel: x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000064ce340
Jul 15 10:20:18 m1pro kernel: x2 : 0000000000000200 x1 : 736120492079616d x0 : ffff000001f2cb00
Jul 15 10:20:18 m1pro kernel: Call trace:
Jul 15 10:20:18 m1pro kernel: __kmalloc_node_track_caller+0xec/0x2bc
Jul 15 10:20:18 m1pro kernel: krealloc+0x9c/0x144
Jul 15 10:20:18 m1pro kernel: _RINvNtCsKOPqOvr6FN_5alloc7raw_vec11finish_growNtNtB4_5alloc6GlobalECsirMamryJlsQ_5asahi+0x44/0xac [asahi]
Jul 15 10:20:18 m1pro kernel: _RNvMs0_NtCsKOPqOvr6FN_5alloc3vecINtB5_3VechE21try_extend_from_sliceCsirMamryJlsQ_5asahi+0xc8/0x13c [asahi]
Jul 15 10:20:18 m1pro kernel: _RINvMs8_NtCsirMamryJlsQ_5asahi6objectINtB6_9GpuObjectNtNtNtB8_2fw6vertex17RunVertexG13V13_5INtNtB8_5alloc12GenericAllocBP_NtB1u_14HeapAllocationEE17new_init_preallocINtNtNtCsc1LFWrxnNA7_6kernel4init10___internal11InitClosureNCNCNvMs1_NtN>
Jul 15 10:20:18 m1pro kernel: _RNvMs1_NtNtCsirMamryJlsQ_5asahi5queue6renderNtB7_18QueueInnerG13V13_513submit_render+0x1ba8/0x1dd0 [asahi]
Jul 15 10:20:18 m1pro kernel: _RNvXsK_NtCsirMamryJlsQ_5asahi5queueNtB5_13QueueG13V13_5NtB5_5Queue6submit+0xf74/0x1578 [asahi]
Jul 15 10:20:18 m1pro kernel: _RNvNvXs_NtCsirMamryJlsQ_5asahi6driverNtB6_11AsahiDriverNtNtNtCsc1LFWrxnNA7_6kernel3drm3drv6Driver6IOCTLS12ASAHI_SUBMIT+0x648/0x840 [asahi]
Jul 15 10:20:18 m1pro kernel: drm_ioctl_kernel+0xd4/0x13c
Jul 15 10:20:18 m1pro kernel: drm_ioctl+0x23c/0x4e4
Jul 15 10:20:18 m1pro kernel: __arm64_sys_ioctl+0xc0/0x118
Jul 15 10:20:18 m1pro kernel: invoke_syscall.constprop.0+0x50/0x124
Jul 15 10:20:18 m1pro kernel: do_el0_svc+0x40/0xf0
Jul 15 10:20:18 m1pro kernel: el0_svc+0x34/0x11c
Jul 15 10:20:18 m1pro kernel: el0t_64_sync_handler+0x140/0x14c
Jul 15 10:20:18 m1pro kernel: el0t_64_sync+0x190/0x194
Jul 15 10:20:18 m1pro kernel: Code: 54000c20 b9402a82 aa1703e1 aa1403e0 (f8626af9)
Jul 15 10:20:18 m1pro kernel: ---[ end trace 0000000000000000 ]---
Jul 15 10:20:18 m1pro kernel: Unable to handle kernel paging request at virtual address 006120492079636d
Jul 15 10:20:18 m1pro kernel: Mem abort info:
Jul 15 10:20:18 m1pro kernel: ESR = 0x0000000096000004
Jul 15 10:20:18 m1pro kernel: EC = 0x25: DABT (current EL), IL = 32 bits
Jul 15 10:20:18 m1pro kernel: SET = 0, FnV = 0
Jul 15 10:20:18 m1pro kernel: EA = 0, S1PTW = 0
Jul 15 10:20:18 m1pro kernel: FSC = 0x04: level 0 translation fault
Jul 15 10:20:18 m1pro kernel: Data abort info:
Jul 15 10:20:18 m1pro kernel: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
Jul 15 10:20:18 m1pro kernel: CM = 0, WnR = 0, TnD = 0, TagAccess = 0
Jul 15 10:20:18 m1pro kernel: GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
Jul 15 10:20:18 m1pro kernel: [006120492079636d] address between user and kernel address ranges
Jul 15 10:20:18 m1pro kernel: Internal error: Oops: 0000000096000004 [#2] PREEMPT SMP
Going back to 6.9.5 brings back a stable system.
The fix is in asahi-6.11-2 and later 6.11 stable kernels, and the issue has not resurfaced there.