AsahiLinux / linux

Linux kernel source tree
Other
2.36k stars 90 forks source link

GPU slowness accompanied by "asahi: WorkQueue: Cannot submit, but queue is empty?" #324

Closed xal-0 closed 1 month ago

xal-0 commented 2 months ago

Overview

I do not know how to reproduce the problem reliably at this point, but this is the pattern:

  1. I hit the shortcut for krunner
  2. For 1-15 seconds, krunner does not appear. I get maybe 2 frames per second while this is happening. During this time:
    • An asahi_sched kworker pins a cpu to 100%
    • About 30 messages per second are logged to dmesg (rest of the messages):
      [14408.444527] asahi: WorkQueue: Cannot submit, but queue is empty? 0 > 1, 1279 > 2 (pend=0 ls=Some(
                 EventValue(
                     0x3a8f00,
                 ),
             ) lc=Some(
                 EventValue(
                     0x3a1000,
                 ),
             ))
      [14408.473324] asahi: WorkQueue: Cannot submit, but queue is empty? 1 > 1, 1279 > 2 (pend=0 ls=Some(
                 EventValue(
                     0x3a8f00,
                 ),
             ) lc=Some(
                 EventValue(
                     0x3a1100,
                 ),
             ))
  3. At some point, the problem clears and krunner opens like normal.

I reckon it happens maybe 10% of the times I open krunner, but I can't get it to happen by killing and relaunching it many times in a row.

Perf

Here's what I got from perf when it was happening one time:

    66.90%  QSGRenderThread  [kernel.kallsyms]              [k] mutex_spin_on_owner
            |
            ---mutex_spin_on_owner
               __mutex_lock_slowpath
               mutex_lock
               |          
               |--61.42%--<asahi[c4d2e9c82ab194e]::queue::SubQueueJobG14V13_5>::get
               |          |          
               |          |--32.53%--<asahi[c4d2e9c82ab194e]::queue::QueueInnerG14V13_5>::submit_render
               |          |          <asahi[c4d2e9c82ab194e]::queue::QueueG14V13_5 as asahi[c4d2e9c82ab194e]::queue::Queue>::submit
               |          |          <asahi[c4d2e9c82ab194e]::driver::AsahiDriver as kernel[2bea4e1ed0af1e4]::drm::drv::Driver>::IOCTLS::ASAHI_SUBMIT
               |          |          drm_ioctl_kernel
               |          |          drm_ioctl
               |          |          __arm64_sys_ioctl
               |          |          invoke_syscall.constprop.0
               |          |          do_el0_svc
               |          |          el0_svc
               |          |          el0t_64_sync_handler
               |          |          el0t_64_sync
               |          |          __GI___ioctl
               |          |          drmIoctl
               |          |          agx_batch_submit
               |          |          agx_batch_submit
               |          |          
               |           --28.88%--<asahi[c4d2e9c82ab194e]::queue::QueueG14V13_5 as asahi[c4d2e9c82ab194e]::queue::Queue>::submit
               |                     <asahi[c4d2e9c82ab194e]::driver::AsahiDriver as kernel[2bea4e1ed0af1e4]::drm::drv::Driver>::IOCTLS::ASAHI_SUBMIT
               |                     drm_ioctl_kernel
               |                     drm_ioctl
               |                     __arm64_sys_ioctl
               |                     invoke_syscall.constprop.0
               |                     do_el0_svc
               |                     el0_svc
               |                     el0t_64_sync_handler
               |                     el0t_64_sync
               |                     __GI___ioctl
               |                     drmIoctl
               |                     agx_batch_submit
               |                     agx_batch_submit
               |          
               |--3.68%--<asahi[c4d2e9c82ab194e]::gpu::GpuManagerG14V13_5 as asahi[c4d2e9c82ab194e]::gpu::GpuManager>::alloc
               |          |          
               |           --3.28%--<asahi[c4d2e9c82ab194e]::queue::QueueInnerG14V13_5>::submit_render
               |                     <asahi[c4d2e9c82ab194e]::queue::QueueG14V13_5 as asahi[c4d2e9c82ab194e]::queue::Queue>::submit
               |                     <asahi[c4d2e9c82ab194e]::driver::AsahiDriver as kernel[2bea4e1ed0af1e4]::drm::drv::Driver>::IOCTLS::ASAHI_SUBMIT
               |                     drm_ioctl_kernel
               |                     drm_ioctl
               |                     __arm64_sys_ioctl
               |                     invoke_syscall.constprop.0
               |                     do_el0_svc
               |                     el0_svc
               |                     el0t_64_sync_handler
               |                     el0t_64_sync
               |                     __GI___ioctl
               |                     drmIoctl
               |                     agx_batch_submit
               |                     agx_batch_submit
               |          
                --1.80%--<asahi[c4d2e9c82ab194e]::queue::SubQueueJobG13V12_3>::commit
                          <asahi[c4d2e9c82ab194e]::queue::QueueJobG13V12_3>::commit
                          <asahi[c4d2e9c82ab194e]::queue::QueueG14V13_5 as asahi[c4d2e9c82ab194e]::queue::Queue>::submit
                          <asahi[c4d2e9c82ab194e]::driver::AsahiDriver as kernel[2bea4e1ed0af1e4]::drm::drv::Driver>::IOCTLS::ASAHI_SUBMIT
                          drm_ioctl_kernel
                          drm_ioctl
                          __arm64_sys_ioctl
                          invoke_syscall.constprop.0
                          do_el0_svc
                          el0_svc
                          el0t_64_sync_handler
                          el0t_64_sync
                          __GI___ioctl
                          drmIoctl
                          agx_batch_submit
                          agx_batch_submit

Version info

argonaut0 commented 2 months ago

I can confirm that I also run into this on my machine.

Didn't investigate until one time it froze the entire screen completely and I had to reboot by holding down the power button.

I don't think I was using krunner at the time, just tabbing between windows.

Reading journalctl -r after the fact logged these messages many times a second. I also had these ones about key repeat discard, unsure if related or just caused by the slowness.

-- Boot 072b439fe87e4cd1bfacd1d7cf5e2e49 --
Sep 10 23:15:45 macbook kernel: macsmc-hid macsmc-hid: Triggering forced shutdown!
Sep 10 23:15:45 macbook kernel: macsmc-rtkit 23e400000.smc: RTKit: syslog message: apComms.cpp:375: SMC HID Event: 01 00 01
Sep 10 23:15:45 macbook kwin_wayland_wrapper[1724]: Key repeat discarded, Wayland compositor doesn't seem to be processing events fast enough!
Sep 10 23:15:45 macbook kernel: asahi: WorkQueue: Cannot submit, but queue is empty? 1 > 1, 1279 > 2 (pend=0 ls=Some(
                                    EventValue(
                                        0x3cd00,
                                    ),
                                ) lc=Some(
                                    EventValue(
                                        0x34f00,
                                    ),
                                ))
Sep 10 23:15:45 macbook kernel: asahi: WorkQueue: Cannot submit, but queue is empty? 1 > 1, 1279 > 2 (pend=0 ls=Some(
                                    EventValue(
                                        0x3cd00,
                                    ),
                                ) lc=Some(
                                    EventValue(
                                        0x34f00,
                                    ),
                                ))
Sep 10 23:15:45 macbook kwin_wayland_wrapper[1724]: Key repeat discarded, Wayland compositor doesn't seem to be processing events fast enough!
Sep 10 23:15:45 macbook kernel: asahi: WorkQueue: Cannot submit, but queue is empty? 1 > 1, 1279 > 2 (pend=0 ls=Some(
                                    EventValue(
                                        0x3cc00,
                                    ),
                                ) lc=Some(
                                    EventValue(
                                        0x34e00,
                                    ),
                                ))

Other incidental error messages logged:

Sep 10 23:15:38 macbook kwin_wayland[1611]: kwin_libinput: Libinput: event0  - Apple SMC power/lid events: client bug: event processing lagging behind by 28ms, your system is too slow
Sep 10 23:15:27 macbook kwin_wayland[1611]: kwin_libinput: Libinput: event1  - Apple MTP keyboard: client bug: event processing lagging behind by 24ms, your system is too slow
Sep 10 23:08:59 macbook kwin_wayland[1611]: kwin_libinput: Libinput: event2  - Apple MTP multi-touch: kernel bug: Touch jump detected and discarded.
                                            See https://wayland.freedesktop.org/libinput/doc/1.26.1/touchpad-jumping-cursors.html for details
(^funnily this seems to be a dead link)

kernel 6.10.6-asahi Mesa 24.2.0-devel kde plasma 6.1.4

(lol hi sam)

mogery commented 2 months ago

Same issue here! It causes huge system-level stutters for me (audio, mouse moves slowly, plasma frame drops).

[38112.618826] asahi: WorkQueue: Cannot submit, but queue is empty? 0 > 1, 1279 > 2 (pend=0 ls=Some(
                   EventValue(
                       0x19df00,
                   ),
               ) lc=Some(
                   EventValue(
                       0x196000,
                   ),
               ))
[38112.646744] asahi: WorkQueue: Cannot submit, but queue is empty? 0 > 1, 1279 > 2 (pend=0 ls=Some(
                   EventValue(
                       0x19d000,
                   ),
               ) lc=Some(
                   EventValue(
                       0x195100,
                   ),
               ))
[38112.675567] asahi: WorkQueue: Cannot submit, but queue is empty? 0 > 1, 1279 > 2 (pend=0 ls=Some(
                   EventValue(
                       0x19e000,
                   ),
               ) lc=Some(
                   EventValue(
                       0x196100,
                   ),
               ))
xal-0 commented 1 month ago

The problems seems to have disappeared in a recent version of either linux-asahi, mesa-asahi, or KDE:

These versions aren't in nixos-apple-silicon yet. Could someone else confirm if the problem is fixed?

alyssarosenzweig commented 1 month ago

I believe Lina fixed this