canonical / mir

The Mir compositor
GNU General Public License v2.0
621 stars 99 forks source link

Excessive CPU usage #3230

Open AlanGriffiths opened 6 months ago

AlanGriffiths commented 6 months ago

There's intermittent high CPU usage of Miriway, seen across stable (Mir 2.16.3) beta and edge (Mir/main)

So far, I've only seen it when docked to evdi (DisplayLink) outputs. And there's another report against Miracle-mw

At first it seemed very consistent when switching between versions, but I've now been running for over a week without problems with the version I first saw it on.


[Original report] There's a big difference in CPU usage between miriway/{stable, beta} and miriway/edge when docked using DisplayLink and my usual collection of comms and development apps open on various workspaces.

With stable, miriway-shell idles at around 1%CPU, (with spikes up to around 100%) With edge, miriway-shell idles at around 35%CPU, (with spikes up to around 190%)

Likely to be Mir as the difference is between Mir/2.16.3 and Mir/main

Saviq commented 6 months ago

Looking at the diff the only two that feel like they could be it?:

Alternatively, dependency changes? Could rebuild https://launchpad.net/~mir-team/+snap/miriway-beta as a data point.

AlanGriffiths commented 6 months ago

I don't know why you'd pick those. In particular, https://github.com/canonical/mir/pull/3203 was cherry-picked into 2.16.3.

There are a lot that could have an effect. These look plausible:

AlanGriffiths commented 6 months ago

Could rebuild https://launchpad.net/~mir-team/+snap/miriway-beta as a data point.

Yeah, seems like a long shot, but worth confirming. (Odd that this doesn't show any builds - edge does)

OTOH, I've installed the rebuilt beta and now have the high CPU usage again.

AlanGriffiths commented 6 months ago

I have also tried #3231 (which shows the same symptom).

I've also tried killall yambar - which on the first attempt appeared to mitigate the behaviour, but I couldn't then reproduce.

Attempting to kill swaybg doesn't help - it restarts (or Miriway restarts it)

AlanGriffiths commented 6 months ago

And now I'm on stable with the same symptom.

Update: while I was writing the previous sentence, usage suddenly dropped to 1.7% and stayed in single digits.

AlanGriffiths commented 6 months ago

And now, on edge/pr88 I see the same thing - a few minutes of high CPU followed by a drop to something sane.

When I reported this, the problem was persistent beyond a few minutes.

It now seems that regardless of the channel there is a chance of getting into this "high CPU" mode, and a chance of getting out of it again. Maybe the difference between stable and other branches is luck, or a change in the odds.

(There is nothing in the logs to suggest what Miriway is doing with the CPU)

RAOF commented 6 months ago

Have you tried attaching perf record during a period of high CPU use too see what is eating CPU?

AlanGriffiths commented 6 months ago

Have you tried attaching perf record during a period of high CPU use too see what is eating CPU?

No, not "got serious" yet. So far just working on reproducing (and hoping it was a recent change that could be spotted).

AlanGriffiths commented 6 months ago

Compositing seems to be "the thing"...

   5.89%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb24a5067
   5.77%  Mir/Comp       libc.so.6                    [.] 0x000000000017d9a7
   3.72%  Mir/Comp       libc.so.6                    [.] 0x000000000017d997
   3.59%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb17e7b90
   3.33%  Mir/Comp       [drm]                        [k] drm_clflush_page
   3.08%  Mir/Comp       libc.so.6                    [.] 0x000000000017d970
   2.96%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb251cc9b
   2.55%  Mir/Comp       libc.so.6                    [.] 0x000000000017d960
   2.49%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb17bbc89
   2.04%  Mir/Comp       libc.so.6                    [.] 0x000000000017d94d
   1.80%  Mir/Comp       libc.so.6                    [.] 0x000000000017d956
   1.59%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb2601591
   1.58%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb2601149
   1.34%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb26011f7
   1.33%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb2601578
   1.28%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb17905a4
   1.10%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb181ba98
   0.94%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb2517f3e
   0.88%  Mir/Comp       libc.so.6                    [.] 0x000000000017d946
   0.85%  Mir/Comp       libc.so.6                    [.] 0x000000000017d93f
   0.71%  Mir/Comp       [drm]                        [k] drm_gem_put_pages
   0.67%  Mir/Comp       libc.so.6                    [.] 0x000000000017d923
   0.65%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb26011c6
   0.64%  Mir/Comp       libc.so.6                    [.] 0x000000000017d92a
   0.56%  Mir/Comp       [kernel.kallsyms]            [k] 0xffffffffb1818b2a
   0.39%  Mir/Comp       [evdi]                       [k] evdi_gem_fault
   0.38%  Mir/Comp       [i915]                       [k] fw_domains_get_with_fallback
   0.36%  miriway-shell  [kernel.kallsyms]            [k] 0xffffffffb251cc2b
...
AlanGriffiths commented 6 months ago

Or maybe not. I had high CPU again...

Samples: 39K of event 'cycles:P', Event count (approx.): 40956573981
Overhead  Command          Shared Object                                              Symbol
  70.62%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x471dd            d [.] miral::Window::application() const                                                                   ◆
  13.87%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50bfd            d [.] 0x0000000000050bfd                                                                                   ▒
  11.41%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50bc5            d [.] 0x0000000000050bc5                                                                                   ▒
   0.79%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b19            d [.] 0x0000000000050b19                                                                                   ▒
   0.75%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b04            d [.] 0x0000000000050b04                                                                                   ▒
   0.52%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b40            d [.] 0x0000000000050b40                                                                                   ▒
   0.50%  Mir/Input Reade  /snap/miriway/3011/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b96            d [.] 0x0000000000050b96         
AlanGriffiths commented 6 months ago

Maybe there are two high CPU modes. I have just seen:

Samples: 29K of event 'cycles:P', Event count (approx.): 22326713788
Overhead  Command        Shared Object
   5.35%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   4.16%  Mir/Comp       [kernel.kallsyms]
   3.79%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   3.49%  Mir/Comp       /lib/modules/6.5.0-18-generic/kernel/drivers/gpu/drm/drm.ko
   3.18%  Mir/Comp       [kernel.kallsyms]
   2.89%  Mir/Comp       [kernel.kallsyms]
   1.99%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   1.86%  Mir/Comp       [kernel.kallsyms]
   1.62%  Mir/Comp       [kernel.kallsyms]
   1.60%  Mir/Comp       [kernel.kallsyms]
   1.49%  Mir/Comp       [kernel.kallsyms]
   1.49%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   1.34%  Mir/Comp       [kernel.kallsyms]
   1.30%  Mir/Comp       /lib/modules/6.5.0-18-generic/kernel/drivers/gpu/drm/i915/i915.ko
   1.26%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   1.15%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   1.02%  Mir/Comp       [kernel.kallsyms]
   1.02%  Mir/Comp       [kernel.kallsyms]
   0.93%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   0.92%  Mir/Comp       [kernel.kallsyms]
   0.82%  Mir/Comp       /snap/core24/184/usr/lib/x86_64-linux-gnu/libc.so.6
   0.74%  Mir/Comp       [kernel.kallsyms]
   0.62%  miriway-shell  [kernel.kallsyms]

Which look more like the first of the two profiles above

AlanGriffiths commented 6 months ago

OK, as discussed on https://github.com/canonical/mir/issues/3066#issuecomment-1966683008 this is (where I've seen it) mostly a problem with compositing on the DisplayLink driver being CPU intensive.

For my purposes (software development rather than gaming or video playback) an effective mitigation is to add composite-delay=60 to the .config.

AlanGriffiths commented 6 months ago

OK, there are two modes. With the compositing mitigation in place I again got:

Samples: 39K of event 'cycles:P', Event count (approx.): 42343509567
Overhead  Command          Shared Object                                              Symbol
  62.78%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x471f6            d [.] miral::Window::application()
  29.23%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50bfd            d [.] 0x0000000000050bfd
   1.65%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b2a            d [.] 0x0000000000050b2a
   1.63%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50ba3            d [.] 0x0000000000050ba3
   1.56%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b40            d [.] 0x0000000000050b40
   1.46%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b62            d [.] 0x0000000000050b62
   1.18%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b0a            d [.] 0x0000000000050b0a
   0.31%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b19            d [.] 0x0000000000050b19
   0.07%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b04            d [.] 0x0000000000050b04
   0.04%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50c04            d [.] 0x0000000000050c04
   0.01%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b47            d [.] 0x0000000000050b47
   0.01%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f601149 ! [k] 0xffffffff8f601149
   0.01%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f6011f7 ! [k] 0xffffffff8f6011f7
   0.01%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b0f            d [.] 0x0000000000050b0f
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f51cc9b ! [k] 0xffffffff8f51cc9b
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e604000 ! [k] 0xffffffff8e604000
   0.00%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b2f            d [.] 0x0000000000050b2f
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e571a37 ! [k] 0xffffffff8e571a37
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e4bb564 ! [k] 0xffffffff8e4bb564
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5c0875 ! [k] 0xffffffff8e5c0875
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f600eb0 ! [k] 0xffffffff8f600eb0
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e58e17b ! [k] 0xffffffff8e58e17b
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e4a0475 ! [k] 0xffffffff8e4a0475
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e565335 ! [k] 0xffffffff8e565335
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e551815 ! [k] 0xffffffff8e551815
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f6011c9 ! [k] 0xffffffff8f6011c9
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5ec409 ! [k] 0xffffffff8e5ec409
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5428fa ! [k] 0xffffffff8e5428fa
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f517f3e ! [k] 0xffffffff8f517f3e
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5f04cf ! [k] 0xffffffff8e5f04cf
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e57128b ! [k] 0xffffffff8e57128b
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5327dd ! [k] 0xffffffff8e5327dd
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5eee40 ! [k] 0xffffffff8e5eee40
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5327d9 ! [k] 0xffffffff8e5327d9
   0.00%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b1e            d [.] 0x0000000000050b1e
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5428de ! [k] 0xffffffff8e5428de
   0.00%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50b69            d [.] 0x0000000000050b69
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e54b9c9 ! [k] 0xffffffff8e54b9c9
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e76c0dd ! [k] 0xffffffff8e76c0dd
   0.00%  Mir/Input Reade  /snap/miriway/3180/usr/lib/x86_64-linux-gnu/libmiral.so.7  0x50bb0            d [.] 0x0000000000050bb0
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8f601578 ! [k] 0xffffffff8f601578
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e771fce ! [k] 0xffffffff8e771fce
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e5c648e ! [k] 0xffffffff8e5c648e
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e4bb566 ! [k] 0xffffffff8e4bb566
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e45e671 ! [k] 0xffffffff8e45e671
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e45e67b ! [k] 0xffffffff8e45e67b
   0.00%  Mir/Input Reade  [kernel.kallsyms]                                          0xffffffff8e4ac136 ! [k] 0xffffffff8e4ac136
AlanGriffiths commented 5 months ago

I've tried switching back to GNOME for a while, and that definitely doesn't use as much CPU when using DisplayLink.

AlanGriffiths commented 3 months ago

A (new) manifestation:

After upgrading to 24.04, "DisplayLink 6.0", and using "Miriway 129-mir2.17.0 (4052)" running yambar bumps the miriway-shell CPU usage from around 3-4% to around 30-40%.