wwmm / easyeffects


Drops and errors on system load #3224

Open Massimo-B opened 3 days ago

Massimo-B commented 3 days ago

EasyEffects Version

7.1.3

What package are you using?

Gentoo

Distribution

Gentoo 23.0

Describe the bug

Under higher system load I get drops and errors in pw-top when running EasyEffects. This also happens when the CPU is not fully used (about 50%), with Chrome and an MS Teams session with video running. The load average is about 20, with some btrfs workers in the background. The issue usually goes away when I kill EasyEffects. I have also disabled the spectrum for now, as it caused a lot of load too.

Expected Behavior

No response

Debug Log

S   ID  QUANT   RATE    WAIT    BUSY   W/Q   B/Q  ERR FORMAT           NAME
S   29      0      0    ---     ---   ---   ---     0                  Dummy-Driver
S   30      0      0    ---     ---   ---   ---     0                  Freewheel-Driver
R   75   1024  48000  31,9us   0,5us  0,00  0,00    2    S16LE 1 48000 alsa_input.usb-Microsoft_Microsoft___LifeCam_HD-3000-02.mono-fallback
R   92      1     25  19,4us   6,1us  0,00  0,00    6       F32LE 1 25  + PulseAudio Volume Control
R   76   1024  48000  73,5us 105,6us  0,00  0,00  550    S32LE 2 48000 alsa_output.pci-0000_00_1b.0.analog-stereo
R  124      1     25  28,9us  28,9us  0,00  0,00    8       F32LE 1 25  + PulseAudio Volume Control
R   77   1024  48000  65,8us   0,9us  0,00  0,00    2    S32LE 2 48000 alsa_input.pci-0000_00_1b.0.analog-stereo
R  128      1     25  34,3us  22,2us  0,00  0,00   10       F32LE 1 25  + PulseAudio Volume Control
I  269      0      0   0,0us   0,0us  ???   ???     0                  ee_test_signals
R   74   1024  48000  15,2ms   1,6us  0,71  0,00  169    S16LE 2 48000 alsa_input.usb-Samson_Technologies_Samson_C01U-00.analog-stereo
R   35      0      0   8,3us  12,7us  0,00  0,00    5     F32P 2 48000  + easyeffects_sink
R   36      0      0  33,9us  13,1us  0,00  0,00  202     F32P 2 48000  + easyeffects_source
R   33      0      0   4,5us 260,4us  0,00  0,01  8772                   + ee_soe_output_level
R   41      0      0   5,1ms   1,4ms  0,24  0,07  12328                   + ee_soe_spectrum
R   52      0      0   3,1us 263,0us  0,00  0,01  6256                   + ee_sie_output_level
R   57      0      0   3,1us   1,3ms  0,00  0,06  13722                   + ee_sie_spectrum
R   89      1     25  78,2us  20,3us  0,00  0,00    7       F32LE 1 25  + PulseAudio Volume Control
R   93      1     25  60,3us  12,0us  0,00  0,00  128       F32LE 1 25  + PulseAudio Volume Control
R  239      1     25  87,5us   8,6us  0,00  0,00    5       F32LE 1 25  + PulseAudio Volume Control
R   95      0      0   9,3us  52,4us  0,00  0,00   58    S32LE 2 48000  + alsa_output.usb-GuangZhou_FiiO_Electronics_Co._Ltd_FiiO_K5_Pro-00.analog-stereo
R  147      1     25  28,7us   9,2us  0,00  0,00   54       F32LE 1 25  + PulseAudio Volume Control
R  212      1     25  53,0us  11,4us  0,00  0,00    6       F32LE 1 25  + PulseAudio Volume Control
R  177      0      0  34,3us   5,2ms  0,00  0,24  2341                   + ee_sie_rnnoise
R  218      0      0  15,6us  12,2us  0,00  0,00  2090                   + ee_sie_echo_canceller
R  130      0      0   3,3us   7,4us  0,00  0,00  796                   + ee_sie_speex
R  185      0      0   4,0us   9,2us  0,00  0,00  2470                   + ee_sie_filter
R  138      0      0   2,8us 420,4us  0,00  0,02  1458                   + ee_sie_bass_enhancer
R  120      0      0   3,5us   6,5us  0,00  0,00  666                   + ee_sie_maximizer
R  207      0      0   3,3us   5,2ms  0,00  0,25  5255                   + ee_sie_crystalizer
R   78      0      0   4,8us 453,7us  0,00  0,02  2286                   + ee_sie_exciter
R  188      0      0   3,4us 458,1us  0,00  0,02  1202                   + ee_sie_stereo_tools
R  168      0      0   4,3us   7,4us  0,00  0,00  167                   + ee_sie_delay
R  267    512  48000  30,9us  18,2us  0,00  0,00    1    F32LE 2 48000  + Google
S   ID  QUANT   RATE    WAIT    BUSY   W/Q   B/Q  ERR FORMAT           NAME
S   29      0      0    ---     ---   ---   ---     0                  Dummy-Driver
S   30      0      0    ---     ---   ---   ---     0                  Freewheel-Driver
R   75    512  48000 106,9us   0,7us  0,01  0,00   19    S16LE 1 48000 alsa_input.usb-Microsoft_Microsoft___LifeCam_HD-3000-02.mono-fallback
R   92      1     25  70,0us   2,7us  0,01  0,00  154       F32LE 1 25  + PulseAudio Volume Control
R   77    512  48000  57,8us   0,7us  0,01  0,00   20    S32LE 2 48000 alsa_input.pci-0000_00_1b.0.analog-stereo
R  128      1     25  28,0us   9,0us  0,00  0,00  222       F32LE 1 25  + PulseAudio Volume Control
R   74    512  48000   5,6ms   1,1us  0,53  0,00  7575    S16LE 2 48000 alsa_input.usb-Samson_Technologies_Samson_C01U-00.analog-stereo
R   76      0      0  26,4us  60,6us  0,00  0,01  5553    S32LE 2 48000  + alsa_output.pci-0000_00_1b.0.analog-stereo
R  124      1     25  25,5us  19,4us  0,00  0,00  119       F32LE 1 25  + PulseAudio Volume Control
R  212      1     25  41,4us   5,2us  0,00  0,00  150       F32LE 1 25  + PulseAudio Volume Control
R  298      0      0  15,1us   4,2us  0,00  0,00   25     F32P 2 48000  + easyeffects_sink
R  123      0      0  23,1us  13,4us  0,00  0,00  5473     F32P 2 48000  + easyeffects_source
R   72      0      0   3,5us 128,6us  0,00  0,01  2022                   + ee_soe_output_level
R  127      0      0   1,4ms  10,5us  0,13  0,00  3687                   + ee_soe_spectrum
R  234      0      0   3,7us 129,4us  0,00  0,01  11492                   + ee_sie_output_level
R  136      0      0   4,4us   7,2us  0,00  0,00  1358                   + ee_sie_spectrum
R   57      1     25   5,6us   3,4us  0,00  0,00   28       F32LE 1 25  + PulseAudio Volume Control
R  113      1     25  16,9us   7,9us  0,00  0,00  502       F32LE 1 25  + PulseAudio Volume Control
R  226      0      0  41,4us   1,4ms  0,00  0,13  7664                   + ee_sie_rnnoise
R  209      0      0  24,2us  13,1us  0,00  0,00  1717                   + ee_sie_echo_canceller
R  155      0      0   5,0us   7,4us  0,00  0,00  681                   + ee_sie_speex
R  227      0      0   4,2us   6,8us  0,00  0,00  407                   + ee_sie_filter
R  224      0      0   3,3us 224,3us  0,00  0,02  2283                   + ee_sie_bass_enhancer
R  211      0      0   3,1us   6,2us  0,00  0,00  180                   + ee_sie_maximizer
R  343      0      0   3,4us   2,9ms  0,00  0,27  32625                   + ee_sie_crystalizer
R  170      0      0   6,0us 289,1us  0,00  0,03  15473                   + ee_sie_exciter
R  114      0      0   4,7us 197,4us  0,00  0,02  13265                   + ee_sie_stereo_tools
R   52      0      0  10,1us   7,7us  0,00  0,00  1401                   + ee_sie_delay
R  379    480  48000  25,9us  23,7us  0,00  0,00  175    S16LE 2 48000  + Google Chrome input
R  255    512  48000  47,8us   7,6us  0,00  0,00   35    F32LE 2 48000  + Google Chrome
R  363      1     25   4,0us   2,9us  0,00  0,00    5       F32LE 1 25  + PulseAudio Volume Control
S  259      0      0    ---     ---   ---   ---     0                  ee_test_signals

Additional Information

$ chrt -a -p `pidof easyeffects`
pid 18254's current scheduling policy: SCHED_OTHER
pid 18254's current scheduling priority: 0
pid 18261's current scheduling policy: SCHED_OTHER
pid 18261's current scheduling priority: 0
pid 18262's current scheduling policy: SCHED_OTHER
pid 18262's current scheduling priority: 0
pid 18263's current scheduling policy: SCHED_OTHER
pid 18263's current scheduling priority: 0
pid 18264's current scheduling policy: SCHED_OTHER
pid 18264's current scheduling priority: 0
pid 18265's current scheduling policy: SCHED_OTHER
pid 18265's current scheduling priority: 0
pid 18267's current scheduling policy: SCHED_BATCH
pid 18267's current scheduling priority: 0
pid 18268's current scheduling policy: SCHED_OTHER
pid 18268's current scheduling priority: 0
pid 18270's current scheduling policy: SCHED_OTHER
pid 18270's current scheduling priority: 0
pid 18271's current scheduling policy: SCHED_FIFO|SCHED_RESET_ON_FORK
pid 18271's current scheduling priority: 83
pid 18413's current scheduling policy: SCHED_OTHER
pid 18413's current scheduling priority: 0
pid 18432's current scheduling policy: SCHED_OTHER
pid 18432's current scheduling priority: 0
pid 18506's current scheduling policy: SCHED_OTHER
pid 18506's current scheduling priority: 0
pid 18799's current scheduling policy: SCHED_OTHER
pid 18799's current scheduling priority: 0
pid 18895's current scheduling policy: SCHED_OTHER
pid 18895's current scheduling priority: 0

Massimo-B commented 3 days ago

For higher loads I tried increasing the quantum. That sometimes helped in the past, at the cost of higher latency, but with EasyEffects it seems to get worse after I tinker with the quantum, for example with pw-metadata -n settings 0 clock.force-quantum 1024 or pw-metadata -n settings 0 clock.force-quantum 2048.
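
To make the latency trade-off concrete, here is a small sketch of the arithmetic behind the quantum: one graph cycle lasts quantum / rate seconds, and the B/Q column in pw-top is a node's BUSY time divided by that period. The function names are made up for illustration, and the two BUSY values are taken from the ee_sie_crystalizer rows in the logs above, assuming those nodes ran under the 1024-frame and 512-frame drivers shown there.

```python
def quantum_period_ms(quantum, rate):
    """Duration of one graph cycle in milliseconds."""
    return 1000.0 * quantum / rate


def busy_ratio(busy_ms, quantum, rate):
    """Fraction of one cycle a node spends processing (B/Q in pw-top)."""
    return busy_ms / quantum_period_ms(quantum, rate)


# Cycle length for the quantum values discussed in this thread.
for q in (512, 1024, 2048):
    print(f"quantum {q:>4} @ 48000 Hz -> {quantum_period_ms(q, 48000):.2f} ms per cycle")

# BUSY of 5.2 ms at quantum 1024, and 2.9 ms at quantum 512.
print(f"{busy_ratio(5.2, 1024, 48000):.2f}")
print(f"{busy_ratio(2.9, 512, 48000):.2f}")
```

A larger quantum gives every node more time per cycle, but each cycle also adds that much latency, so doubling from 1024 to 2048 frames moves the cycle from about 21 ms to about 43 ms.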

wwmm commented 2 days ago

The error column in pw-top indicates xrun errors. In most cases an xrun means that the sound card is not receiving audio buffers as fast as it needs them. In our case this means that, for some reason, PipeWire is not able to deliver them in time. PipeWire/PulseAudio apps send buffers to the server and not directly to the sound card. So for some reason the additional load in PipeWire's realtime thread is too much for it to handle on your system.

There isn't really much that can be done from the client side. The obvious solution would be to make the plugin code super fast. But most of the plugins we use come from third-party projects, and they are probably already as optimized as they can be.

Something to try would be changing some server configuration options, like the ALSA headroom, or changing kernel versions. This has helped some users in the past. It is also worth identifying which plugins contribute the most to the xrun errors and avoiding them on this machine.
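
One way to see which plugins contribute most is to sort a saved pw-top snapshot by its ERR column. The sketch below is only an illustration: the helper names are hypothetical, the column layout is assumed to match the log posted in this issue, and the sample rows are copied from that log. Keep in mind that ERR is a running count, so a node that has existed longer can show more errors.

```python
import re

# Sample formats seen in the log: S16LE, S32LE, F32LE, F32P.
FORMAT_RE = re.compile(r"^[SUF]\d+(LE|BE|P)?$")


def parse_row(line):
    """Return (name, err) for one pw-top row, or None for headers and blanks."""
    tokens = line.split()
    if len(tokens) < 10 or tokens[1] == "ID":
        return None
    try:
        err = int(tokens[8])  # ERR is the ninth column
    except ValueError:
        return None
    rest = tokens[9:]
    # Drop the optional FORMAT triple, e.g. "S16LE 2 48000".
    if len(rest) > 3 and FORMAT_RE.match(rest[0]) and rest[1].isdigit() and rest[2].isdigit():
        rest = rest[3:]
    # Follower nodes are listed with a leading "+".
    if rest and rest[0] == "+":
        rest = rest[1:]
    return " ".join(rest), err


def rank_by_errors(text, prefix="ee_"):
    """List nodes whose name starts with prefix, highest ERR first."""
    rows = [row for row in map(parse_row, text.splitlines()) if row]
    rows = [(name, err) for name, err in rows if name.startswith(prefix)]
    return sorted(rows, key=lambda row: row[1], reverse=True)


SAMPLE = """\
S   ID  QUANT   RATE    WAIT    BUSY   W/Q   B/Q  ERR FORMAT           NAME
R   74   1024  48000  15,2ms   1,6us  0,71  0,00  169    S16LE 2 48000 alsa_input.usb-Samson_Technologies_Samson_C01U-00.analog-stereo
R   36      0      0  33,9us  13,1us  0,00  0,00  202     F32P 2 48000  + easyeffects_source
R   41      0      0   5,1ms   1,4ms  0,24  0,07  12328                   + ee_soe_spectrum
R   57      0      0   3,1us   1,3ms  0,00  0,06  13722                   + ee_sie_spectrum
R  207      0      0   3,3us   5,2ms  0,00  0,25  5255                   + ee_sie_crystalizer
R  168      0      0   4,3us   7,4us  0,00  0,00  167                   + ee_sie_delay
"""

for name, err in rank_by_errors(SAMPLE):
    print(f"{err:>6}  {name}")
```

On these sample rows the two spectrum nodes come out on top, followed by the crystalizer.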

As xrun errors depend strongly on the hardware, it is hard to suggest a direct solution. For example, on my current desktop, even with all the cores of my Ryzen 7700 under a 100% stress test, I have zero errors in pw-top while watching YouTube videos in Firefox.

wwmm commented 2 days ago

Something I have been doing for years is booting with the kernel option threadirqs. Maybe this is helping somehow to play audio at higher loads. Does it make any difference?

tleb commented 2 days ago

Hi!

There isn't really much that can be done from the client side. The obvious solution would be to make the plugins code super fast. But besides the fact most of the ones we use come from third party projects they are probably already as optimized as they can be.

I've looked at the spectrum code, as that is the plugin with the most xruns and it is implemented by EasyEffects itself.

What would you think about moving most of the processing to outside the realtime thread? Currently it does the DFT inside the RT thread and hands off a mono double buffer when done.

The idea is that the realtime thread would only compute the left+right average, append it to the end of a buffer and be done. When the spectrum needs to be rendered, the buffer is copied and processed by the thread doing the rendering.

Note: one issue with the current approach is that the DFT is done per process event. This might be much more frequent than needed if the PW quantum is really small. In that case, we would reduce not only the work done in the RT thread but also the overall CPU load, by skipping DFTs that are never displayed.
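
The proposal above can be sketched roughly as follows. This is a Python illustration of the data flow only, not EasyEffects code: the class and function names are hypothetical, the DFT is a naive one written for clarity, and thread safety is ignored (a real C++ implementation would need a lock-free ring buffer or similar between the two threads).

```python
import cmath
import math
from collections import deque


class SpectrumTap:
    """Hypothetical helper: the audio callback only averages and stores."""

    def __init__(self, size):
        # Bounded buffer: the oldest samples fall off automatically.
        self.ring = deque(maxlen=size)

    def process(self, left, right):
        # Cheap per-cycle work: average the channels, no DFT here.
        self.ring.extend(0.5 * (l + r) for l, r in zip(left, right))

    def snapshot(self):
        # Called by the rendering side: copy first, then work on the copy.
        return list(self.ring)


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..n/2, normalized by n."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


tap = SpectrumTap(size=16)

# Four audio cycles of 8 frames each; the tone has 2 periods per 16 frames.
for cycle in range(4):
    frames = [math.cos(2 * math.pi * 2 * t / 16) for t in range(cycle * 8, cycle * 8 + 8)]
    tap.process(frames, frames)

# A single DFT at render time, however many cycles ran in between.
spectrum = magnitude_spectrum(tap.snapshot())
peak_bin = spectrum.index(max(spectrum))
print(peak_bin)
```

In this arrangement the cost of the transform scales with how often the spectrum is drawn rather than with how often process() runs.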

Do you have any thoughts on that? I'll be able to work on that.

A tangent: how expensive are the util::idle_add() calls? There is one at the end of setup() (called when sample rate or quantum changes) and one at the end of process(). I see an allocation then a call to g_idle_add(). That call, I do not know about.

wwmm commented 2 days ago

What would you think about moving most of the processing to outside the realtime thread?

I have now updated our master branch, moving the FFT call to the main thread. Let's see if this helps weaker processors handle the extra load. The thing is that all this load was already avoided when the window was hidden. So unless all the people having xruns always keep the EE window open, I do not expect much change.

A tangent: how expensive are the util::idle_add() calls? There is one at the end of setup() (called when sample rate or quantum changes) and one at the end of process(). I see an allocation then a call to g_idle_add(). That call, I do not know about.

g_idle_add schedules the execution of a function in glib/gtk main thread.

tleb commented 2 days ago

Ah, nice! Thanks.

g_idle_add schedules the execution of a function in glib/gtk main thread.

This sounds like something that requires synchronization and might even be blocking. Nothing great in the audio processing codepath. I see it gets used by most plugins to export results.

wwmm commented 2 days ago

Ah, nice! Thanks.

g_idle_add schedules the execution of a function in glib/gtk main thread.

This sounds like something that requires synchronization and might even be blocking. Nothing great in the audio processing codepath. I see it gets used by most plugins to export results.

It does not require blocking or synchronization. And we must use it because of the usual requirement from graphical toolkits about not having other threads messing with widgets. Sooner or later a move to the main thread will have to be done.

tleb commented 2 days ago

It does not require blocking or synchronization.

The underlying implementation is idle_add_full (code). I'm counting two malloc calls in idle_source_new, one in g_source_set_callback and a mutex lock in g_source_attach.

I'm wondering if putting data in the plugin and letting the main thread access it, without creating a new idle task on each process() call, wouldn't be more efficient? I'd be down to attempt a proof-of-concept if you want.
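
The two approaches being compared can be modelled with a toy example. Nothing here uses GLib: PerCallPosting only stands in for "schedule one main-thread task per process() call", LatestValue stands in for "leave the data in the plugin and let the main thread read it", and both class names are invented for this sketch. In C++ the shared slot would additionally need an atomic or a lock, which this model glosses over.

```python
import queue


class PerCallPosting:
    """Toy stand-in for scheduling one main-thread task per process() call."""

    def __init__(self):
        self.tasks = queue.SimpleQueue()

    def process(self, level):
        # One queued item per audio cycle.
        self.tasks.put(level)

    def drain(self):
        # Main-thread side: handle everything that was posted.
        handled = []
        while not self.tasks.empty():
            handled.append(self.tasks.get_nowait())
        return handled


class LatestValue:
    """Toy stand-in for keeping the newest result inside the plugin."""

    def __init__(self):
        self.level = None

    def process(self, level):
        # Overwrite in place, nothing is queued.
        self.level = level

    def poll(self):
        # Main-thread side: read whenever the UI wants to refresh.
        return self.level


posting, latest = PerCallPosting(), LatestValue()
for cycle in range(100):
    level = -60.0 + cycle * 0.5
    posting.process(level)
    latest.process(level)

handled = posting.drain()
print(len(handled), handled[-1])
print(latest.poll())
```

The first variant delivers every intermediate value at the cost of one queued task per cycle; the second does less work on the audio side but makes the reader responsible for polling and only ever sees the newest value.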

wwmm commented 2 days ago

I'm wondering if putting data in the plugin and letting the main thread access it, without creating a new idle task on each process() call, wouldn't be more efficient? I'd be down to attempt a proof-of-concept if you want.

You would have to rewrite considerable amounts of EE code, because what you propose is the opposite of what is done everywhere else in EasyEffects. Besides that, I am skeptical that the performance gains would be worth such a radical change. Looking at the perf top output, the calls to g_idle_add are not a bottleneck.

And like I said before, when the window is hidden (EE in the background) none of this code is running. But some machines will still have xruns. So it is unlikely that the calls to g_idle_add are the source of the problem.

wwmm commented 2 days ago

I'm wondering if putting data in the plugin and letting the main thread access it, without creating a new idle task on each process() call, wouldn't be more efficient?

This would probably require putting the main thread in some kind of polling mode, possibly with our own mutex objects, because we would then have to worry about what is being done to the data structures inside the plugins. It does not seem very appealing.