raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

kworker/u9:0-brcmf_wq/mmc1:0001:1 High CPU on WIFI usage. #5934

Open fableman73 opened 8 months ago

fableman73 commented 8 months ago

Describe the bug

Raspberry Pi Zero 2 W, 64-bit. Setting up SMB and transferring a large file results in 20-40% CPU load from kworker/u9:0-brcmf_wq/mmc1:0001:1.

Steps to reproduce the behaviour

Using WIFI makes kworker/u9:0-brcmf_wq/mmc1:0001:1 use a lot of CPU

Device(s)

Raspberry Pi Zero 2 W

System

Raspberry Pi reference 2023-12-05
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, e484aa85818428c3af08de0ba1db5329dfc84143, stage2

Oct 17 2023 15:42:39
Copyright (c) 2012 Broadcom
version 30f0c5e4d076da3ab4f341d88e7d505760b93ad7 (clean) (release) (start)

Linux raspberrypi 6.1.74-v8+ #1725 SMP PREEMPT Mon Jan 22 13:35:32 GMT 2024 aarch64 GNU/Linux

Logs

No response

Additional context

No response

AnMakc commented 8 months ago

Same issue with a Raspberry Pi 4B.

$ cat /etc/rpi-issue
Raspberry Pi reference 2023-10-10
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 962bf483c8f326405794827cce8c0313fd5880a8, stage2

$ vcgencmd version
Aug 10 2023 15:33:38 
Copyright (c) 2012 Broadcom
version 03dc77429335caee083e22ddc8eec09c07f12a7a (clean) (release) (start)

$ uname -a
Linux hass 6.1.0-rpi4-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.54-1+rpt2 (2023-10-05) aarch64 GNU/Linux

There is a huge number of interrupts/sec visible in /proc/interrupts during wifi activity:

 37:   11525120          0          0          0     GICv2 158 Level     mmc1, mmc0

It averages ~6000 interrupts/sec while downloading a file over WiFi at ~8MB/s to /dev/null, and only ~100 interrupts/sec with the download stopped and everything else unchanged.
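For anyone who wants to repeat this measurement, a rough sketch follows. It assumes the WiFi SDIO interrupt appears on a /proc/interrupts line containing mmc1, as in the output above; `irq_count` and `irq_rate` are hypothetical helper names, not existing tools.

```shell
# Sum the per-CPU counters on the first /proc/interrupts line that mentions mmc1.
# The optional argument (a file path) is only there so the parser can be tried
# on a saved copy of /proc/interrupts. Prints 0 if no matching line is found.
irq_count() {
    awk '/mmc1/ {
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) sum += $i
             exit
         }
         END { printf "%.0f\n", sum }' "${1:-/proc/interrupts}"
}

# Average interrupts per second over an interval in seconds (default 5).
irq_rate() {
    interval=${1:-5}
    before=$(irq_count)
    sleep "$interval"
    after=$(irq_count)
    echo $(( (after - before) / interval ))
}

# Usage while a download is running:
#   irq_rate 10
```

Running `irq_rate` once during a transfer and once while idle gives the two figures being compared here.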

pelwell commented 8 months ago

8MB/s is ~5300 1.5KB packets per second, so 6000 interrupts sounds reasonable.
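The estimate can be checked with shell arithmetic. This is only a sketch: it assumes 8MB/s means 8,000,000 bytes per second and that every packet is a full 1500 bytes. `packets_per_sec` is a hypothetical helper name.

```shell
# Hypothetical helper: packets per second for a given throughput and packet size.
# Assumes every packet is full-sized, so the real packet rate may differ.
packets_per_sec() {
    bytes_per_sec=$1
    packet_bytes=$2
    echo $(( bytes_per_sec / packet_bytes ))
}

packets_per_sec 8000000 1500   # prints 5333
```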

AnMakc commented 8 months ago

> 8MB/s is ~5300 1.5KB packets per second, so 6000 interrupts sounds reasonable.

Makes sense. But there is still high CPU load by kworker/u9:3+brcmf_wq/mmc1:0001:1 which does not look quite right.

fableman73 commented 8 months ago

> > 8MB/s is ~5300 1.5KB packets per second, so 6000 interrupts sounds reasonable.
>
> Makes sense. But there is still high CPU load by kworker/u9:3+brcmf_wq/mmc1:0001:1 which does not look quite right.

Agreed, kworker/u9:0-brcmf_wq/mmc1:0001:1 should not behave like this because of WiFi activity. Some I/O handling must be wrong. For me this is crucial to fix: the Zero 2 W gets very hot and very slow to use during a normal WiFi file transfer.

Someone posted this, and I am not sure if it's related:

`Turns out the broadcom driver just doesn't work very well or something, the main issue is the host-wake interrupt. It generates millions of interrupts, however this dts parameter is optional.

So by removing

    // interrupt-parent = <&gpio1>;
    // interrupts = <13 IRQ_TYPE_LEVEL_LOW>; /* WL HOST WAKE COM pin 36*/
    // interrupt-names = "host-wake";

these fields from the device tree the cpu load disappears. And Wi-Fi seems to work fine afterwards.`

I don't know how to test this or where to look.

pelwell commented 8 months ago

That's a red herring: the Zero 2 W doesn't use the BT_HOST_WAKE signal (it's not connected), and the DTS doesn't declare the interrupt source. The interrupts which are being detected are regular SDIO interrupts.

fableman73 commented 8 months ago

> That's a red herring: the Zero 2 W doesn't use the BT_HOST_WAKE signal (it's not connected), and the DTS doesn't declare the interrupt source. The interrupts which are being detected are regular SDIO interrupts.

Thanks for clearing that up.

Then I read this proposed solution to the problem: "I remove cap-sdio-irq in dts and fix cpu usage from 20% to 0.5%"

Could that be relevant?

pelwell commented 8 months ago

Not directly, no. Like many SDIO/MMC drivers, bcm2835-mmc sets MMC_CAP_SDIO_IRQ regardless. It is likely that disabling interrupt usage would reduce the system load, but the question is whether it will cause other problems - perhaps increased latency, perhaps worse.

fableman73 commented 8 months ago

But that last suggestion took the load from 20% down to 0.5%, which sounds good at least.

pelwell commented 8 months ago

Running iperf while running top and pressing '1' shows the CPU usage breakdown by core. I think the top-level %CPU figure is in terms of a single core, not all 4, so the figure looks larger than it really is. You can confirm this by starting some CPU hogs:

pi@raspberrypi:~$ while true; do true; done &
[1] 960
pi@raspberrypi:~$ while true; do true; done &
[2] 961
pi@raspberrypi:~$ while true; do true; done &
[3] 962
pi@raspberrypi:~$ while true; do true; done &
[4] 963

Each one of these jobs will take 100% of one CPU, and with all 4 running it looks like this:

  966 pi        20   0    8580   2160   1280 R 100.0   0.5   0:17.98 bash                                 
  967 pi        20   0    8580   2160   1280 R 100.0   0.5   0:17.63 bash                                 
  965 pi        20   0    8580   2160   1280 R  99.7   0.5   0:18.39 bash                                 
  968 pi        20   0    8580   2160   1280 R  99.7   0.5   0:17.23 bash                                 

So you actually have 400% available. (you can shut them down with: kill %1 %2 %3 %4)
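To put a per-process figure from top into whole-machine terms, a small sketch: it assumes top is reporting %CPU relative to one core, as described above, and `share_of_total` is a hypothetical helper name.

```shell
# Hypothetical helper: convert a top %CPU figure (percent of one core)
# into a percentage of the whole machine.
share_of_total() {
    awk -v pct="$1" -v cores="$2" 'BEGIN { printf "%.1f\n", pct / cores }'
}

share_of_total 40 4    # 40% of one core on a 4-core Pi: prints 10.0
# On the Pi itself, the core count can come from nproc:
#   share_of_total 40 "$(nproc)"
```

By that reckoning, the 20-40% reported for the kworker thread would be roughly 5-10% of a 4-core machine.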

Fundamentally this kworker thread is just handling the interrupt load caused by such a large number of packets, and that is not going to change.

fableman73 commented 7 months ago

Lots of interrupts don't have to generate high CPU load. This feels like very bad interrupt handling or bad coding.

Someone even removed this interrupt handling from the kernel, compiled their own, and reduced the load to 0.5%.

So someone really needs to look into this. As it stands, this is an RPi killer. Just imagine how easy it is to DDoS anything built with a Raspberry Pi: the CPU will overload. That is not good design, coding, or I/O handling. Even small IoT boards handle this better.


AnMakc commented 7 months ago

It's important to note that there is no excess CPU load on the same system if a wired connection is used. It easily handles 10x the throughput with no visible interrupt handling load.

Doesn't that mean there is some inefficiency or bug in the driver implementation or configuration?

In my case it not only creates CPU load but also makes the system unresponsive (1-5s terminal lag when using ssh).

fableman73 commented 7 months ago

I have the same lag using ssh, the CPU gets very warm, and everything feels so slow.

fableman73 commented 7 months ago

I have a Linux-based router from Teltonika. I ran a speed test from my laptop and pushed 380 Mbit, and I noticed a small peak of 25% for a short duration for the process [napi/wwan%d-11] (the PCIe driver for the MediaTek M.2 modem).

I was connected to the router over 5GHz WiFi, and the router was connected to an LTE/5G network. So that was 380 Mbit on WiFi and 380 Mbit on LTE/5G at the same time.

That gave less CPU load and no lag at all over SSH, even with the hardware pushing 380 Mbit through two interfaces, and I didn't see any crazy number of interrupts. I understand it's made for network traffic, but it shows that the RPi must be doing something wrong when handling WiFi traffic.

pelwell commented 7 months ago

You can't seriously compare a PCIe-attached PC-class WiFi controller with one on the end of a 4-bit-wide SDIO bus.

pelwell commented 7 months ago

Do you want to try again with a better tone?

AnMakc commented 7 months ago

@pelwell Did I understand correctly that this is considered expected behaviour? So it inevitably affects everyone who uses WiFi on at least some RPi models, including the 4 and Zero 2 W?

pelwell commented 7 months ago

Yes - all Pis prior to the Pi 5 use a non-bus-mastering MMC interface to control the WiFi device over a 4-bit SDIO interface, clocked no faster than 50MHz (single data item per cycle, not double). Pi 5 improves on this with double data rate support and a more capable SDIO controller that offloads the main CPU.
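As a back-of-the-envelope illustration of those numbers, the raw ceiling of such a bus can be worked out as width times clock times transfers per cycle. This sketch ignores command and protocol overhead, so real throughput is lower; `sdio_peak_mbit` is a hypothetical helper name.

```shell
# Hypothetical helper: raw bus ceiling in Mbit/s.
# Arguments: bus width in bits, clock in MHz, transfers per clock cycle (default 1).
sdio_peak_mbit() {
    echo $(( $1 * $2 * ${3:-1} ))
}

sdio_peak_mbit 4 50      # single data rate: prints 200 (about 25 MB/s)
# Illustration only: the same width and clock with two transfers per cycle.
# The Pi 5 clock is not stated here, so this is not a Pi 5 figure.
sdio_peak_mbit 4 50 2    # prints 400
```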

fableman73 commented 7 months ago

> Yes - all Pis prior to the Pi 5 use a non-bus-mastering MMC interface to control the WiFi device over a 4-bit SDIO interface, clocked no faster than 50MHz (single data item per cycle, not double). Pi 5 improves on this with double data rate support and a more capable SDIO controller that offloads the main CPU.

So this solution is fake then? "remove cap-sdio-irq in dts and fix cpu usage from 20% to 0.5%"

Could anyone please at least test it? (I don't know how.)

pelwell commented 7 months ago

> remove cap-sdio-irq in dts and fix cpu usage from 20% to 0.5%

You cannot remove that which isn't there. This may do something useful on other platforms, but not on a Raspberry Pi.
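One way to check this on a running system is to search the live device tree, where each property appears as a file. A minimal sketch, assuming the tree is exposed at /proc/device-tree; `has_dt_prop` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if any node under the device tree root
# has a property with the given name. The root argument is optional.
# -H makes find follow the root path if it is a symlink.
has_dt_prop() {
    find -H "${2:-/proc/device-tree}" -name "$1" 2>/dev/null | grep -q .
}

if has_dt_prop cap-sdio-irq; then
    echo "cap-sdio-irq is present"
else
    echo "cap-sdio-irq is not present"
fi
```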

fableman73 commented 7 months ago

> You cannot remove that which isn't there. This may do something useful on other platforms, but not on a Raspberry Pi.

Okay, strange. The topic was about the RPi and high CPU when using WiFi.