openwrt / mt76

mac80211 driver for MediaTek MT76x0e, MT76x2e, MT7603, MT7615, MT7628 and MT7688

MT7981: MT7915_TX_RING_SIZE > 300 affects performance #902

Closed romanovj closed 1 month ago

romanovj commented 2 months ago

Two MT7981 devices running the same drivers (05-17-2024), without WED, 160 MHz AX, AP <-> STA, iperf3 running on the devices themselves, single stream.

MT7915_TX_RING_SIZE = 2048

STA ---> AP

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  2.88 GBytes   826 Mbits/sec    0             sender
[  5]   0.00-30.01  sec  2.88 GBytes   825 Mbits/sec                  receiver

AP ----> STA

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.01  sec  3.99 GBytes  1.14 Gbits/sec    0             sender
[  5]   0.00-30.00  sec  3.99 GBytes  1.14 Gbits/sec                  receiver

MT7915_TX_RING_SIZE = 256

STA ---> AP

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.01  sec  3.99 GBytes  1.14 Gbits/sec    0             sender
[  5]   0.00-30.00  sec  3.99 GBytes  1.14 Gbits/sec                  receiver

AP ----> STA

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.01  sec  4.51 GBytes  1.29 Gbits/sec    0             sender
[  5]   0.00-30.00  sec  4.51 GBytes  1.29 Gbits/sec                  receiver
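
To put the four runs above side by side, the relative change can be worked out from the sender-side bitrates. This is only a minimal sketch: the figures are copied from the iperf3 summaries in this comment, and the helper name `relative_gain` is made up for illustration.

```python
# Sender-side bitrates from the iperf3 summaries above, in Mbit/s
# (1.14 Gbit/s is written as 1140, 1.29 Gbit/s as 1290).
results = {
    "STA -> AP": {2048: 826, 256: 1140},
    "AP -> STA": {2048: 1140, 256: 1290},
}


def relative_gain(before, after):
    """Return the percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


for direction, by_ring_size in results.items():
    gain = relative_gain(by_ring_size[2048], by_ring_size[256])
    print(f"{direction}: {gain:+.1f}% with ring size 256 vs 2048")
# STA -> AP: +38.0% with ring size 256 vs 2048
# AP -> STA: +13.2% with ring size 256 vs 2048
```

These are single 30-second runs, so the percentages are indicative rather than statistically solid.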

rany2 commented 2 months ago

Confirmed on MT7915E as well. Interesting discovery.

zhaojh329 commented 2 months ago

I ran a speed test and throughput really did improve, mainly in the STA -> AP direction.

lukasz1992 commented 2 months ago

I still wonder why we need such large buffers for TX. Perhaps this affects all Wi-Fi 5 (and newer) devices that this driver handles.

There is one commit with quite interesting description: https://github.com/openwrt/mt76/commit/dbea5151412b224c430b8277bf3372eb48e6a7fc

On the other hand, I wonder about performance when multiple clients are connected; maybe that is where larger ring sizes are needed. From previous commits it looks like increasing the ring size fixed performance under heavy load. Then again, Filogic CPUs have more CPU power than the MT7621; maybe the ring size should be adjusted to CPU performance or even made changeable via a kernel module parameter.
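
One way to get a feel for what "large" means here is to estimate how long a completely full TX ring takes to drain. The sketch below is back-of-the-envelope arithmetic only. It assumes one 1500-byte frame per ring entry, which this thread does not establish for mt76, and it uses romanovj's STA -> AP bitrates as the drain rate. It does not show that queueing is the cause of the difference.

```python
def ring_drain_time_ms(ring_entries, frame_bytes, rate_mbps):
    """Time in milliseconds to transmit a completely full ring at rate_mbps."""
    bits_queued = ring_entries * frame_bytes * 8
    return bits_queued / (rate_mbps * 1e6) * 1e3


# Assumption: each ring entry holds one full-size 1500-byte frame.
FRAME_BYTES = 1500

# (ring size, measured STA -> AP bitrate in Mbit/s) from the first comment.
for entries, rate in ((2048, 826), (256, 1140)):
    ms = ring_drain_time_ms(entries, FRAME_BYTES, rate)
    print(f"{entries} entries at {rate} Mbit/s: about {ms:.1f} ms of queued data")
# 2048 entries at 826 Mbit/s: about 29.8 ms of queued data
# 256 entries at 1140 Mbit/s: about 2.7 ms of queued data
```

Under these assumptions the larger ring can hold roughly ten times as much in-flight data, which is the kind of trade-off a tunable ring size would expose.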

Fail-Safe commented 1 month ago

FWIW, it does seem that the MT7986 is unaffected by this same issue:

Device: GL.iNet GL-MT6000, WED: disabled, 160 MHz AX

MT7915_TX_RING_SIZE = 2048

STA --> AP

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  1.41 GBytes  1.21 Gbits/sec                  sender
[  5]   0.00-10.01  sec  1.41 GBytes  1.21 Gbits/sec                  receiver

AP --> STA

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec  1.33 GBytes  1.14 Gbits/sec    1             sender
[  5]   0.00-10.00  sec  1.32 GBytes  1.14 Gbits/sec                  receiver

romanovj commented 1 month ago

@rany2 I'm also seeing packet drops and lots of retries with the patch 1006-wifi-mt76-mt7915-drop-undefined-action-frame.patch when the client is a 1x1 80 MHz AC device (an Android phone).

rany2 commented 1 month ago

@romanovj I'll see if I can confirm

Fail-Safe commented 1 month ago

@rany2 Were you able to test after removal of the 1006-wifi-mt76-mt7915-drop-undefined-action-frame.patch patch?

rany2 commented 1 month ago

@Fail-Safe I forgot to update the issue but it does indeed reduce the retry count on an 802.11ac Android device (it went from several thousand retries in the first second to 1-2 which is reasonable). I dropped it on my tree but it is worth noting that I didn't have this issue everywhere (for example, with Intel AX201 as client this issue doesn't appear).

romanovj commented 1 month ago

> @Fail-Safe I forgot to update the issue but it does indeed reduce the retry count on an 802.11ac Android device (it went from several thousand retries in the first second to 1-2 which is reasonable). I dropped it on my tree but it is worth noting that I didn't have this issue everywhere (for example, with Intel AX201 as client this issue doesn't appear).

That shouldn't happen, right? Should I open another issue?

romanovj commented 1 month ago

perf diff of MT7915_TX_RING_SIZE 2048 vs 256, with iperf3 running on the device. More traffic -> more CPU usage -> fewer idle calls.

    36.42%    -14.71%  [kernel.kallsyms]     [k] default_idle_call
     6.29%     -5.26%  [mt76]                [k] 0x0000000000000098

How can I find the function at 0x0000000000000098 in mt76? Should I build the module with -O0?
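
perf usually prints a raw address like this when it could not load symbols for the module, so an unstripped module is probably more relevant than the optimisation level. If a sorted symbol listing is available (for example from `nm -n` on the unstripped `mt76.ko`), the lookup is a nearest-preceding-symbol search. The sketch below shows that search in Python. The symbol names and addresses in `SAMPLE` are hypothetical placeholders, not real mt76 symbols, and it assumes 0x98 is an offset into the same text section the listing describes.

```python
import bisect


def parse_nm(nm_output):
    """Parse `nm -n` style lines ("<hex addr> <type> <name>").

    Only text symbols (type t/T) are kept; returns sorted (addr, name) pairs.
    """
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] in ("t", "T"):
            symbols.append((int(parts[0], 16), parts[2]))
    return sorted(symbols)


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol starting at or before addr."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # addr lies before the first known symbol
    start, name = symbols[i]
    return name, addr - start


# Hypothetical listing; real input would come from the unstripped module.
SAMPLE = """\
0000000000000000 t example_helper_a
0000000000000060 T example_tx_func
0000000000000120 T example_rx_func
"""

print(resolve(parse_nm(SAMPLE), 0x98))  # ('example_tx_func', 56)
```

In a relocatable `.ko` the addresses are section-relative, so symbols from different text sections can share the same offset range; the result is a hint to check against the disassembly rather than a definitive answer.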