openwrt / openwrt

This repository is a mirror of https://git.openwrt.org/openwrt/openwrt.git. It is for reference only and is not active for check-ins. We will continue to accept Pull Requests here. They will be merged via staging trees and then into openwrt.git.

Netgear R7800 performance drop in 23.05.0-rc1 #12920

Open robodogspok opened 1 year ago

robodogspok commented 1 year ago

Describe the bug

This is a copy of a post from the OpenWrt forum:

Link to the forum post: https://forum.openwrt.org/t/netgear-r7800-performance-drop-in-23-05-0-rc1/162676/1

I installed 23.05.0-rc1 today on my R7800. Network throughput for ethernet-connected LAN devices dropped substantially. I reinstalled 22.03.5 and performance went back to normal. ISP is Verizon Fios (Boston MA area), provisioned at 300Mbps symmetric.

From https://www.speakeasy.net/speedtest/ results, a sample below. I see similar degradation with the Ookla Speedtest app, and with the bufferbloat test at https://www.waveform.com/tools/bufferbloat.

Switching back and forth between 22.03.5 and 23.05.0-rc1 consistently shows 23.05.0-rc1 with a throughput loss.

------------------------------ // ----------------------------

It seems there is a performance-related regression. I did not find a bug report on GitHub, which is why I am filing this one.

There was also a message suggesting that it is likely related to the CPU cache frequency scaling changes in kernel 5.15 and 6.1.

OpenWrt version

r23069-e2701e0f33

OpenWrt target/subtarget

ipq806x/generic

Device

Netgear Nighthawk X4S R7800

Image kind

Official downloaded image

Steps to reproduce

After installing the new version, network performance is much worse: throughput for ethernet-connected LAN devices dropped substantially.

Actual behaviour

No response

Expected behaviour

Performance was expected to be normal, as in OpenWrt 22.03.5.

Additional info

It is likely related to the CPU cache frequency scaling changes in kernel 5.15 and 6.1.

Diffconfig

No response


brada4 commented 1 year ago

Can you repeat your tests with 'performance' and 'powersave' governors?

brada4 commented 1 year ago

Also install irqbalance. It may help if the network card has multiple processing queues attached to multiple interrupts and some other kernel change no longer balances the IRQs; irqbalance would move some of them to the 2nd CPU.
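To check whether IRQs really are piling up on one CPU before and after installing irqbalance, something like the following could be used. It is only a rough sketch, assuming Python is available on the device or that the contents of /proc/interrupts are copied to another machine. The function names and the 95% / 1000-interrupt thresholds are arbitrary choices for illustration, not something taken from irqbalance.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (cpu_names, rows).

    Each row is (irq_name, per_cpu_counts, description).
    Summary lines without one counter per CPU (e.g. "Err:") are skipped.
    """
    lines = text.strip().splitlines()
    if not lines:
        return [], []
    cpus = lines[0].split()  # header line: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:len(cpus)]
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        label = " ".join(fields[len(cpus):])
        rows.append((irq.strip(), [int(c) for c in counts], label))
    return cpus, rows


def unbalanced(rows, threshold=0.95, min_total=1000):
    """Return busy IRQs where one CPU handled more than `threshold` of the load."""
    result = []
    for irq, counts, label in rows:
        total = sum(counts)
        if total >= min_total and max(counts) / total > threshold:
            result.append((irq, label, counts))
    return result


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        _, rows = parse_interrupts(f.read())
    for irq, label, counts in unbalanced(rows):
        print(irq, label, counts)
```

If the ethernet IRQs show up in this list while running an iperf3 test, that would at least be consistent with the IRQ balancing theory.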

motolav commented 1 year ago

https://github.com/openwrt/openwrt/commit/6f9495b896f53172855cd015ac4024f6b7758e0a is in the 23.05 tree, so rc2 should have improved performance at the cost of heat and power usage.

motolav commented 1 year ago

If you want to test for yourself you can grab a 23.05-snapshot here, https://downloads.openwrt.org/releases/23.05-SNAPSHOT/targets/

Luckyparty commented 1 year ago

Tested with latest 23.05 snapshot of today. No packages installed except iperf3. Getting only 250 MBit/s in upstream direction. Downstream looks good (930 MBit/s). I have two R7800. One is still running 19.07.10. I did the same test with this router again and I am getting 930 MBit/s in both directions. So there is still a performance issue with 23.05. All routers are in the same network. The iperf server to test with is an Archer C2600 (ipq806x). Also tested with a Windows iperf server but the result is nearly identical.

Went back to 22.03.5 and did the test again. 930 MBit/s in both directions. So the problem was clearly introduced between kernel 5.10.176 and 5.15.
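For anyone who wants to repeat this comparison across firmware versions, the test can be scripted roughly as below. This is a sketch under assumptions: it runs on a LAN client with iperf3 and Python installed, the server address 192.168.1.2 is a placeholder, and the JSON field names are the ones iperf3 emits for a TCP test with `-J`. Which direction counts as "upstream" depends on which side runs the client, so both directions are measured.

```python
import json
import subprocess


def run_iperf3(server, reverse=False, seconds=10):
    """Run an iperf3 TCP test against `server` and return the parsed JSON report."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")  # server sends, client receives
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return json.loads(out)


def mbits(result):
    """Return (sent, received) throughput in Mbit/s from an iperf3 JSON report."""
    end = result["end"]
    sent = end["sum_sent"]["bits_per_second"] / 1e6
    received = end["sum_received"]["bits_per_second"] / 1e6
    return sent, received


if __name__ == "__main__":
    server = "192.168.1.2"  # placeholder: address of the iperf3 server
    for reverse in (False, True):
        sent, received = mbits(run_iperf3(server, reverse=reverse))
        direction = "server -> client" if reverse else "client -> server"
        print(f"{direction}: sent {sent:.0f} Mbit/s, received {received:.0f} Mbit/s")
```

Running the same script once per firmware version gives numbers that can be compared directly.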

motolav commented 1 year ago

@Luckyparty That's likely a different issue, as the original performance drop was intentional, made through the kernel config to try to keep ipq806x stable without switching to the performance governor.

Luckyparty commented 1 year ago

@motolav I just thought this is worth mentioning since it has major impact on network performance. Maybe someone else can confirm this problem.

brada4 commented 1 year ago

Hint: /sys/devices/system/cpu/cpufreq/*/scaling_available_governors
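A rough sketch of reading (and switching) the governor through that sysfs directory follows, assuming Python is available on the device. The function names are made up for illustration; the file names `scaling_available_governors`, `scaling_governor` and `scaling_driver` are the standard cpufreq sysfs entries. Writing the governor requires root.

```python
from pathlib import Path

CPUFREQ_ROOT = "/sys/devices/system/cpu/cpufreq"


def read_cpufreq(root=CPUFREQ_ROOT):
    """Return {policy: {"driver", "governor", "available"}} for each cpufreq policy dir."""
    info = {}
    for policy in sorted(p for p in Path(root).glob("*") if p.is_dir()):
        def read(name):
            f = policy / name
            return f.read_text().strip() if f.exists() else None

        available = read("scaling_available_governors")
        info[policy.name] = {
            "driver": read("scaling_driver"),
            "governor": read("scaling_governor"),
            "available": available.split() if available else [],
        }
    return info


def set_governor(governor, root=CPUFREQ_ROOT):
    """Select `governor` on every policy that offers it; return the policies changed."""
    changed = []
    for name, entry in read_cpufreq(root).items():
        if governor in entry["available"]:
            (Path(root) / name / "scaling_governor").write_text(governor)
            changed.append(name)
    return changed


if __name__ == "__main__":
    for name, entry in read_cpufreq().items():
        print(name, entry)
```

Printing the driver alongside the governor also makes it easy to compare the cpufreq setup between two firmware versions.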

Luckyparty commented 1 year ago

@brada4 The scaling_governor was already set to performance. Using one of the other governors results in even less upstream throughput. So this does not solve the issue. Between 22.03.5 and 23.05 I realized that the scaling_driver has changed from krait-cpufreq to cpufreq-dt.

brada4 commented 1 year ago

(So this is not related to the OP's issue.)

sapphonie commented 1 year ago

> Tested with latest 23.05 snapshot of today. No packages installed except iperf3. Getting only 250 MBit/s in upstream direction. Downstream looks good (930 MBit/s). I have two R7800. One is still running 19.07.10. I did the same test with this router again and I am getting 930 MBit/s in both directions. So there is still a performance issue with 23.05. All routers are in the same network. The iperf server to test with is an Archer C2600 (ipq806x). Also tested with a Windows iperf server but the result is nearly identical.
>
> Went back to 22.03.5 and did the test again. 930 MBit/s in both directions. So the problem was clearly introduced between kernel 5.10.176 and 5.15.

I can reproduce similar results when comparing 22.03.5 with the latest dev snapshot (r23684-881235c713 / LuCI Master git-23.158.78004-23a246e).

sapphonie commented 1 year ago

Perhaps related to the two removed patches in https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=32f134fbdf027afc342caea17200728033747333 ? Maybe they were upstreamed poorly or incorrectly?

motolav commented 1 year ago

@sapphonie Those patches have nothing to do with ipq806x. The reduced upstream bandwidth is probably from upstream changes in the kernel that got backported.

motolav commented 1 year ago

There is one difference in target/linux/generic/files/drivers/net/phy/ar8327.c (the switch driver) between 22.03 and main/23.05, but I have no idea if that would cause any issues.

olemmela commented 1 year ago

@robodogspok Could you test whether #13323 fixes your performance issues?

motolav commented 1 year ago

@Luckyparty Are you able to test whether olemmela's patch fixes the upstream bandwidth issue? https://github.com/openwrt/openwrt/pull/13323

Luckyparty commented 1 year ago

@motolav at the moment I don't have a build environment available. So unfortunately no. But I am really curious if this fixes the upstream.

Luckyparty commented 1 year ago

Thanks to #13323 the upload bandwidth problem is fixed. Tested with the latest 23.05.0 on an R7800.