OpenNuvoton / MA35D1_linux-5.10.y


Gigabit Ethernet poor iperf performance #5

Open calonsoecler opened 1 year ago

calonsoecler commented 1 year ago

I got poor iperf results when testing the Gigabit Ethernet port (ETH0) of the NuMaker-IoT-MA35D1-A1 board.

I could not reach 1 Gbps in standard iperf tests (around ~750 Mbits/sec), and in the bidirectional iperf test the results were really poor, nowhere near 1 Gbps (TX: ~500 Mbits/sec and RX: ~200 Mbits/sec).

In addition, the CPU is highly loaded during the iperf test (CPU0 100%, CPU1 60%). Using top I can see that one of the main contributors to the CPU load is ksoftirqd:

Mem: 41876K used, 360564K free, 40K shrd, 432K buff, 8144K cached
CPU:  1.1% usr 17.0% sys  0.0% nic 31.5% idle  0.0% io  0.0% irq 50.1% sirq
Load average: 0.30 0.07 0.02 3/80 356
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
    9     2 root     RW       0  0.0   0 49.7 [ksoftirqd/0]
  352   326 root     S     248m 63.1   1 18.0 iperf -c 10.10.10.100 -i 1 -t 30 -d
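
For reference, below is the kind of check I use to see which softirqs and interrupt lines end up on CPU0 (just a sketch: eth0 and the IRQ number are placeholders, not values read from this board):

/ # cat /proc/softirqs                      # per-CPU NET_RX / NET_TX counters
/ # cat /proc/interrupts                    # find the IRQ line used by the Ethernet MAC
/ # cat /proc/irq/<eth0-irq>/smp_affinity   # CPU mask currently handling that IRQ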

Here is the iperf output too:

/ # iperf -c 10.10.10.1 -i 1 -t 10
------------------------------------------------------------
Client connecting to 10.10.10.1, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.10.10.100 port 52930 connected with 10.10.10.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  85.6 MBytes   718 Mbits/sec
[  3]  1.0- 2.0 sec  85.9 MBytes   720 Mbits/sec
[  3]  2.0- 3.0 sec  87.0 MBytes   730 Mbits/sec
[  3]  3.0- 4.0 sec  86.9 MBytes   729 Mbits/sec
[  3]  4.0- 5.0 sec  88.1 MBytes   739 Mbits/sec
[  3]  5.0- 6.0 sec  87.0 MBytes   730 Mbits/sec
[  3]  6.0- 7.0 sec  88.8 MBytes   744 Mbits/sec
[  3]  7.0- 8.0 sec  89.0 MBytes   747 Mbits/sec
[  3]  8.0- 9.0 sec  89.0 MBytes   747 Mbits/sec
[  3]  9.0-10.0 sec  89.1 MBytes   748 Mbits/sec
[  3]  0.0-10.0 sec   876 MBytes   734 Mbits/sec
/ # iperf -c 10.10.10.1 -i 1 -t 10 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.10.10.1, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  4] local 10.10.10.100 port 56938 connected with 10.10.10.1 port 5001
[  5] local 10.10.10.100 port 5001 connected with 10.10.10.1 port 35472
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 1.0 sec  53.6 MBytes   450 Mbits/sec
[  5]  0.0- 1.0 sec  35.1 MBytes   294 Mbits/sec
[  4]  1.0- 2.0 sec  67.9 MBytes   569 Mbits/sec
[  5]  1.0- 2.0 sec  18.6 MBytes   156 Mbits/sec
[  4]  2.0- 3.0 sec  70.8 MBytes   593 Mbits/sec
[  5]  2.0- 3.0 sec  20.6 MBytes   173 Mbits/sec
[  4]  3.0- 4.0 sec  68.8 MBytes   577 Mbits/sec
[  5]  3.0- 4.0 sec  20.5 MBytes   172 Mbits/sec
[  4]  4.0- 5.0 sec  65.5 MBytes   549 Mbits/sec
[  5]  4.0- 5.0 sec  19.2 MBytes   161 Mbits/sec
[  4]  5.0- 6.0 sec  68.5 MBytes   575 Mbits/sec
[  5]  5.0- 6.0 sec  20.1 MBytes   169 Mbits/sec
[  5]  6.0- 7.0 sec  18.8 MBytes   158 Mbits/sec
[  4]  6.0- 7.0 sec  69.2 MBytes   581 Mbits/sec
[  5]  7.0- 8.0 sec  18.8 MBytes   158 Mbits/sec
[  4]  7.0- 8.0 sec  61.1 MBytes   513 Mbits/sec
[  4]  8.0- 9.0 sec  59.9 MBytes   502 Mbits/sec
[  5]  8.0- 9.0 sec  18.8 MBytes   158 Mbits/sec
[  4]  9.0-10.0 sec  61.1 MBytes   513 Mbits/sec
[  4]  0.0-10.0 sec   646 MBytes   542 Mbits/sec
[  5]  9.0-10.0 sec  19.0 MBytes   160 Mbits/sec
[  5]  0.0-10.0 sec   210 MBytes   176 Mbits/sec
[SUM]  0.0-10.0 sec   245 MBytes   205 Mbits/sec
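
Since ksoftirqd/0 looks like the bottleneck, spreading receive processing onto the second core might help. A minimal sketch, assuming the interface is eth0 and that CPU1 is the mostly idle core (the IRQ number is a placeholder):

/ # echo 2 > /sys/class/net/eth0/queues/rx-0/rps_cpus   # RPS: steer RX packet processing to CPU1 (mask 0x2)
/ # echo 2 > /proc/irq/<eth0-irq>/smp_affinity          # or move the Ethernet MAC interrupt itself to CPU1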

I'm using the latest 5.10 kernel from this repository.

Do you know how to fix this issue?

yclu-ntc commented 1 year ago

Here is my testing result with defconfig.
Setup: IoT board <--> 1G router <--> Windows PC
iperf version on the remote PC: 2.0.10
(screenshots of the iperf results were attached here)

It looks fine in my environment.

A known way to improve iperf performance is setting the kernel 'compiler optimization level' to -O2. With that, the bidirectional test reaches ~550 Mbits/sec on both RX and TX. Hope it helps!
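
For reference, this is the 'Compiler optimization level' choice under General setup in menuconfig. A minimal .config sketch (symbol names as in mainline Kconfig; please double-check them against this tree):

CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y       # build the kernel with -O2
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set   # the -Os (size-optimized) alternative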

calonsoecler commented 1 year ago

Thank you for your answer. Your results look better, but I was expecting more than 900 Mbps in the bidirectional test (you are only getting less than 500 Mbps), and the CPU is highly loaded during the test (the ksoftirqd process adds a lot of CPU load). So I'm guessing there may be something kernel-related causing this.