espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

ESP32 wifi rx / tx throughput imbalance (IDFGH-3012) #5042

Closed seppf closed 3 years ago

seppf commented 4 years ago

Environment

Development Kit: ESP32-Ethernet-Kit
Kit version: v1.1
Module or chip used: ESP32-WROVER-IB
IDF version: v4.2-dev-701-g0ae960f2f
Build System: CMake
Compiler version: xtensa-esp32-elf-gcc (crosstool-NG esp-2019r2) 8.2.0
Operating System: Windows
(Windows only) environment type: Plain Command Prompt
Using an IDE?: Yes (Eclipse Plugin)
Power Supply: USB

Problem Description

We have tested the throughput of an ESP32 running as a wifi-to-ethernet bridge. The test application is (essentially) the eth2ap code sample. The test setup is one station connected over wifi, running iperf as a client against an iperf server on the ethernet side. The ESP32 is only bridging, so there is no iperf (or network stack in general) on the ESP32.

When testing both directions one after another (TX = data from wifi client to ethernet server, RX = the other way round), throughput is as reported in the documentation for both TX and RX (UDP ~30 Mbit/s, TCP ~20 Mbit/s). When testing both directions simultaneously (iperf "dual" option) we observe a rather strong asymmetry in throughput. With UDP, TX : RX is about 3 : 1, but with TCP it is as extreme as 15 : 1 (i.e. 20 Mbit/s to 1.5 Mbit/s)! The difference between UDP and TCP can be understood through Nagle's algorithm, which amplifies any existing basic asymmetry in the TX / RX channel.

To make sure the observed asymmetry is actually an issue of the ESP32, we made identical iperf measurements on different access points. An AVM FRITZ!Box 7490 showed only a small asymmetry; a Ruckus ZF 7352 showed practically none.

Such an (undesirable!) asymmetry in RX / TX throughput can indicate an issue with the 802.11 arbitration, specifically with the calculation / parametrization of the interframe spaces or the random contention window (backoff slots). This could, for example, be caused by the slot length in use. The beacons indicate short slot length (i.e. 9 us for g/n), but possibly long slot length (20 us) is actually used when sending wifi packets from the ESP32. This could explain why wifi packets sent from the ESP32 soft access point get lower priority than packets sent from stations. Another possible explanation is that "normal" data frames, though correctly marked as "best effort", are actually sent as "background" traffic.

However, this is plain guessing, of course.
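To put rough numbers on the two guesses above, here is a minimal sketch in C that estimates the mean idle time a sender waits before its first transmission attempt (AIFS plus the average initial backoff). The constants are assumptions, not values read from the ESP32 driver: SIFS of 10 us for 2.4 GHz OFDM, and the commonly cited default EDCA values for best effort (AIFSN 3, CWmin 15) and background (AIFSN 7, CWmin 15). The type and function names are hypothetical.

```c
#include <assert.h>

/* Assumed 2.4 GHz OFDM (802.11g/n) timing constants, in microseconds. */
#define SIFS_US        10.0
#define SHORT_SLOT_US   9.0
#define LONG_SLOT_US   20.0

/* Hypothetical helper type describing one EDCA access category. */
typedef struct {
    const char *name;
    int aifsn;   /* number of slots added to SIFS to form the AIFS */
    int cw_min;  /* initial contention window; backoff is drawn from 0..cw_min */
} edca_params_t;

/* Assumed default parameters (not taken from the ESP32 wifi driver). */
static const edca_params_t AC_BEST_EFFORT = { "best effort", 3, 15 };
static const edca_params_t AC_BACKGROUND  = { "background",  7, 15 };

/* AIFS = SIFS + AIFSN * slot */
double aifs_us(const edca_params_t *ac, double slot_us)
{
    assert(ac != 0 && slot_us > 0.0);
    return SIFS_US + ac->aifsn * slot_us;
}

/* Mean idle time before a first attempt: AIFS plus the average
 * initial backoff of cw_min / 2 slots. Retries are ignored. */
double mean_access_delay_us(const edca_params_t *ac, double slot_us)
{
    return aifs_us(ac, slot_us) + (ac->cw_min / 2.0) * slot_us;
}
```

Under these assumptions, `mean_access_delay_us(&AC_BEST_EFFORT, SHORT_SLOT_US)` gives 104.5 us, while the same category with `LONG_SLOT_US` gives 220 us and background with the short slot gives 140.5 us. Either deviation would leave the access point waiting noticeably longer than a correctly configured station before each attempt, which is the kind of disadvantage that could show up as the observed imbalance.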

Best Regards, Josef

Alvin1Zhang commented 4 years ago

@seppf Thanks for your detailed report, we will look into it.

liuzfesp commented 4 years ago

Hi @seppf, could you share your sdkconfig?

seppf commented 4 years ago

Hi @Alvin1Zhang

I have uploaded the sdkconfig, which is, however, identical to the config of the eth2ap example. For testing I had to change one of the defines in the example source to get it running reasonably:

#define FLOW_CONTROL_WIFI_SEND_TIMEOUT_MS (0)

The default value of 100 (ms) didn't work in my network setup. There seem to be some packets coming from my LAN and arriving on the ethernet interface of the ESP32 that cannot be transported over wifi (if I remember right, those gave error message -16), and these cause considerable delays when the bridge retries them a large number of times. Possibly those packets could be filtered out before trying to send them over wifi, but I haven't paid enough attention to this effect so far.
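To illustrate why a single undeliverable packet can stall the bridge, here is a minimal simulation in C. It assumes a retry loop that sleeps between attempts with a delay growing by 2 ms each time and gives up once that delay reaches the configured timeout. The actual loop in the example may differ, and all names below (`retry_stats_t`, `send_fn_t`, `simulate_send_with_timeout`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of pushing one packet through the retry loop. */
typedef struct {
    int attempts;        /* number of send attempts made */
    int total_delay_ms;  /* time spent waiting between attempts */
    bool delivered;      /* true if an attempt was accepted */
} retry_stats_t;

/* Hypothetical send callback: returns true when the packet is accepted. */
typedef bool (*send_fn_t)(int attempt);

/* Assumed retry policy: wait, lengthen the wait by 2 ms, try to send,
 * and stop on success or once the wait reaches timeout_limit_ms. */
retry_stats_t simulate_send_with_timeout(send_fn_t send, int timeout_limit_ms)
{
    assert(send != NULL);
    retry_stats_t stats = { 0, 0, false };
    int delay_ms = 0;
    do {
        stats.total_delay_ms += delay_ms;  /* stands in for a task delay */
        delay_ms += 2;
        stats.attempts++;
        stats.delivered = send(stats.attempts);
    } while (!stats.delivered && delay_ms < timeout_limit_ms);
    return stats;
}

/* A packet that can never be transported over wifi. */
bool always_rejected(int attempt)
{
    (void)attempt;
    return false;
}

/* A packet that is accepted once the queue has drained a little. */
bool accepted_on_third(int attempt)
{
    return attempt >= 3;
}
```

With a limit of 100, a packet that is always rejected costs 50 attempts and 2450 ms of waiting in this model, during which nothing else is forwarded. With a limit of 0 the same packet is dropped after a single attempt and no waiting. The trade-off is that a limit of 0 also drops packets that would have succeeded on a second or third attempt.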

I enclose a JPEG ("esp32_iperf.jpeg") with the iperf results of my testing. The left half of the diagram is an iperf run in "TCP trade" mode, i.e. first there is upload (amber, data from the wifi station via the ESP32 to the iperf server on the ethernet side), then there is download (blue, data from ethernet to wifi). The right half of the diagram shows the iperf test in "TCP dual" mode. You will notice the imbalance in RX / TX.

For comparison I ran the same test with a Fritz7490 ("fritz7490_iperf.jpeg"). In the "dual" mode test, upload and download bandwidth are much better balanced.

Needless to say, testing with TCP can be tricky, as the throughput in each direction is influenced by both upload and download bandwidth. So imbalances may well have reasons other than wifi prioritization. However, we have done quite a few tests, e.g. optimizing buffers and buffer strategies in the wifi-to-ethernet bridge. That had some effect on maximum bandwidth and also on balance, but not to an extent that could explain the imbalance found with the ESP32. Unfortunately the wifi driver is not released as open source, so we cannot test inside that section ...

Best Regards, Josef

Attachment: sdkconfig.txt
Figure: iperf test with ESP32 (esp32_iperf)
Figure: same test with other access point (fritz7490_iperf)

liuzfesp commented 4 years ago

Hi @seppf, we will debug this issue, but it may take some time.

MaxwellAlan commented 3 years ago

Hi @seppf, thanks for reporting this issue. Following your description, we have reproduced and debugged it. We found two causes:

  1. The eth2ap example includes a wifi TX flow control task, which decreases the ESP32 AP TX performance.

  2. The default sdkconfig of this example is not well suited to throughput testing; you could optimize the ethernet settings like this:

Last but not least, this example is just an experimental demo and lacks some optimizations needed for performance testing. If you want to evaluate wifi performance, we suggest using examples/wifi/iperf instead. Thanks.

Alvin1Zhang commented 3 years ago

Thanks for reporting, feel free to reopen.