espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

BLE throughput woes (IDFGH-12642) #13637

Open mickeyl opened 2 months ago

mickeyl commented 2 months ago

I'd like to raise this issue, because I'm pretty unsatisfied with the BLE throughput on ESP32S3 (and ESP32C3).

In https://github.com/mickeyl/esp-nimble-cpp/tree/l2cap-channel-refactor, I have added C++ wrappers for the L2CAP connection oriented channels as well as demo programs to examine the throughput.

In my tests, using a COC MTU of 5000 bytes, I get a throughput of about 12 KB/s between two ESP32S3. That is roughly 96 kbit/s, which is nowhere near the 1000 kbit/s of the 1M PHY, let alone the 2000 kbit/s of the 2M PHY. Although I know that these best-case values do not translate to net throughput, I still believe that less than 100 kbit/s is much too low for BLE 5 in practice.
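For reference, the unit conversion behind these figures can be sketched as below. This is only an illustration: it assumes "12K" means 12,000 bytes per second, and the helper names `bytesPerSecToKbit` and `phyUtilisation` are made up for this example.

```cpp
// Convert a payload rate in bytes per second to kilobits per second
// (1 kbit = 1000 bits).
constexpr double bytesPerSecToKbit(double bytesPerSec) {
    return bytesPerSec * 8.0 / 1000.0;
}

// Fraction of the raw PHY bit rate that a measured payload rate represents.
// phyKbit is the nominal PHY rate, e.g. 1000 for the 1M PHY, 2000 for the 2M PHY.
constexpr double phyUtilisation(double bytesPerSec, double phyKbit) {
    return bytesPerSecToKbit(bytesPerSec) / phyKbit;
}
```

With these assumptions, `phyUtilisation(12000.0, 1000.0)` comes out just under 10% of the 1M PHY rate and `phyUtilisation(12000.0, 2000.0)` just under 5% of the 2M PHY rate.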

Changing the connection parameters does not seem to have a major effect for me; I tried multiple values.

What's your assessment on that?

bitbank2 commented 2 months ago

Are you using ACKs (normal write operation) or write_without_response? Is your RF environment busy? With ACKs and a busy 2.4GHz environment, you'll see lots of delays/hiccups in sending bulk data.

mickeyl commented 2 months ago

This is L2CAP, not GATT. There are no GATT ACKs involved. The 2.4GHz band is pretty free in my space, as I have most, if not all, my other devices in the 5GHz band.

jonsmirl commented 2 months ago

Do you have WiFi turned on in the ESP? BLE and WiFi share the same radio, so if the radio is listening for WiFi packets when a BLE packet arrives, the BLE packet gets dropped, triggering a retransmission which ruins throughput. Note that you don't have to see WiFi packets on the sniffer for this to impact you: the WiFi protocol requires the ESP to listen for WiFi packets, and if it is listening to WiFi it isn't listening to BLE.

We are debating on adding an external BLE chip to address this issue.

Edit: Turn off WiFi and try your tests again.

KaeLL commented 2 months ago

#12789 might be of interest.

mickeyl commented 2 months ago

@jonsmirl WiFi is turned off in these tests.

mickeyl commented 2 months ago

@KaeLL Thanks for pointing me to #12789. While I did read this already, I didn't really get it until now.

So with LE DATA LENGTH set to 251 (for both sides of the LE connection) instead of the default 27, there is indeed a roughly 2.5x throughput increase available.

Without any more changes, I'm now at ~30 KB/s (240 kbit/s), which, while still far from the theoretical maximum of BLE 5, feels much better.
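A rough airtime model helps explain why the data length matters so much. The sketch below is an estimate under stated assumptions, not a measurement and not anything taken from ESP-IDF: it assumes an unencrypted link, one data packet answered by one empty packet, standard LE framing (preamble, 4-byte access address, 2-byte header, 3-byte CRC) and a 150 µs inter-frame space after each packet. The function names are hypothetical, and the payload counted here still includes the L2CAP header.

```cpp
// Inter-frame space between consecutive packets, in microseconds.
constexpr double kTifsUs = 150.0;

// Time on air for one link-layer packet carrying payloadBytes of payload.
// Overhead: preamble (1 byte on 1M PHY, 2 bytes on 2M PHY) + 4 access
// address + 2 header + 3 CRC. One byte takes 8 us on 1M and 4 us on 2M.
constexpr double packetAirtimeUs(int payloadBytes, bool phy2M) {
    const int overheadBytes = (phy2M ? 2 : 1) + 4 + 2 + 3;
    const double usPerByte = phy2M ? 4.0 : 8.0;
    return (overheadBytes + payloadBytes) * usPerByte;
}

// Upper bound on link-layer payload throughput in kbit/s when every data
// packet is followed by an empty packet from the peer.
constexpr double maxThroughputKbit(int payloadBytes, bool phy2M) {
    const double exchangeUs = packetAirtimeUs(payloadBytes, phy2M) + kTifsUs
                            + packetAirtimeUs(0, phy2M) + kTifsUs;
    return payloadBytes * 8.0 * 1000.0 / exchangeUs; // bits per us -> kbit/s
}
```

Under these assumptions the 1M PHY tops out at roughly 320 kbit/s with 27-byte payloads and roughly 814 kbit/s with 251-byte payloads, a ratio of about 2.5, which is in line with the jump from about 12 KB/s to about 30 KB/s reported in this thread. The same model gives roughly 1.44 Mbit/s for 251-byte payloads on the 2M PHY. The absolute numbers measured in the thread are far lower, so the fixed per-packet overhead is clearly not the only limit.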

Interestingly, in mixed-vendor scenarios the results vary a lot depending on the client and server roles: with an Apple device as peripheral, I get much better values than with an Apple device as central, although Apple only supports a maximum L2CAP SDU of 2048 bytes. (By the way: Apple devices seem not to support connection intervals smaller than 30ms. The only thing they accept is the supervision timeout.)

So at the end of the day, I'm not sure whether there is still a "bug" here, although I'd applaud all kinds of insights by the Espressif team. There are so many performance guides with regard to speed and memory, but there is little about BLE (L2CAP), so at least the documentation team could improve things here.

KaeLL commented 2 months ago

@rahult-github @igrr feel free to chime in. This is important.

Scottapotamas commented 2 months ago

Not trying to kick a dead horse here, but I'd love to see at least some form of acknowledgement from Espressif.

For anyone interested in the larger latency benchmarking effort and longer-form discussion, it's available here: https://electricui.com/blog/latency-comparison

mickeyl commented 2 months ago

Ok, many hours of experimentation later, I need to correct my earlier comment (https://github.com/espressif/esp-idf/issues/13637#issuecomment-2066727982). At least some Apple devices do accept 15ms, if you set the other parameters in a way that Apple accepts. Of tremendous help was a spreadsheet I found on the TI E2E forum (linked in the code comment below).

This is now my (esp-nimble-cpp) GATT connection callback:

void GATTServerCallbacks::onConnect(NimBLEServer* pServer, NimBLEConnInfo& connInfo) {
    // BLE Throughput Booster #1, courtesy of learning from BLE 4.2 standard documents:
    // request an LE data length of 251 bytes instead of the default 27.
    pServer->setDataLen(connInfo.getConnHandle(), 251);
    // BLE Throughput Booster #2, courtesy of finding https://e2e.ti.com/support/wireless-connectivity/bluetooth-group/bluetooth/f/bluetooth-forum/296641/spreadsheet-to-check-whether-ble-parameters-are-within-apple-specs
    // Min/max interval 12 * 1.25 ms = 15 ms, latency 0, supervision timeout 200 * 10 ms = 2 s.
    pServer->updateConnParams(connInfo.getConnHandle(), 12, 12, 0, 200);
}
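The numeric arguments to `updateConnParams` are in Bluetooth units rather than milliseconds, which is easy to get wrong. The helper below is a hypothetical sketch (the `ConnParams` struct and function names are not part of esp-nimble-cpp) that converts the units and checks the general validity rules from the Bluetooth Core Specification: interval between 7.5 ms and 4 s, latency at most 499, supervision timeout between 100 ms and 32 s and larger than (1 + latency) × max interval × 2. It does not encode Apple's additional restrictions, which are what the linked spreadsheet checks.

```cpp
#include <cstdint>

// Connection parameters in the raw units used by the BLE host APIs.
struct ConnParams {
    std::uint16_t minInterval; // units of 1.25 ms
    std::uint16_t maxInterval; // units of 1.25 ms
    std::uint16_t latency;     // connection events the peripheral may skip
    std::uint16_t timeout;     // supervision timeout, units of 10 ms
};

constexpr double intervalMs(std::uint16_t units) { return units * 1.25; }
constexpr double timeoutMs(std::uint16_t units) { return units * 10.0; }

// True if the parameters satisfy the generic Core Specification limits.
// Vendor-specific rules (such as Apple's) are NOT checked here.
constexpr bool withinSpec(const ConnParams& p) {
    return p.minInterval >= 6 && p.maxInterval <= 3200
        && p.minInterval <= p.maxInterval
        && p.latency <= 499
        && p.timeout >= 10 && p.timeout <= 3200
        && timeoutMs(p.timeout) > (1 + p.latency) * intervalMs(p.maxInterval) * 2;
}
```

For the values used in the callback above, `ConnParams{12, 12, 0, 200}` corresponds to a 15 ms interval with a 2 s supervision timeout and passes this generic check.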

With this (magic) incantation, I seem to get a whopping 60 KB/s via an L2CAP COC when the Apple device is the central, stable over the course of several minutes. This is about a quarter of the theoretical maximum throughput of the 2M PHY and something I can very well live with.
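One way to read this result is in terms of data moved per connection event. The sketch below assumes that "60 KB/s" means 60,000 bytes per second, that the 15 ms interval was actually negotiated, and that each link-layer packet carries up to 251 bytes; the function names are made up for this illustration, and L2CAP header overhead is ignored.

```cpp
// Average number of payload bytes that must be transferred in each
// connection event to sustain the given throughput.
constexpr double bytesPerEvent(double bytesPerSec, double connIntervalMs) {
    return bytesPerSec * connIntervalMs / 1000.0;
}

// Minimum number of link-layer packets per connection event needed to carry
// that many bytes, given the maximum payload per packet (rounded up).
constexpr int packetsPerEvent(double bytesPerSec, double connIntervalMs,
                              int payloadBytes) {
    const double bytes = bytesPerEvent(bytesPerSec, connIntervalMs);
    const int whole = static_cast<int>(bytes / payloadBytes);
    return (whole * payloadBytes < bytes) ? whole + 1 : whole;
}
```

Under these assumptions the link moves about 900 bytes per 15 ms connection event, which needs at least 4 packets of 251 bytes per event, or 34 packets at the default 27 bytes. That suggests both the data length and the number of packets the controllers exchange per event shape the achievable throughput.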

More thorough long-term tests will follow, but I wanted to drop the bomb as soon as possible for you to reproduce.