Open gschorcht opened 5 years ago
I noticed that with `esp_now` the esp8266 would lock up after a few minutes (when connected to a border router).
It only prints
2021-05-03 15:13:17,642 # scandone
2021-05-03 15:13:27,381 # LmacRxBlk:0
2021-05-03 15:13:28,382 # LmacRxBlk:0
2021-05-03 15:13:29,383 # LmacRxBlk:0
2021-05-03 15:13:30,384 # LmacRxBlk:0
2021-05-03 15:13:31,384 # LmacRxBlk:0
2021-05-03 15:13:32,385 # LmacRxBlk:0
2021-05-03 15:13:33,386 # LmacRxBlk:0
2021-05-03 15:13:34,387 # LmacRxBlk:0
2021-05-03 15:13:35,387 # LmacRxBlk:0
2021-05-03 15:13:36,388 # LmacRxBlk:0
and does not react to shell input anymore.
(This can be triggered by running `ping -f` against the esp8266's address.)
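To spot this state in a captured terminal log, a small helper can count consecutive `LmacRxBlk` lines. This is only a minimal sketch: the function names and the threshold of 5 lines are assumptions made for illustration, not something taken from the SDK or RIOT.

```python
import re

# Matches lines such as "2021-05-03 15:13:27,381 # LmacRxBlk:0".
# The timestamp prefix is optional and a space after the colon is tolerated.
LMAC_RE = re.compile(r"LmacRxBlk:\s*\d+\s*$")


def longest_lmac_run(lines):
    """Return the length of the longest run of consecutive LmacRxBlk lines."""
    longest = current = 0
    for line in lines:
        if LMAC_RE.search(line):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def looks_locked_up(lines, threshold=5):
    """Heuristic: `threshold` or more consecutive LmacRxBlk lines count as a lockup.

    The default threshold is an arbitrary assumption.
    """
    return longest_lmac_run(lines) >= threshold


if __name__ == "__main__":
    sample = [
        "2021-05-03 15:13:17,642 # scandone",
        "2021-05-03 15:13:27,381 # LmacRxBlk:0",
        "2021-05-03 15:13:28,382 # LmacRxBlk:0",
    ]
    print(longest_lmac_run(sample), looks_locked_up(sample))
```

Feeding it the lines of a saved terminal session gives a quick yes/no answer without reading the whole log.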
Description
During the stress test of the `esp_wifi` module for esp8266, including de-authentication attacks, the following issues sporadically occurred:

1. Reconnecting may fail after deauthentication and lead to a system crash while excessive traffic is being sent to the esp8266. If the AP sends a deauthentication, the esp8266 tries to reconnect automatically. Under normal network load, the reconnect works as expected. However, if excessive traffic is being sent to the esp8266, it cannot reconnect and keeps retrying until the memory is exhausted and it crashes. The memory seems to be consumed by the Espressif SDK :worried:

   The problem might be related to problem 6.
2. ~Send function may block completely on very heavy network load. Disconnecting and reconnecting helps sometimes but not always. Then, esp8266 has to be rebooted.~ Solved with PR #10862.
3. ~Sporadically, a `LoadProhibitedCause` exception occurs on very heavy network load.~ Seems to be solved by PR #10869.
4. ~GNRC packet buffer runs full on very heavy network load since packets are hanging in the packet buffer. Communication with the esp8266 is no longer possible. The packet buffer can be checked with the command `pktbuf` using module `gnrc_pktbuf_cmd`.~ Seems to be solved by PR #10862.
5. ~Sporadically, the error message `dev 1500` occurs on very heavy network load and the esp8266 crashes after that with a `LoadProhibitedCause` exception.~ Seems to be solved by PR #10869.
6. Connecting to the access point while excessive traffic is being sent to the esp8266 often fails and a repetitive error message `LmacRxBlk: 1` appears. The esp8266 is then not usable at all and has to be reset. This might be related to problem 1, where the esp8266 tries to reconnect while excessive traffic is being sent to it.

   The problem can be reproduced if at least one host is pinging the esp8266 with the maximum data size and an interval of 0 while the esp8266 is trying to connect to the AP. Start pinging first and then reset the esp8266.

   According to online resources, the error message `LmacRxBlk:1` means that the internal MAC layer buffer has overflowed. The problem normally occurs when an interrupt service routine takes longer than the allowed 10 µs. It may also be that the esp8266 is simply too slow to handle such a large number of frames while connecting, see https://github.com/peterhinch/micropython-mqtt/issues/3#issuecomment-354245006.

   From today's perspective, this problem can't be solved with the means provided by the SDK.
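The reproduction recipe for problem 6 (ping with a large payload and an interval of 0, then reset the board) can be scripted. The following is a minimal sketch that only builds the command line. It assumes the iputils `ping` options `-i` (interval) and `-s` (payload size); the function name, the address `192.0.2.10` and the payload size `1472` (a 1500-byte MTU minus IPv4 and ICMP headers) are placeholders, not values from this report.

```python
import shlex


def flood_ping_cmd(address, payload_size):
    """Build a ping command with interval 0 and the given payload size.

    Uses the iputils `ping` options -i (interval) and -s (payload size).
    An interval of 0 normally requires root privileges.
    """
    if payload_size < 0:
        raise ValueError("payload_size must be non-negative")
    return ["ping", "-i", "0", "-s", str(payload_size), address]


if __name__ == "__main__":
    # Placeholder address and size; replace with the esp8266's real address.
    print(shlex.join(flood_ping_cmd("192.0.2.10", 1472)))
```

Start the printed command on the host first, then reset the esp8266 while the ping is still running.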
Steps to reproduce the issue
Ping one esp8266 node from three different machines with different data sizes as fast as possible:
Expected results
All of the problems above only occur under very heavy network load. Under normal conditions, `esp_wifi` works reliably, for example under the following conditions: