atomvm / AtomVM

Tiny Erlang VM
https://www.atomvm.net
Apache License 2.0
1.43k stars 94 forks source link

Handle network_driver: Unhandled wifi event: 21. #1100

Open petermm opened 4 months ago

petermm commented 4 months ago
I (1593851) wifi:bcn_timeout,ap_probe_send_start
I (1593851) network_driver: Unhandled wifi event: 21.

repeatedly.. on a C3card - most likely poor Rssi..

Solution:

Silently handle it - as there already is the wifi:bcn_timeout,ap_probe_send_start log

or

gracefully log "bcn_timeout: Potentially weak wifi signal", or better wording.

What does bcn_timeout, ap_probe_send_start mean?

The STA does not receive the Beacon frame within the specified time (6 s by default for ESP32, equals to 60 Beacon Intervals). - The reason could be:

Insufficient memory. “ESP32_WIFI_MGMT_SBUF_NUM” is not enough (there will be errors like “esf_buf: t=8, l=beacon_len, …” in the log). You can check this by typing the heap size when received a Disconnect event. The AP did not send a beacon. This can be checked by capturing beacons from AP. Rssi too low. When the Rssi value is too low in complex environments, the STA may not receive the beacon. This can be checked by retrieving Rssi values via esp_wifi_sta_get_ap_info. Hardware related issues. Bad package capturing performance. When there is a bcn_timeout, the STA will try to send Probe Request for five times. If a Probe Response is received from the AP, the connection will be kept, otherwise, the STA will send a Disconnect event and the connection will fail.

UncleGrumpy commented 4 months ago

This could also be due to the beacon interval used by the router, if it is set for a longer than normal time, or itself is suffering degraded performance due to congestion, may exaggerate this problem. And as you point out, it could be poor RSSI, which, compounded with a longer beacon interval could trigger this callback if just a single beacon ping is missed.

UncleGrumpy commented 3 months ago

I looked closer at esp-idf/components/esp_wifi/include/esp_wifi.h the beacon timeout is actually due to the event 21 which is ESP_ERR_WIFI_NOT_ASSOC, the WiFi connection is not associated with an access point. At what point in the application execution does this happen, when starting the network, or at some point after? Does any network activity succeed before this error occurs?

petermm commented 3 months ago

cc @arpunk - whom I believe reported it on telegram..

arpunk commented 3 months ago

@UncleGrumpy @petermm this only happened to me during that session and haven't seen that log warning again, but I agree that instead of event 21 a possible ESP_ERR_WIFI_NOT_ASSOC would be more descriptive.

UncleGrumpy commented 3 months ago

The error 21 is being reported directly by the esp-idf, and as you pointed out, a handler for this could be useful, the default should just report a more useful error, but users could override that with a custom handler.

UncleGrumpy commented 3 months ago

My mistake... I was accidentally looking error codes and not event codes 🤦🏾‍♂️. The unhandled wifi event 21 is indeed WIFI_EVENT_STA_BEACON_TIMEOUT, I don't think there is anything a user can to in their application to remedy this situation, unless someone builds a wifi connected autonomous robot, they may need to have a handler for this event to have the robot move closer to the AP. For now we can log a more helpful error. I believe if enough beacon timeouts occur in a row a WIFI_EVENT_STA_DISCONNECTED will be emitted.

UncleGrumpy commented 2 months ago

PR #1137 adds a handler and a configurable (optional) STA mode callback for the beacon_timeout event. Also improves the log message, so it will be helpful, even if the callback isn't used.