The WAS WebSocket client sometimes fails to reconnect after receiving a WEBSOCKET_EVENT_CLOSED event due to it not being able to create the WebSocket client task:
I (13:15:28.958) WILLOW/WAS: WebSocket closed
I (13:15:28.959) WILLOW/WAS: initializing WebSocket client (ws://was.dev.willow:8502/ws)
E (13:15:28.961) websocket_client: Error create websocket task
E (13:15:28.967) WILLOW/WAS: failed to start WebSocket client: ESP_FAIL
When this happens, Willow stops trying to reconnect to WAS, and the user has to reset or power-cycle their device. As this is terrible UX, this should be avoided.
The root cause is not freeing resources of the WebSocket client before we reinitialize it in init_was. This eventually causes us to run out of IRAM, causing the WebSocket client task creation to fail.
To fix this, enable WebSocket destroy on exit in was_init, rather than in was_deinit, so it is always enabled. This is needed to avoid leaking resources due to not calling esp_websocket_client_destroy before calling esp_websocket_client_init. We don't want to call
esp_websocket_client_destroy, as that has proven to be very prone to crashes in the past.
Similarly, we don't want to call esp_websocket_client_start when the WAS WebSocket client receives a WEBSOCKET_EVENT_CLOSED event. This would avoid the resource leak, but testing has shown this is also very prone to crashes.
See the following commits for extra information:
985375c35f55 ("was: drop esp_websocket_client_destroy()") 1f3d2989dc16 ("was: drop esp_websocket_client_destroy() completely")
The WAS WebSocket client sometimes fails to reconnect after receiving a WEBSOCKET_EVENT_CLOSED event due to it not being able to create the WebSocket client task:
I (13:15:28.958) WILLOW/WAS: WebSocket closed I (13:15:28.959) WILLOW/WAS: initializing WebSocket client (ws://was.dev.willow:8502/ws) E (13:15:28.961) websocket_client: Error create websocket task E (13:15:28.967) WILLOW/WAS: failed to start WebSocket client: ESP_FAIL
When this happens, Willow stops trying to reconnect to WAS, and the user has to reset or power-cycle their device. As this is terrible UX, this should be avoided.
The root cause is not freeing resources of the WebSocket client before we reinitialize it in init_was. This eventually causes us to run out of IRAM, causing the WebSocket client task creation to fail.
To fix this, enable WebSocket destroy on exit in was_init, rather than in was_deinit, so it is always enabled. This is needed to avoid leaking resources due to not calling esp_websocket_client_destroy before calling esp_websocket_client_init. We don't want to call esp_websocket_client_destroy, as that has proven to be very prone to crashes in the past.
Similarly, we don't want to call esp_websocket_client_start when the WAS WebSocket client receives a WEBSOCKET_EVENT_CLOSED event. This would avoid the resource leak, but testing has shown this is also very prone to crashes.
See the following commits for extra information: 985375c35f55 ("was: drop esp_websocket_client_destroy()") 1f3d2989dc16 ("was: drop esp_websocket_client_destroy() completely")