Closed svenfuchs closed 2 years ago
I have now moved sending out log messages via UDP to an async setup using a queue and a worker task. Apparently the whole thing is now a lot more stable.
However, after a successful OTA update (which btw also downloads the binary a lot quicker now) and restart it still sometimes gets stuck in the state where the OTA update HTTP GET request won't be able to get the hostname and getaddrinfo
fails with 202.
Hello, @svenfuchs
Sorry for the late response. I was investigating this issue, but unfortunately could not reproduce it (with esp_http_client
and advanced_https_ota
examples).
Here is a similar issue - https://github.com/espressif/esp-azure/issues/55, suggesting that esp-tls: couldn't get hostname for ****
could be a memory issue.
esp_get_free_heap_size()
and esp_get_minimum_free_heap_size()
just before performing the HTTP requests (and/or wherever you are observing the error)?uxTaskGetStackHighWaterMark()
?TRANSPORT_BASE: Failed to open a new connection: 32774
points to ESP_ERR_ESP_TLS_CONNECTION_TIMEOUT
. I couldn't reproduce it with the advanced_https_ota
example, so could you please check your server-side config for that? You can also increase the network timeout as given in here.Thanks for the response @laukik-hase!
I had to move on to other things in the meanwhile, but before I did I have managed to reduce the instabilities quite a bit. (I am not sure what exactly it was that helped with that in the end, but I remember tweaking memory a lot.)
I am going to close this issue. I might get back to this if I am still seeing these issues once I can pick up the project again, but in the meantime it makes no sense to keep it open.
Thanks again!
Apologies if this would be supposed to go to a forum. It seems as much a bug report as a question to me, but let me know if it should be moved.
An application I am working on is supposed to act as a simple thermostat/heating control. It consists of an esp32, DHT22 thermometer, OLED display, and a relay to control a heating device. It performs network requests for 3 functions:
The issues I am seeing are:
The HTTP related tasks will intermittently not be able to select the socket and establish a new connection, and show the following error log:
if I use the local IPv4 address of my machine as part of the OTA address:
if I use the local hostname of my machine
sven.local
as part of the OTA address:Or for sending metrics:
Sending the UDP message sometimes results in
Not enough space (12)
errors when returning fromsendto
.Turning off all but one of the three network related tasks (e.g. only keeping the OTA task) does not make the remaining one function reliably.
The code for running the OTA update is taken from the advanced HTTP OTA example, which when run locally fails with:
... where, as far as I can tell,
32774
isESP_ERR_HTTPD_RESP_SEND
, which seems to happen when something goes wrong during sending either HTTP headers or the request body?The trouble is that "intermittent" above means something like:
idf.py erase-flash
and thenidf.py flash monitor
, all won't necessarily help)Sometimes, but not always, the OTA update
GET
request starts failing and at the same time UDP messages fail to get sent (with the mentionedNot enough space (12)
error message. Sometimes, but most often not, either one of these issues or both will resolve after a couple minutes (usually to come back after a short while again).Sometimes, but not always, the request alternates failing and succeeding. I.e. it fails once or twice, and then succeeds, only to start failing again a couple times.
Thinking that
Not enough space
indicates a lack of free heap or some task's stack I have added debug output listing those (this has required tinkering withsdkconfig
for a while in order to resolve various stack overflows in order to get debug log output working):It would seem to me these numbers should be high enough (but maybe I'm wrong?), and I am now thinking they are probably unrelated to the "space" mentioned in the error message. I am also not sure these two errors even are related (socket select timeouts for HTTP requests vs failures to send UDP messages).
I am poking at this on two identical esp32-devkit-v4 boards (one of them soldered onto a diy pcb, the other one wired on a breadboard; one of them alternating on two different power supplies), not seeing any difference in realiability of the network or reproducibility of the issue(s).
The code can be found at https://github.com/svenfuchs/box, sans
sdkconfig
as I haven't found any other way to keep credentials out of version control.My
sdkconfig
config can be found here.