Open jckkk opened 2 years ago
Did some more testing and it appears that esphome will crash every time if the response is not provided within around 4 seconds. The below will consistently crash if the response is provided after 5 seconds. Apparently, the watchdog timer is not being reset during waiting. Basically, this component is hardly usable on ESP32 due to this issue.
esphome:
  platform: ESP32
  board: nodemcu-32s
  name: temp-display
logger:
  level: DEBUG
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
http_request:
  useragent: esphome/crashing-sensor
  timeout: 15s
interval:
  interval: 15s
  then:
    - logger.log:
        level: INFO
        format: "Requesting"
    - http_request.get:
        url: http://192.168.8.113:8080/wait
        headers:
          Content-Type: application/json
I'm experiencing the same issue on ESPHome 2022.4.0. Any way to get around this?
None that I'm aware of. This component is basically useless in this state.
This seems to be a general issue with watchdog on ESP32. There are quite a few other open issues where WDT triggers restarts when too much is going on.
A similar issue here. I'm using HTTP Request as a call-home feature to trigger a webhook on a remote server, so I know the ESP is still alive. It worked without problems for about 2-3 months. Today my server is down for maintenance and the ESP32 is crashing every 5 minutes (which is also the interval at which it sends a POST to my webhook).
In short, the ESP32 crashes when the server is unavailable.
Could it be because the http_request component is making synchronous requests? If so, is a 5 second timeout probably the maximum value allowed for now? Could the component be re-implemented using asynchronous calls (https://www.arduino.cc/reference/en/libraries/asynchttprequest_generic/)?
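To illustrate why a synchronous request matters here, the following is a host-side simulation only, not ESPHome or ESP-IDF code. It assumes the watchdog is fed each time the main loop returns, and compares the longest gap between feeds when a request blocks the loop against the gap when the loop merely polls the request once per iteration. `FakeRequest`, `max_gap_blocking` and `max_gap_polling` are hypothetical names made up for this sketch.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for an HTTP request whose response arrives
// `latency_ms` after it was started. No real networking is involved.
struct FakeRequest {
  uint32_t started_ms;
  uint32_t latency_ms;
  bool done(uint32_t now_ms) const { return now_ms - started_ms >= latency_ms; }
};

// Longest time (ms) the loop goes without returning when the request is
// synchronous: the loop is stuck inside the call for the whole latency.
uint32_t max_gap_blocking(uint32_t latency_ms, uint32_t loop_period_ms) {
  return std::max(latency_ms, loop_period_ms);
}

// Longest time (ms) between loop returns when the request is only polled
// once per loop iteration. Precondition: loop_period_ms > 0.
uint32_t max_gap_polling(uint32_t latency_ms, uint32_t loop_period_ms) {
  FakeRequest req{0, latency_ms};
  uint32_t now = 0, last_feed = 0, max_gap = 0;
  while (!req.done(now)) {
    now += loop_period_ms;                         // one loop iteration elapses
    max_gap = std::max(max_gap, now - last_feed);  // time since the last feed
    last_feed = now;                               // loop returned, watchdog fed
  }
  return max_gap;
}
```

With an 8000 ms response and a 16 ms loop period, the blocking variant leaves the loop unresponsive for 8000 ms, while the polling variant never goes longer than one loop period between returns. Whether the real component can be restructured this way is a separate question.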
+1
+1
Same issue here. Was thinking it was something I'd done all day.
ESPHome Version: 2022.10.0
[D][sensor:127]: 'Bottom Temperature': Sending state 18.31250 °C with 1 decimals of accuracy
[D][dallas.sensor:143]: 'Top Temperature': Got Temperature=18.9°C
[D][sensor:127]: 'Top Temperature': Sending state 18.93750 °C with 1 decimals of accuracy
[D][dallas.sensor:143]: 'Bottom Temperature': Got Temperature=18.3°C
[D][sensor:127]: 'Bottom Temperature': Sending state 18.31250 °C with 1 decimals of accuracy
[I][main:098]: Updating top: 18.9 bottom: 18.3
E (299134) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (299134) task_wdt: - loopTask (CPU 1)
E (299134) task_wdt: Tasks currently running:
E (299134) task_wdt: CPU 0: IDLE0
E (299134) task_wdt: CPU 1: loopTask
E (299134) task_wdt: Aborting.
abort() was called at PC 0x40156a78 on core 0
ELF file SHA256: 0000000000000000
Backtrace: 0x40088ce8:0x3ffbf820 0x40088f65:0x3ffbf840 0x40156a78:0x3ffbf860 0x400873cd:0x3ffbf880 0x4018b897:0x3ffbc160 0x40158383:0x3ffbc180 0x4008b725:0x3ffbc1a0 0x40089f76:0x3ffbc1c0
Rebooting...
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:10124
load:0x40080400,len:5828
entry 0x400806a8
[I][logger:243]: Log initialized
[C][ota:465]: There have been 1 suspected unsuccessful boot attempts.
[D][esp32.preferences:113]: Saving 1 preferences to flash...
[D][esp32.preferences:142]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[I][app:029]: Running through setup()...
Example snippet from yaml
# Make periodic update to google
time:
  - platform: sntp
    id: sntp_time
    on_time:
      # Every 5 minutes
      - seconds: 0
        minutes: /5
        then:
          - if:
              condition:
                lambda: 'return id(top_temp).has_state() && id(bottom_temp).has_state();'
              then:
                - output.turn_on: status_led
                - logger.log:
                    level: INFO
                    format: "Updating top: %.1f bottom: %.1f"
                    args: [ 'id(top_temp).state', 'id(bottom_temp).state' ]
                - http_request.get:
                    verify_ssl: false
                    url: !lambda |-
                      char buf[256];
                      sprintf(buf, "https://script.google.com/macros/s/DeplymentID/exec?top=%.1f&bottom=%.1f", id(top_temp).state, id(bottom_temp).state);
                      return ((std::string) buf).c_str();
                    on_response:
                      then:
                        - wait_until:
                            condition:
                              wifi.connected:
                            timeout: 45s
                        - logger.log:
                            format: "Response status: %d"
                            args:
                              - status_code
                        - output.turn_off: status_led
I get this error intermittently when I try to control my relay from an ESP8266 board. The same request works when sent from the PC. Has anyone found a solution? How do we bring this problem to the developers' attention? My ESPHome version is v2022.10.0, so it may be worth updating.
It seems the default CONFIG_ESP_TASK_WDT_TIMEOUT_S is 5 seconds, which equals the default http_request component timeout, so if a request takes longer, the task is killed by the watchdog. As a workaround you might try either:
- Decrease the http_request timeout:
http_request:
  timeout: 1s
- Increase the watchdog timeout or disable panicking somewhere in your code (not sure what's the best place):
#include <esp_task_wdt.h>
esp_task_wdt_init(15 /* timeout */, false /* panic */);
A better solution would be of course using some async http library, maybe just native ESP-IDF's http client will be a good alternative.
Hi, I have the same problem. I get watchdog induced restarts when the timeout is larger than 1s.
I am trying to send data to google sheet. It works, but I do not get a response from HTTP GET due to timeout.
Is there any workaround for this? Thanks.
Perhaps a watchdog definition in on_boot automation?
Hello,
Any news on that? I also have this issue on ESP32 (even without the 1s timeout). I don't have it on ESP8266. This bug makes the http component unusable on ESP32.
Since this bug has been there for a year, I would like this component to be marked as incompatible with ESP32
Same issue here. I have been debugging for days because pretty much any http_request fails and crashes the ESP until I finally found this issue.
Same here. This issue basically makes it almost impossible to use. It works for a few minutes and crashes again. I'm using it to call an API of a device in the same network, response is always very fast (under 1 second).
Has anyone implemented the second workaround?
Just adding my two pence, I created issue #4569 without noticing this issue (my apologies)
I have tried a lot of combinations of ESPHome versions and have had no luck. Going to have a look at @stas-sl 's suggestions. Cheers
Just tried reducing the Timeout to 4 seconds, still crashing.
I tried the C++ fix by creating a file in the configuration directory named fixtimeout.h and adding
esphome:
  [...]
  includes:
    - fixtimeout.h
to my configuration. Unfortunately I get this error:
Compiling .pioenvs/test/src/main.cpp.o
In file included from src/main.cpp:199:
src/fixtimeout.h:2:18: error: expected constructor, destructor, or type conversion before '(' token
esp_task_wdt_init(15 /* timeout */, false /* panic */);
^
*** [.pioenvs/test/src/main.cpp.o] Error 1
====================== [FAILED] Took 270.71 seconds ======================
I'd be interested in getting the C++ solution working if possible.
4 seconds might still be too long. There are actually two timeouts (connect and receive), both of which are set to 4 seconds if I remember correctly, so in the worst case a full request might take 8 seconds, which is still more than 5 seconds. In many cases setting it as low as 1 second could work if you don't need to analyze the response status.
You can't call a function at file scope in a header file. You should include the header and call esp_task_wdt_init from an 'on_boot' handler, see comment.
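The arithmetic in the comment above can be written down as a small check. This is a sketch under the assumptions stated there: the configured timeout is applied separately to the connect phase and the receive phase, and the task watchdog fires after 5 seconds. The function names are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Worst-case time (s) a synchronous request can block the loop, assuming the
// connect phase and the receive phase each use up their full timeout.
constexpr uint32_t worst_case_block_s(uint32_t connect_timeout_s,
                                      uint32_t receive_timeout_s) {
  return connect_timeout_s + receive_timeout_s;
}

// True if a request configured with `timeout_s` for both phases finishes,
// even in the worst case, before a watchdog with `wdt_timeout_s` fires.
constexpr bool fits_watchdog(uint32_t timeout_s, uint32_t wdt_timeout_s) {
  return worst_case_block_s(timeout_s, timeout_s) < wdt_timeout_s;
}

// timeout: 4s -> up to 8 s of blocking, more than the 5 s watchdog.
static_assert(worst_case_block_s(4, 4) == 8, "4 s + 4 s");
static_assert(!fits_watchdog(4, 5), "4 s timeout does not fit");
// timeout: 1s -> up to 2 s of blocking, below the 5 s watchdog.
static_assert(fits_watchdog(1, 5), "1 s timeout fits");
```

This only models the two timeouts described above. Other reports in this thread show crashes even with a 1 second timeout, so passing this check is at best a necessary condition, not a guarantee.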
I tried 1 second timeout and still getting crashes. Was getting connection refused instead of a 200 response if that makes any difference.
I also tried https://github.com/epiclabs-uc/esphome-nowatchdog-component, but the ESP still crashes unfortunately.
I'm also getting crashes, but only when using Ethernet with a LAN8720 on an ESP32. When I use WiFi it works without problems.
I also have crashes when doing an HTTP request to DSMR. Hopefully this can be fixed soon so I can power my P1 meter through PoE and get rid of the WiFi connection.
FYI, My devices are using POE Ethernet and it doesn't resolve the issue in my situation.
It works for me when I use the WiFi connection. As soon as I enable PoE Ethernet I get crashes, and the POST requests come back as failures.
That's interesting!
I experience exactly the same problem when making an HTTP request from one node to another via Ethernet on ESP32. I have smart home modules with an exposed API, and this pretty much prevents any direct communication between them. Changing timeouts doesn't help.
Can we mark it as a bug? Would anyone be able to pick it up?
I believe I am also having this issue: the watchdog resets my device, which does an http_request.post every 2.5 minutes. Requests usually complete within 600 ms, but I sometimes see up to 3500 ms.
[16:49:03][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 1357 ms
[16:51:32][W][http_request:086]: HTTP Request failed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Error: connection lost; Duration: 271 ms
[16:54:03][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 581 ms
[16:56:33][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 557 ms
[16:59:03][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 560 ms
[17:01:33][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 568 ms
[17:04:03][D][http_request:099]: HTTP Request completed; URL: http://hw.airgradient.com/sensors/airgradient:xxxxxx/measures; Code: 200; Duration: 558 ms
[17:04:48]
[17:04:48] ets Jan 8 2013,rst cause:4, boot mode:(3,7)
[17:04:48]
[17:04:48]wdt reset
[17:04:48]load 0x4010f000, len 3460, room 16
[17:04:48]tail 4
[17:04:48]chksum 0xcc
[17:04:48]load 0x3fff20b8, len 40, room 4
[17:04:48]tail 4
[17:04:48]chksum 0xc9
[17:04:48]csum 0xc9
[17:04:48]v000984e0
[17:04:48]~ld
[17:04:48][I][logger:351]: Log initialized
[17:04:48][C][ota:473]: There have been 0 suspected unsuccessful boot attempts.
[17:04:48][I][app:029]: Running through setup()...
I am seeing a similar issue with my Airgradient using an ESP32. If there is no internet connection, the request fails and ESP crashes. Since the airgradient esphome config also sends a HTTP POST on boot, sometimes it even fails to boot and gets stuck in a bootloop.
Here's a workaround based on @stas-sl suggestion. Thank you @stas-sl!
esp32_wdt.h:
#include <esp_task_wdt.h>
void set_esp32_wdt_timeout() {
  esp_task_wdt_init(60 /* timeout [s] */, true /* panic */);
}
esphome:
  includes:
    - esp32_wdt.h
  on_boot:
    then:
      - lambda: |-
          set_esp32_wdt_timeout();
So far this has worked great for my esp32 devices. Any chance you know of a similar workaround for esp8266?
The problem
The device crashes randomly when performing http_request. It seems that the watchdog triggers the restart.
It only occurs on ESP32 (tried on lolin32_lite, nodemcu-32s and mhetesp32minikit), but not ESP8266 (d1_mini).
Sample yaml reproduces the issue quite often, but not every time. It seems to be more consistent when there are some sensors attached.
Which version of ESPHome has the issue?
2021.12.1
What type of installation are you using?
pip
Which version of Home Assistant has the issue?
No response
What platform are you using?
ESP32
Board
lolin32_lite, nodemcu-32s, mhetesp32minikit
Component causing the issue
http_request
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
No response