Closed ygelt closed 11 months ago
I've seen cases of corrupted response buffer. I can't remember if it was exclusively on ESP esp32 but, yes it might be the case. It has been in my to do list for months I need to put it back on top of the list
May not be related, by recently another case of broken WebQuery
responses was mentioned. IDK if the investigation got further.
https://discord.com/channels/479389167382691863/790182324939128832/1175492686833266739
I get some funny corruptions where part of the previous log message got it's way in the WebQuery response
18:04:00.948 CMD: backlog rule1 1 ; rule2 1
18:04:00.982 MQT: stat/esp32dev3/RULE = {"Rule1":{"State":"ON","Once":"OFF","StopOnError":"OFF","Length":66,"Free":445,"Rules":"on tele-Time do webquery http://192.168.168.244/test.php GET endon"}}
18:04:01.210 MQT: stat/esp32dev3/RULE = {"Rule2":{"State":"ON","Once":"OFF","StopOnError":"OFF","Length":38,"Free":473,"Rules":"on WebQuery#Time do var1 %value% endon"}}
18:04:03.964 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:03+01:00","Uptime":"0T00:01:35","UptimeSec":95,"Heap":169,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-45,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:03.996 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:04.031 {"Time":"2023-11-26T18:04:04+01:00"}
18:04:04.032 AssumeJSON=1
18:04:04.039 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=1}
18:04:04.053 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:04:13.927 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:13+01:00","Uptime":"0T00:01:45","UptimeSec":105,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-50,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:13.960 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:14.002 {"Time":"2023-11-26T18:04:14+01:00"}
18:04:14.004 AssumeJSON=1
18:04:14.010 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=1}
18:04:14.021 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:04:23.946 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:23+01:00","Uptime":"0T00:01:55","UptimeSec":115,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-50,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:23.983 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:24.029 {"Time":"2023-11-26T18:04:24+01:00"}
18:04:24.031 AssumeJSON=1
18:04:24.038 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=1}
18:04:24.049 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:04:33.925 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:33+01:00","Uptime":"0T00:02:05","UptimeSec":125,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":98,"Signal":-51,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:33.958 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:34.022 {"Time":"2023-11-26T18:04:34+01:00"}
18:04:34.024 AssumeJSON=1
18:04:34.033 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{"Time":"2023-11-26T18:04:34+01:00"}}
18:04:34.049 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:04:43.921 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:43+01:00","Uptime":"0T00:02:15","UptimeSec":135,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":96,"Signal":-52,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:43.954 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:43.993 {"Time":"2023-11-26T18:04:44+01:00"}
18:04:43.994 AssumeJSON=1
18:04:44.001 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=1}
18:04:44.014 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:04:53.937 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:04:53+01:00","Uptime":"0T00:02:25","UptimeSec":145,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":92,"Signal":-54,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:04:53.970 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:04:54.007 {"Time":"2023-11-26T18:04:54+01:00"}
18:04:54.008 AssumeJSON=1
18:04:54.016 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=1}
18:04:54.032 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:03.954 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:03+01:00","Uptime":"0T00:02:35","UptimeSec":155,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":96,"Signal":-52,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:03.988 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:04.229 {"Time":"2023-11-26T18:05:04+01:00"}
18:05:04.231 AssumeJSON=1
18:05:04.238 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{}
18:05:04.251 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:13.923 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:13+01:00","Uptime":"0T00:02:45","UptimeSec":165,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":96,"Signal":-52,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:13.957 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:14.035 {"Time":"2023-11-26T18:05:14+01:00"}
18:05:14.036 AssumeJSON=1
18:05:14.045 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{"Time":"2023-11-26T18:05:14+01:00"}}
18:05:14.063 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:23.934 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:23+01:00","Uptime":"0T00:02:55","UptimeSec":175,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-47,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:23.967 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:24.203 {"Time":"2023-11-26T18:05:24+01:00"}
18:05:24.204 AssumeJSON=1
18:05:24.212 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=xV���?�d}
18:05:24.226 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:33.951 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:33+01:00","Uptime":"0T00:03:05","UptimeSec":185,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-48,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:33.984 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:34.018 {"Time":"2023-11-26T18:05:34+01:00"}
18:05:34.019 AssumeJSON=1
18:05:34.027 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=xV���?,d}
18:05:34.041 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:43.965 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:43+01:00","Uptime":"0T00:03:15","UptimeSec":195,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-49,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:44.000 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:44.043 {"Time":"2023-11-26T18:05:44+01:00"}
18:05:44.045 AssumeJSON=1
18:05:44.053 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=xV���?`d}
18:05:44.066 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
18:05:53.940 MQT: tele/esp32dev3/STATE = {"Time":"2023-11-26T18:05:53+01:00","Uptime":"0T00:03:25","UptimeSec":205,"Heap":170,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":1,"Wifi":{"AP":1,"SSId":"BNETWLAN","BSSId":"88:C3:97:F2:25:EC","Channel":1,"Mode":"11n","RSSI":100,"Signal":-49,"LinkCount":1,"Downtime":"0T00:00:10"}}
18:05:53.976 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
18:05:54.009 {"Time":"2023-11-26T18:05:54+01:00"}
18:05:54.010 AssumeJSON=1
18:05:54.017 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{ssumeJSON=xV���?�d}
18:05:54.031 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
@arendst I believe we've found a memory coruption problem around Response_P()
and AddLog()
functions in WebQuery Reponse. Probably pointing to ext_vsnprintf_P(...)
.
The problem occurs only on ESP32, not on ESP8266.
I've seen such behavior for quite some time, months is not years but didn't had the time to investigate further before.
I have added some serial log here: https://github.com/barbudor/Tasmota/blob/5d2a8d1c792305e18e6b7e74be3a40ac284396f3/tasmota/tasmota_xdrv_driver/xdrv_01_9_webserver.ino#L3357-L3383
When it works fine and there is no corruption, it looks like:
19:03:52.946 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
a={"Time":"2023-11-26T19:03:53+01:00"}
b={"Time":"2023-11-26T19:03:53+01:00"}
c={"Time":"2023-11-26T19:03:53+01:00"}
y={"Time":"2023-11-26T19:03:53+01:00"}
z={"Time":"2023-11-26T19:03:53+01:00"}
19:03:52.000 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{"Time":"2023-11-26T19:03:53+01:00"}}
19:03:53.015 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
But randomly (often) it shows things like:
18:58:42.951 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
a={"Time":"2023-11-26T18:58:43+01:00"}
b={"Time":"2023-11-26T18:58:43+01:00"}
c={"WebQuery":
y={"WebQuery":
z={"WebQuery":
18:58:42.993 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{"WebQuery":}
18:58:43.017 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
or like:
19:02:02.983 RUL: TELE-TIME performs "webquery http://192.168.168.244/test.php GET"
a={"Time":"2023-11-26T19:02:03+01:00"}
b={"Time":"2023-11-26T19:02:03+01:00"}
c={"Time":"2023-11-26T19:02:03+01:00"}
y=
z=
19:02:03.022 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":{"Time":"2<}
19:02:03.041 MQT: stat/esp32dev3/WEBQUERY = {"WebQuery":"Done"}
It's not related to the length of the HTTP response which here is quite short (less than 40 bytes).
IMHO, this makes obvious that the buffer holding the WebQuery response (read0
in the code linked) gets corrupted when calling Response_P(...)
. The same problems also affect AddLog(...)
because before using Serial.print
I was using AddLog and the log message was written in the WebQuery MQTT response.
What happens if you copy the result from http.getstring() into a new variable? It might well be the http result is freed by garbage collection before the rest of the code is finished....
It looks better. No corruption so far I need to find a way to get a proper test running for the night
PR sent
PROBLEM DESCRIPTION
I use the same script on 4 tasmota devices : 3 with 8266, 1 with ESP32 which are all querying the same web service every 3 seconds using WebQuery GET. This works perfectly on the 8266 devices, I mean they receives the web service answer on each call; while the ESP32 does not receive the data most of the time, quite erratically. My web service is correctly called by the device (correct headers and parameter).
It "receives the answer" means it logs like this : 23:49:00.760 RSL: RESULT = {"WebQuery":{"cmd":0, "con":0, "alv":172297}} 23:49:00.771 RSL: RESULT = {"WebQuery":"Done"}
It "does not receive the answer" means it logs like this : 23:46:34.111 RSL: RESULT = {"WebQuery":{"WebQuery":} 23:46:34.120 RSL: RESULT = {"WebQuery":"Done"}
The calling WebQuery is: =>WebQuery http://www.xxxx.xxxx/connect/v1/device?al=%0nWS% GET [id:%topic%|ip:%lip%] (The issue is the same in http and https)
Resulting in log: 00:09:07.260 CMD: Grp 0, Cmd 'WEBQUERY', Idx 1, Len 107, Pld -99, Data 'http://www.xxxx.xxxx/connect/v1/device?al=29236 GET [id:01234567890123456789012345678901|ip:192.168.0.38]'
REQUESTED INFORMATION
Make sure your have performed every step and checked the applicable boxes before submitting your issue. Thank you!
Backlog Template; Module; GPIO 255
:Status 0
:TO REPRODUCE
Steps to reproduce the behavior: I can send my script in mp for a real-time test
EXPECTED BEHAVIOUR
A clear and concise description of what you expected to happen. I expect this answer as on 8266 devices: 23:49:00.760 RSL: RESULT = {"WebQuery":{"cmd":0, "con":0, "alv":172297, "cf1":2, "cf2":3, "cf3":4, "cf4":4, "cf5":20}}
SCREENSHOTS
If applicable, add screenshots to help explain your problem.
ADDITIONAL CONTEXT
Add any other context about the problem here. I compiled Tasmota 13.0.0 using TasmoCompiler including Script and TLS MQTT support but not Berry. Here are my user_config_override.h options:
ifndef ESP32
define ESP32
endif
ifdef USE_BERRY
undef USE_BERRY
endif
ifdef SCRIPT_MAXSSIZE
undef SCRIPT_MAXSSIZE
endif
define SCRIPT_MAXSSIZE 64
ifndef USE_SCRIPT
define USE_SCRIPT
endif
ifndef USE_WEBSEND_RESPONSE
define USE_WEBSEND_RESPONSE
endif
ifdef USE_RULES
undef USE_RULES
endif
define PRECONFIGURED_SCRIPT ">D 128\n;--- start setup ---\n;......"
ifndef START_SCRIPT_FROM_BOOT
define START_SCRIPT_FROM_BOOT
endif
(Please, remember to close the issue when the problem has been addressed)