Closed puterboy closed 2 months ago
The version that works (dev branch from March 2024) interestingly uses a newer version of the rtl_433_ESP library (rtl_433_ESP.git#v0.3.2) than v175 does (rtl_433_ESP.git#v0.3.1).
The only seemingly significant differences (beyond spelling corrections in comments) are as follows:
--- OpenMQTTGateway-old/.pio/libdeps/lilygo-rtl_433-jjk/rtl_433_ESP/src/rtl_433/r_api.c 2024-03-13 23:51:45.000000000 -0400
+++ OpenMQTTGateway-release/.pio/libdeps/lilygo-rtl_433-jjk-ota/rtl_433_ESP/src/rtl_433/r_api.c 2024-08-18 18:35:00.104503500 -0400
@@ -798,12 +798,10 @@
else if ((d->type == DATA_DOUBLE) &&
(str_endswith(d->key, "_in") || str_endswith(d->key, "_inch"))) {
d->value.v_dbl = inch2mm(d->value.v_dbl);
- // need to free ptr returned from str_replace
- char* new_label1 = str_replace(d->key, "_inch", "_in");
- char* new_label2 = str_replace(new_label1, "_in", "_mm");
- free(new_label1);
+ char* new_label =
+ str_replace(str_replace(d->key, "_inch", "_in"), "_in", "_mm");
free(d->key);
- d->key = new_label2;
+ d->key = new_label;
char* new_format_label = str_replace(d->format, "in", "mm");
free(d->format);
d->format = new_format_label;
And
--- OpenMQTTGateway-old/.pio/libdeps/lilygo-rtl_433-jjk/rtl_433_ESP/src/rtl_433_ESP.cpp 2024-03-13 23:51:45.000000000 -0400
+++ OpenMQTTGateway-release/.pio/libdeps/lilygo-rtl_433-jjk-ota/rtl_433_ESP/src/rtl_433_ESP.cpp 2024-08-18 18:35:00.104503500 -0400
@@ -32,12 +32,8 @@
#if defined(RF_MODULE_SCK) && defined(RF_MODULE_MISO) && \
defined(RF_MODULE_MOSI) && defined(RF_MODULE_CS)
# include <SPI.h>
-# if CONFIG_IDF_TARGET_ESP32C3 || CONFIG_IDF_TARGET_ESP32S3
-SPIClass newSPI(FSPI);
-# else
SPIClass newSPI(VSPI);
# endif
-#endif
#ifdef RF_SX1276
SX1276 radio = RADIO_LIB_MODULE;
@@ -59,14 +55,14 @@
/*----------------------------- rtl_433_ESP Internals -----------------------------*/
-#define rtl_433_ReceiverTask_Stack 2048
+#define rtl_433_ReceiverTask_Stack 2000
#define rtl_433_ReceiverTask_Priority 2
#define rtl_433_ReceiverTask_Core 0
-/*----------------------------- Initialize variables -----------------------------*/
+/*----------------------------- Initalize variables -----------------------------*/
/**
- * Is the receiver currently receiving a signal
+ * Is the receiver currently receving a signal
*/
static bool receiveMode = false;
Could the change in ReceiverTask_Stack be causing the problem, perhaps due to overflow?
After all, I do have a couple of dozen sensors...
It could be interesting to increase your receiver task Stack and see if you eliminate the message corruption.
I can try increasing the ReceiverTask_Stack, but as mentioned above, it seems that the change in ZgatewayRTL_433.ino that fixes https://github.com/1technophile/OpenMQTTGateway/issues/2012 is what causes these sporadic spikes to occur.
Specifically, if you are on the development branch with
//RFrtl_433_ESPdata["origin"] = (char*)topic.c_str();
//handleJsonEnqueue(RFrtl_433_ESPdata);
pub(topic.c_str(), RFrtl_433_ESPdata);
Then I don't get sporadic corruption of the actual MQTT message data but instead get occasional corruption of the Discovery config data.
Conversely, if I use the version from the v175 branch
RFrtl_433_ESPdata["origin"] = (char*)topic.c_str();
handleJsonEnqueue(RFrtl_433_ESPdata);
Then the Discovery config lines are not corrupted but I get sporadic corruption of the MQTT data messages.
//RFrtl_433_ESPdata["origin"] = (char*)topic.c_str();
//handleJsonEnqueue(RFrtl_433_ESPdata);
pub(topic.c_str(), RFrtl_433_ESPdata);
Interesting, I introduced this change and fixed data corruption for numerous people using RTL_433 https://github.com/1technophile/OpenMQTTGateway/issues/1836
It does indeed fix data corruption but somehow corrupts discovery topics... Not sure why yet. I'm thinking the problem may be more fundamental...
On August 25, 2024 4:30:03 PM EDT, Florian @.***> wrote:
//RFrtl_433_ESPdata["origin"] = (char*)topic.c_str();
//handleJsonEnqueue(RFrtl_433_ESPdata);
pub(topic.c_str(), RFrtl_433_ESPdata);
Interesting, I introduced this change and fixed data corruption for numerous people using RTL_433 https://github.com/1technophile/OpenMQTTGateway/issues/1836
I subscribed to the topics and the nonsensical temperature or battery spikes correspond to corrupted JSON data. For example:
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/LaCrosse-TX141THBv2/0/216 {"model":"Nexus-TH","battery_ok":216,"tery_ok":0,"temperature_C":1,"_C":2.8,"otocol":10,"xus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","eTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":" NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","dity sensor\",\"rssi\":-64,\"duration\":997973}":-64,"sensor\",\"rssi\":-64,\"duration\":997973}":997973}
The data should look something like:
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/LaCrosse-TX141THBv2/0/216 {"model":"LaCrosse-TX141THBv2","id":216,"channel":0,"battery_ok":1,"temperature_C":3.2,"humidity":10,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-65,"duration":141000}
My guess, again, is that this is due to improper string allocation leading to improper string termination...
Again this corruption occurs when you use the code from v175
RFrtl_433_ESPdata["origin"] = (char*)topic.c_str();
handleJsonEnqueue(RFrtl_433_ESPdata);
Could be a concurrency issue in the queue: discovery topics are quite heavy in terms of number and length, and having them mixed in the queue with the regular messages may be the issue.
I actually think that is the root cause of both the corruption of discovery topics and the sporadic corruption of the MQTT topic messages -- they both seem to occur when the queue is "overloaded". Perhaps not enough memory is being allocated for the queue or for its individual elements. That would explain a lot...
That might also explain the crashes that you fixed by moving publishing of data out of the queue https://github.com/1technophile/OpenMQTTGateway/issues/1836.
I think the queue is a good thing, and we should fix that rather than trying to work around it.
BTW, happy to jump on a call or chat to work on this together as I am coming up to the limits of my abilities here...
I'm stumped here as I can't seem to find a reason why the queue would corrupt or overflow.
So maybe it is a concurrency issue since this seems to be multi-threaded (though I literally know nothing about how to program multiple threads). Could it be that there are multiple simultaneous calls to mqtt->publish that collide? At least for discovery topics this doesn't seem to be protected with semaphores (if I am understanding this correctly)
Any thoughts on how to troubleshoot this?
Indeed it is a concurrency issue. I was able to get rid of discovery config data corruption by wrapping mqtt->publish with a Mutex semaphore within the low level pubMQTT routine.
This may also solve the MQTT topic corruption issues and allow you to revert the change referenced above.
I will test and publish a PR.
(BTW, per the other bug report https://github.com/1technophile/OpenMQTTGateway/issues/2023, I think it's still separately important to back off on the OLED display delay as the queue fills)
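For readers following along, the mutex wrapper described above can be sketched roughly as follows. This is not the code from the PR: `FakeMqttClient` and `pubMQTTGuarded` are hypothetical names, and `std::mutex` stands in for the FreeRTOS mutex semaphore (`xSemaphoreCreateMutex`, `xSemaphoreTake`, `xSemaphoreGive`) that would be used on the ESP32, so that the sketch compiles on a desktop.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the MQTT client. The assumption is that the real
// client is not safe to call from two tasks at once.
struct FakeMqttClient {
    std::vector<std::string> sent;
    void publish(const std::string& topic, const std::string& payload) {
        sent.push_back(topic + " " + payload);
    }
};

static FakeMqttClient mqtt;
static std::mutex mqttMutex;  // plays the role of the FreeRTOS mutex semaphore

// Low-level publish routine (hypothetical name). Every caller, whether it is
// publishing discovery config or sensor data, goes through the same lock, so
// two publishes can never interleave inside the client.
void pubMQTTGuarded(const std::string& topic, const std::string& payload) {
    std::lock_guard<std::mutex> lock(mqttMutex);  // released automatically on return
    mqtt.publish(topic, payload);
}
```

Putting the lock in the lowest-level routine means no caller has to remember to take it, which fits the observation that the discovery path was not protected.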
Unfortunately, I still get MQTT topic corruption if I use the JsonEnqueue method -- not sure why. So I will submit the PR of just the Mutex wrapper, keeping your revision intact.
Technically, that for me at least seems to get the development branch working fine, though I would still like to understand why the JsonEnqueue method leads to corruption.
I was hoping that the 2 PRs I created, one to add protection to mqtt->publish/mqtt->loop (https://github.com/1technophile/OpenMQTTGateway/pull/2024) and one to emptyQueue (https://github.com/1technophile/OpenMQTTGateway/pull/2025), would solve the corruption problem when jsonQueue is used for MQTT messages. However, I still get sporadic corruption in the posted MQTT messages. These look like concurrency issues too, but I can't figure out where the problem may be.
Here are 2 successive examples of corruption, for an Ambientweather and a Rubicson temp/humidity sensor, sandwiched between 2 valid messages from a Nexus and a LaCrosse sensor respectively, which I captured with mosquitto_sub:
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/Nexus-TH/1/180 {"model":"Nexus-TH","id":180,"channel":1,"battery_ok":1,"temperature_C":21.6,"humidity":70,"protocol":"Nexus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","rssi":-59,"duration":729997}
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/Ambientweather-F007TH/3/179 {"model":"Solight-TE44","l":179,"emperature_C":3,"re_C":1,"C":19.22222,"ight TE44/TE66, EMOS E0107T, NX-6876-917":74,"/TE66, EMOS E0107T, NX-6876-917":"6, EMOS E0107T, NX-6876-917","MOS E0107T, NX-6876-917":"T, NX-6876-917","sor\",\"rssi\":-92,\"duration\":1149996}":-92,"\"rssi\":-92,\"duration\":1149996}":1149996}
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/Rubicson-Temperature/1/222 {"model":"Solight-TE44","el":222,"temperature_C":1,"ure_C":1,"RC":18.3,"light TE44/TE66, EMOS E0107T, NX-6876-917":"t TE44/TE66, EMOS E0107T, NX-6876-917","44/TE66, EMOS E0107T, NX-6876-917":"EMOS E0107T, NX-6876-917",":-92,\"duration\":1149996}":-92,"\"duration\":1149996}":1149996}
home/OMG_lilygo_rtl_433_ESP/RTL_433toMQTT/LaCrosse-TX141THBv2/1/26 {"model":"LaCrosse-TX141THBv2","id":26,"channel":1,"battery_ok":1,"temperature_C":-21.4,"humidity":10,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-59,"duration":136996}
Interestingly, all the Rubicson messages are corrupted but only very occasional ones from other sensors like the Ambientweather one shown here or from the Nexus or LaCrosse sensors that are not corrupted here.
Note: the corruptions occur when I use handleJsonEnqueue to post MQTT messages (rather than the updated version that posts directly using 'pub'). I am doing this because I want to fix the underlying corruption problem that seems to exist in the code for the jsonQueue stack.
I should note that the max queue length achieved is 4 and there are no blocked messages.
Try increasing the task stack associated with RTL_433, rtl_433_Decoder_Stack; if I recall correctly, this helped during my testing.
We also added a mutex but came to the same conclusion as you.
Maybe I am missing something, but I don't see how increasing rtl_433_Decoder_Stack will help.
The problem seems to be limited to how the JSON string is published:
- With pub directly, as per your latest update, it never fails
- With handleJsonEnqueue, it fails sporadically as above

So it would seem that the decoding is just fine -- it's an MQTT publishing issue.
Even if increasing rtl_433_Decoder_Stack somehow helps, wouldn't it simply be papering over the underlying problem with jsonQueue?
It's worth a try considering your message-heavy setup. If it helps, at least it gives a direction towards memory allocation versus concurrency.
I added some more logging to show exactly what is being enqueued and dequeued. The following log shows that the right data is entering the queue but (sometimes) it gets corrupted when there are 2 enqueues in a row.
N: type: null
N: Enqueue JSON: {"model":"LaCrosse-TX141THBv2","id":26,"channel":1,"battery_ok":1,"temperature_C":-20.8,"humidity":10,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-57,"duration":138996,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/1/26"}
N: Dequeue JSON: {"model":"LaCrosse-TX141THBv2","id":26,"channel":1,"battery_ok":1,"temperature_C":-20.8,"humidity":10,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-57,"duration":138996,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/1/26"}
N: Send on /RTL_433toMQTT/LaCrosse-TX141THBv2/1/26 msg {"model":"LaCrosse-TX141THBv2","id":26,"channel":1,"battery_ok":1,"temperature_C":-20.8,"humidity":10,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-57,"duration":138996}
N: type: null
N: type: null
N: Enqueue JSON: {"model":"Ambientweather-F007TH","id":178,"channel":2,"battery_ok":1,"temperature_C":22.94445,"humidity":74,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-57,"duration":188996,"origin":"/RTL_433toMQTT/Ambientweather-F007TH/2/178"}
N: Dequeue JSON: {"model":"Ambientweather-F007TH","id":178,"channel":2,"battery_ok":1,"temperature_C":22.94445,"humidity":74,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-57,"duration":188996,"origin":"/RTL_433toMQTT/Ambientweather-F007TH/2/178"}
N: Send on /RTL_433toMQTT/Ambientweather-F007TH/2/178 msg {"model":"Ambientweather-F007TH","id":178,"channel":2,"battery_ok":1,"temperature_C":22.94445,"humidity":74,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-57,"duration":188996}
N: type: null
N: Enqueue JSON: {"model":"LaCrosse-TX141THBv2","id":83,"channel":0,"battery_ok":1,"temperature_C":4.2,"humidity":71,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-70,"duration":906996,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/0/83"}
N: type: null
N: Enqueue JSON: {"model":"Nexus-TH","id":13,"channel":2,"battery_ok":1,"temperature_C":24.1,"humidity":67,"protocol":"Nexus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","rssi":-70,"duration":906996,"origin":"/RTL_433toMQTT/Nexus-TH/2/13"}
N: Dequeue JSON: {"model":"Nexus-TH","battery_ok":83,"tery_ok":0,"temperature_C":1,"_C":4.2,"otocol":71,"xus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","eTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":" NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","ity sensor\",\"rssi\":-70,\"duration\":906996}":-70,"ensor\",\"rssi\":-70,\"duration\":906996}":906996,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/0/83"}
N: Send on /RTL_433toMQTT/LaCrosse-TX141THBv2/0/83 msg {"model":"Nexus-TH","battery_ok":83,"tery_ok":0,"temperature_C":1,"_C":4.2,"otocol":71,"xus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","eTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":" NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor":"3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","ity sensor\",\"rssi\":-70,\"duration\":906996}":-70,"ensor\",\"rssi\":-70,\"duration\":906996}":906996}
N: type: null
N: Dequeue JSON: {"model":"LaCrosse-TX141THBv2","TX141THBv2":13,"41THBv2":2,"id":1,"battery_ok":24.1,"perature_C":67,"C":"y","h, (TFA, ORIA) sensor":-70,"FA, ORIA) sensor":906996,"origin":"/RTL_433toMQTT/Nexus-TH/2/13"}
N: Send on /RTL_433toMQTT/Nexus-TH/2/13 msg {"model":"LaCrosse-TX141THBv2","TX141THBv2":13,"41THBv2":2,"id":1,"battery_ok":24.1,"perature_C":67,"C":"y","h, (TFA, ORIA) sensor":-70,"FA, ORIA) sensor":906996}
N: type: null
N: Enqueue JSON: {"model":"Ambientweather-F007TH","id":13,"channel":1,"battery_ok":1,"temperature_C":25.5,"humidity":69,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-60,"duration":188996,"origin":"/RTL_433toMQTT/Ambientweather-F007TH/1/13"}
N: Dequeue JSON: {"model":"Ambientweather-F007TH","id":13,"channel":1,"battery_ok":1,"temperature_C":25.5,"humidity":69,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-60,"duration":188996,"origin":"/RTL_433toMQTT/Ambientweather-F007TH/1/13"}
N: Send on /RTL_433toMQTT/Ambientweather-F007TH/1/13 msg {"model":"Ambientweather-F007TH","id":13,"channel":1,"battery_ok":1,"temperature_C":25.5,"humidity":69,"mic":"CRC","protocol":"Ambient Weather F007TH, TFA 30.3208.02, SwitchDocLabs F016TH temperature sensor","rssi":-60,"duration":188996}
The first two stanzas show a single enqueue followed by a single dequeue -- here the queue length is just 1. Then two enqueues come along followed by 2 dequeues -- both dequeues seem to be scrambled versions of elements of both enqueues.
Then comes a single enqueue followed by a normal dequeue.
Does this help?
Now here is an example where it goes wrong with just one entry in the queue (the middle stanza is the corrupted one):
N: Enqueue JSON: {"model":"Nexus-TH","id":180,"channel":1,"battery_ok":1,"temperature_C":22.9,"humidity":70,"protocol":"Nexus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","rssi":-61,"duration":739996,"origin":"/RTL_433toMQTT/Nexus-TH/1/180"}
N: Dequeue JSON: {"model":"Nexus-TH","id":180,"channel":1,"battery_ok":1,"temperature_C":22.9,"humidity":70,"protocol":"Nexus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","rssi":-61,"duration":739996,"origin":"/RTL_433toMQTT/Nexus-TH/1/180"}
N: Send on /RTL_433toMQTT/Nexus-TH/1/180 msg {"model":"Nexus-TH","id":180,"channel":1,"battery_ok":1,"temperature_C":22.9,"humidity":70,"protocol":"Nexus, FreeTec NC-7345, NX-3980, Solight TE82S, TFA 30.3209 temperature/humidity sensor","rssi":-61,"duration":739996}
N: type: null
N: Enqueue JSON: {"model":"Rubicson-Temperature","id":222,"channel":1,"battery_ok":1,"temperature_C":19.7,"mic":"CRC","protocol":"Rubicson, TFA 30.3197 or InFactory PT-310 Temperature Sensor","rssi":-89,"duration":898997,"origin":"/RTL_433toMQTT/Rubicson-Temperature/1/222"}
N: type: null
N: Dequeue JSON: {"model":"Solight-TE44","el":222,"temperature_C":1,"ure_C":1,"RC":19.7,"light TE44/TE66, EMOS E0107T, NX-6876-917":"t TE44/TE66, EMOS E0107T, NX-6876-917","44/TE66, EMOS E0107T, NX-6876-917":"EMOS E0107T, NX-6876-917",":-89,\"duration\":898997}":-89,"\"duration\":898997}":898997,"origin":"/RTL_433toMQTT/Rubicson-Temperature/1/222"}
N: Send on /RTL_433toMQTT/Rubicson-Temperature/1/222 msg {"model":"Solight-TE44","el":222,"temperature_C":1,"ure_C":1,"RC":19.7,"light TE44/TE66, EMOS E0107T, NX-6876-917":"t TE44/TE66, EMOS E0107T, NX-6876-917","44/TE66, EMOS E0107T, NX-6876-917":"EMOS E0107T, NX-6876-917",":-89,\"duration\":898997}":-89,"\"duration\":898997}":898997}
N: type: null
N: Enqueue JSON: {"model":"LaCrosse-TX141THBv2","id":122,"channel":1,"battery_ok":1,"temperature_C":-19.3,"humidity":78,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-64,"duration":143997,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/1/122"}
N: Dequeue JSON: {"model":"LaCrosse-TX141THBv2","id":122,"channel":1,"battery_ok":1,"temperature_C":-19.3,"humidity":78,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-64,"duration":143997,"origin":"/RTL_433toMQTT/LaCrosse-TX141THBv2/1/122"}
N: Send on /RTL_433toMQTT/LaCrosse-TX141THBv2/1/122 msg {"model":"LaCrosse-TX141THBv2","id":122,"channel":1,"battery_ok":1,"temperature_C":-19.3,"humidity":78,"test":"No","mic":"CRC","protocol":"LaCrosse TX141-Bv2, TX141TH-Bv2, TX141-Bv3, TX141W, TX145wsdth, (TFA, ORIA) sensor","rssi":-64,"duration":143997}
Note that the dequeue is logged before I have given back the xQueueMutex that I added to emptyQueue -- so both adding to and popping from the jsonQueue should be safe.
Actually, I think the problem is with the push, since the following only pushes a shallow copy of jsonDoc onto the stack.
JsonBundle bundle;
bundle.doc = jsonDoc;
jsonQueue.push(bundle);
May need to push and pop serialized json docs onto the jsonQueue stack.
i.e., serialize -> push -> pop -> deserialize
If this is right then this bug really needs to be fixed, since any json doc pushed onto the queue could be corrupted. I can test this tomorrow.
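The shallow-copy hypothesis above can be illustrated with a small desktop sketch. It does not use ArduinoJson or the gateway's JsonBundle: `ShallowDoc`, `rxBuffer`, and both function names are hypothetical, and the exact mechanism inside the real queue is an assumption rather than something confirmed in this thread. The first function queues only a pointer to a receive buffer that is then reused; the second queues the serialized text itself. The final deserialize step (in ArduinoJson, a call to deserializeJson on the popped string) is left out here.

```cpp
#include <cassert>
#include <cstdio>
#include <queue>
#include <string>

// Hypothetical stand-in for a document that points at text it does not own.
struct ShallowDoc {
    const char* text;
};

static const char* kFirst = "{\"model\":\"LaCrosse-TX141THBv2\",\"id\":83}";
static const char* kSecond = "{\"model\":\"Nexus-TH\",\"id\":13}";

// Enqueue a pointer-only copy, let the next message reuse the buffer, dequeue.
std::string roundTripShallow() {
    char rxBuffer[64];
    std::snprintf(rxBuffer, sizeof rxBuffer, "%s", kFirst);
    std::queue<ShallowDoc> q;
    q.push(ShallowDoc{rxBuffer});                             // copies the pointer, not the bytes
    std::snprintf(rxBuffer, sizeof rxBuffer, "%s", kSecond);  // next message overwrites the buffer
    assert(!q.empty());
    std::string out = q.front().text;                         // reads whatever the buffer holds now
    q.pop();
    return out;
}

// serialize -> push -> pop: the queue element owns its own bytes.
std::string roundTripSerialized() {
    char rxBuffer[64];
    std::snprintf(rxBuffer, sizeof rxBuffer, "%s", kFirst);
    std::queue<std::string> q;
    q.push(std::string(rxBuffer));                            // deep copy of the serialized text
    std::snprintf(rxBuffer, sizeof rxBuffer, "%s", kSecond);  // buffer reuse no longer matters
    assert(!q.empty());
    std::string out = q.front();
    q.pop();
    return out;
}
```

In this sketch the shallow version hands back the second message even though the first was enqueued, with a queue length of one, while the serialized version returns what was enqueued. The cost of the serialized approach is an extra copy and a re-parse per message.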
Thanks for the detailed analysis
serialize -> push -> pop -> deserialize
Could be tested for sure.
Tested, and it solves the problem (at least for me). As mentioned in the PR, this really is a critical bug that should be patched ASAP since it can theoretically corrupt any data that is added to the queue -- even if the queue length is one -- so long as some other data object is allocated memory that overlaps with the memory of the queue object.
Fixed as per above PR
I have a bunch of cheap LaCrosse, AmbientWeather, and Nexus temperature & humidity sensors that I read using a LILYGO LORA 433 esp32 running OpenMQTT Gateway compiled under the lilygo-rtl_433 environment.
Everything was working properly until I upgraded to v175. I started noticing that several of the sensors would sporadically (without any apparent pattern) transmit erroneous MQTT sensor data.
For example, my 4 older LaCrosse sensors randomly broadcast a temperature of 2 degC (35.6 degF) regardless of the actual room temperature (note the erroneous value is always exactly 2 degC). The newer ones are a different model and seem to be OK.
Similarly, the battery level would go from a normal 100.0 (corresponding to 100%) to a nonsensical value that seems different for each sensor (unlike the temperature error, which was always the same value). Examples include: 21385.0, 20989.0, 12079.0, 21682.0, 8218.0, 2575.0
In both cases, the erroneous spikes last for only a single reading before returning to normal and then potentially spiking again in an hour or two.
I can't see any pattern in the timing of the spikes or even which sensors are spiking (some of them seem to not be spiking or it could just be that I haven't observed them long enough since the update)
Reverting to the prior version (development branch with last commit on 3/5/23) removes this issue -- so it very much seems to be a problem with the latest code and not with my setup or sensors.
To Reproduce Steps to reproduce the behavior:
Environment (please complete the following information): As above