Closed MichaelDvP closed 10 months ago
This needs further investigation. The memory check is very crude. Of course, there is a slight difference between the memory check and the actual memory afterwards but I definitely not expect it to be 3k.
There might be a bug indeed. A queue item is created here.
Node* node = new (std::nothrow) Node(std::forward<Args>(args) ...);
if (node != nullptr) {
The Node
constructor is here:
explicit Node(Args&&... args)
: data(std::forward<Args>(args) ...)
, next(nullptr) {
// empty
}
Here, the node in the queue can exist but it's data can be non existent. The error is set but the node is not cleared. Later, when trying to send the data, it fails.
Here might be the problem:
if (it && error == espMqttClientTypes::Error::SUCCESS) return true;
return false;
ON failure it just returns false instead of cleaning up the node at iterator.
Regarding queue management. It's undesired to just delete the oldest item. We have to at least check for QoS. I'm happy to delete qos0 messages. Higher QoS means giving up compliancy.
It can be configurable of course: delete oldest regardless delete oldest, taking qos into account don't delete = don't allow new
In any case, it will probably add a lot of complexity.
Of course, there is a slight difference between the memory check and the actual memory afterwards but I definitely not expect it to be 3k.
The memory check works, the "low memory" message from allocate
comes at right level. But the publish still returns ID (qos>0) or 1 (qos0) and heap is decreasing. 7k below limit the "low memory" messages is missing and publish return zero and heap stays constant.
I have not checked if the queue is send out after reconnect, my tests was always with broker down all the time. Because the outbox seems to stay emty, i'll check if the queued messages are sent on reconnect.
Regarding queue management...
Agree, qos is critical when deleting messages. My suggestion: don't accept the new messages, make sure message-id returns zero, so we know the message is not queued. But add a function to delete front/back of queue the the caller can retry after deleting one.
Got it! We'll first do the bug hunting, then discuss features. I'm definitely open for your suggestion.
The reference to error isn't properly updated when allocation fails.
The devil is in the details. Major bug found!
https://github.com/bertmelis/espMqttClient/pull/109/commits/24b6b49a3afa522775cc9869eba808ca4ac54d0b
Some log output:
[ 12326][I][Packet.cpp:316] _allocate(): Alloc (l:78)
messageNo: 599, publish: ok, heap: 158412, maxAlloc: 98292
[ 12342][W][Packet.cpp:306] _allocate(): Packet buffer not allocated: low memory
[ 12349][E][MqttClient.cpp:157] publish(): Could not create PUBLISH packet
messageNo: 600, publish: fail, heap: 158412, maxAlloc: 98292
[ 12367][W][Packet.cpp:306] _allocate(): Packet buffer not allocated: low memory
[ 12373][E][MqttClient.cpp:157] publish(): Could not create PUBLISH packet
messageNo: 601, publish: fail, heap: 158412, maxAlloc: 98292
It now correctly detects the failed allocation. (I didn't change anything on the psram quirks.)
Wow this was fast. Looks much better, but there is still something wrong. The outbox queue is now correctly filled and i added to show it:
Serial.printf("messageNo: %u, publish: %s, queue: %d, heap: %u, maxAlloc: %u\n", messageNo, id > 0 ? "ok" : "fail", mqttClient->getQueue(), ESP.getFreeHeap(), ESP.getMaxAllocHeap());
This gives when reaching the limit correctly the fail (message-id == 0):
[ 14852][I][Packet.cpp:319] _allocate(): Alloc (l:79)
messageNo: 1248, publish: ok, queue: 1242, heap: 61580, maxAlloc: 53236
[ 14863][I][Packet.cpp:319] _allocate(): Alloc (l:79)
messageNo: 1249, publish: ok, queue: 1243, heap: 61420, maxAlloc: 53236
[ 14874][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 14874][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 1250, publish: fail, queue: 1244, heap: 61356, maxAlloc: 53236
[ 14890][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 14890][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
But still the heap goes down and messages are put to queue. Heap usage of a valid message is 160 bytes, heap usage when allocate fail is 64 bytes. And after some time still strange things happens:
messageNo: 1351, publish: fail, queue: 1346, heap: 54828, maxAlloc: 53236
[ 16942][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 16942][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 1352, publish: fail, queue: 1347, heap: 54764, maxAlloc: 53236
Connecting to MQTT...
Connecting failed.
[ 24617][V][WiFiGeneric.cpp:362] _arduino_event_cb(): STA Disconnected: SSID: HZTD67Fr45d, BSSID: 44:4e:6d:af:6f:8f, Reason: 200
[ 24617][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 5 - STA_DISCONNECTED
[ 24625][W][WiFiGeneric.cpp:1057] _eventCallback(): Reason: 200 - BEACON_TIMEOUT
[ 24632][D][WiFiGeneric.cpp:1081] _eventCallback(): WiFi AutoReconnect Running
[ 24640][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24646][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
[ 24642][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
messageNo: 2049, publish: fail, queue: 1350, heap: 55200, maxAlloc: 53236
WiFi disconnected
[ 24678][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24679][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2050, publish: fail, queue: 1351, heap: 55320, maxAlloc: 53236
[ 24699][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24699][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2051, publish: fail, queue: 1352, heap: 55256, maxAlloc: 53236
[ 24719][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24719][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2052, publish: fail, queue: 1353, heap: 55192, maxAlloc: 53236
[ 24739][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24739][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2053, publish: fail, queue: 1354, heap: 55128, maxAlloc: 53236
[ 24759][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24759][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2054, publish: fail, queue: 1355, heap: 55064, maxAlloc: 53236
[ 24779][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24779][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2055, publish: fail, queue: 1356, heap: 55000, maxAlloc: 53236
[ 24799][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24800][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2056, publish: fail, queue: 1357, heap: 54936, maxAlloc: 53236
[ 24819][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24820][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2057, publish: fail, queue: 1358, heap: 54872, maxAlloc: 53236
[ 24840][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24840][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2058, publish: fail, queue: 1359, heap: 54808, maxAlloc: 53236
[ 24860][W][Packet.cpp:305] _allocate(): Packet buffer not allocated: low memory
[ 24860][E][MqttClient.cpp:161] publish(): Could not create PUBLISH packet
messageNo: 2059, publish: fail, queue: 1360, heap: 54744, maxAlloc: 53236
Connecting to MQTT...
Connecting failed.
Connecting to MQTT...
Connecting failed.
Connecting to MQTT...
Connecting failed.
Connecting to MQTT...
Connecting failed.
It's not reproducing here. Which version are you using? I've changed two things:
Both changes are in the PR.
Sorry, i've merged your first commit, but missed the the others. It's working now. Thank you very much.
I've changed one more bit: the CONNECT message will not check for low memory. Otherwise, as there is no means to delete from the queue, you will never be able to reconnect when low on memory.
Closed with pr #109
Feel free to reopen if the problem persists.
The feature discussion can be done in a separate issue/discussion.
I'm not sure if this is really a bug or i understand something wrong. In proddys emsesp-project some users reports freezing which is (maybe) related to the mqtt at low memory when mqtt is disconnected, With old asyncMqttClient we delete old queued messages when getFreeHeap goes below 60k. In espMqttClient there is the EMC_MIN_FREE_MEMORY setting. Discussion about that is here
I've tested with a modifcated https://github.com/bertmelis/espMqttClient/blob/main/examples/notask-esp32/notask-esp32.ino, and added in loop:
(BTW: currentMillies should not be static to update it in every loop)
config.h
helpers.h
(BTW: the mem-check for PSRAM is critical, the esp32 default in sdkconfig.h is CONFIG_SPIRAM_MALLOC_ALWAYSINTERNAL=4096, so mqtt publish is mainly allocated in internal ram. Better check both RAM for enough space)
With the above settings on a esp32/4M, nodemcu (no PSRAM) i get an output:
which looks as expected, except the allocation fail is at 53k,not at 60k. with debug messages i see that allocation fails at 60k, but publish gives back a valid message-id and free heap is still going down.
Some messages later strange things happens:
and at last, when heap is down at 53k the message-Id is returned as zero and no log messages from the client
Testing the same with a own task is similar, but some with some other events: log_with_own_task.txt
Workaround is not to use EMC_ALLOW_NOT_CONNECTED_PUBLISH, or stop feeding publishes on an own memory limit higher than the setting in the library. Your solution not to allocate mem for the new messages is not ideal, better is to delete the oldest message in the queue and queue the new message. Is there a way to delete the oldest message in queue by a function call? I've added a getQueueSize to check the queued messages, but in case of disconnected mqtt memory is allocated but the outbox does not grow. Is there a better way to read the number of messages in the queue?