homieiot / homie-esp8266

💡 ESP8266 framework for Homie, a lightweight MQTT convention for the IoT
http://homieiot.github.io/homie-esp8266
MIT License
1.36k stars 308 forks source link

Question: Delay between MQTT messages #724

Closed sude22 closed 2 years ago

sude22 commented 2 years ago

Hello,

i wonder if anybody have the same "problem": With the latest async-mqtt-client used by Homie there is a delay between every send MQTT message.

I already asked in the async-mqtt-client GIT, and Bert answered that it is the current implentation...He wants to optimize it but is currently out of time. The outgoing messages are queued and for QOS>0 the client waits for the acknowledge before sending the next one.

So I have a delay of 1s between every outgoing message, you can see it e.g. during startup. My device has 4 inputs, if they are toggled at once, the last message is send after 4 seconds.

I have a IOT-Stack Docker (with mosquitto) installed on a RPI4.

If anybody have an idea or suggestion how to avoid that delays I would be glad.

kleini commented 2 years ago

I am not looking at the current implementation of async-mqtt-client, but having that queue is a great improvement. Before that queue was implemented, messages not fitting into the TCP send buffer have just been dropped and have not been sent at all. Devices with larger amounts of properties did not work at all. If the queue now waits for the acknowledge of sent messages, then 1 second for this acknowledgement sounds very large to me. I am running mosquitto in Docker, too, but in x86-64 hardware. Acknowledgements from this mosquitto are much faster than 1 second. So, I guess your problem is not really the homie-esp8266 or async-mqtt-client's queue but the slow response of your MQTT broker.

sude22 commented 2 years ago

OK, since I changed nothing to the default IOTStack installation I have a closer look what might be the problem with Mosquitto. I'm glad it is not a general problem.

bertmelis commented 2 years ago

What do you mean with "not a general problem"? Is it just one device, or a couple? Perhaps bad WiFi reception?

sude22 commented 2 years ago

Sorry, that was not quite clear. I mean "a general homie/asyc-mqtt-client problem that affects everyone"

I tested now with different devices, different homie-projects, all show the same behavior. My WiFi is fine, and the "older" devices in my home automation (not updated to latest homie) do not show that behavior.

As said, it is not a real showstopper for me. But I was curious if someone else has seen that behavior.

By now I don't know how to optimize the mosquitto broker, I turned off persistance with no effect. So for now I will live with it.

bertmelis commented 2 years ago

Persistance is not the problem. Maybe setting tcp_nodelay to true will help:

set_tcp_nodelay [ true | false ]
If set to true, the TCP_NODELAY option will be set on client sockets to disable Nagle's algorithm. This has the effect of reducing latency of some messages at potentially increasing the number of TCP packets being sent. Defaults to false.

This option applies globally.

Reloaded on reload signal.

Especially for the Ack messages: they are small and the TCP stack might queue them, or in case no further data is coming, sent after (typically) 500msec.

sude22 commented 2 years ago

I do not see a change with nodelay set to true.

https://user-images.githubusercontent.com/3779162/144025913-80342af0-e7ab-42bf-9680-6f6618702bf8.mp4

luebbe commented 2 years ago

I have several devices using the homie framework which send more than 20 messages on startup. None of them shows the behaviour of yours. This really looks like the 500ms ack delay is hitting you. Or you have a 500ms delay somewhere in your code.

sude22 commented 2 years ago

OK, I created a test-project, doing nothing except the homie setup. It stays the same.... Maybe google has an answer to the 500ms ACK delay issue.

bertmelis commented 2 years ago

I did see the same behaviour. It doesn't surprise me.

Goal is still to speed up sending: "more than 1 message in-the-air".

luebbe commented 2 years ago

Ok, I'll double-check. Maybe I was wrong. I noticed that since the update to asyncmqtt it takes longer to send a bunch of messages but I never had the impression that there was a 500ms delay between messages.

sude22 commented 2 years ago

BTW, the Mosquitto version is 2.0.14, in case that matters.

sude22 commented 2 years ago

It turns out that this could be a problem in daily life. In case I press a button while the stats are published in worst case it could take a few seconds for the message to be published. That is to slow e.g. to turn a light on. Maybe there are some ideas to solve the problem out there...

sude22 commented 2 years ago

OK, finally I managed to see what is going on at TCP-level. Homie_Msg_delay But I don't understand what exacly is going on..... 1) the mqtt message is send from esp to mosquitto. 2) mosquitto sends the mqtt ACK (no delay) 3) homie sends a TCP-ACK with a notable delay (~300ms) 4) direct after that, homie sends the next mqtt message.

So the delay is not happing on the mosquitto broker, but in homie. Any ideas about that are highly appreciated.

kleini commented 2 years ago

Is it possible to have a look on your code?

sude22 commented 2 years ago

Yes, but the test project contains nothing but the homie setup and loop, and one advertised node. Homie, AsyncMqtt and ESPAsyncTCP are from latest github commits.

include

include

define FW_VERSION "0.0.1"

define FW_NAME "homietest"

define DEBUGNODE_TYPE "debug"

define DEBUGMESSAGE "debug_message"

HomieNode debugNode(DEBUGNODE_TYPE, DEBUGNODE_TYPE, DEBUGNODE_TYPE);

void setup() { Serial.begin(115200);
Homie_setBrand("IoT"); // before Homie.setup() Homie_setFirmware(FW_NAME, FW_VERSION); // The underscore is not a typo! See Magic bytes Homie.disableLedFeedback(); // before Homie.setup() debugNode.advertise(DEBUGMESSAGE);
Homie.setup();
}

void loop() {
Homie.loop();
}

Maybe the "Delayed ACK"-issue is located in the used ESPAsyncTCP/lwip stack. As far as I can see there is the option to disable "Nagle's algorithm" (which is done in AsyncMqttClient via _client.setNoDelay(true);) But there seems to be no way to disable the delayed ACK...

sude22 commented 2 years ago

OK, after hours of reading about LWIP etc. I was not able to track that problem down. The workaround for me to get a stable environment is to set the async-mqtt-client back to commit 7f1ba481a22d56 (before the queue was introduced). I know there has been a lot of bugfixes since than, but I'm not using TLS or anything, and with that version I don't have any problems... I can close this issue, it will be fixed when the async-mqtt-client will not causing that delay anymore.