Open canique opened 3 years ago
I have run another test with a longer offline period (multiple hours). One example topic, with no publishes during that offline time (or in the last few weeks), was transmitted via the bridge 272 times. If I ran another test, again with 0 publishes on that topic, I'd see dozens of transmissions again. I know that QoS 1 only guarantees at least one delivery and makes no guarantee about an upper bound, but 272 times for a single topic does sound like a bug to me...
p.s.: Even changing both the bridge config and the publishing client to QoS 2 does not fix the problem. With QoS 2, due to this bug, every publish on a topic is sent twice by the bridge after a reconnect. For topics with rare publishes, the same msg is sent over and over again. So if I publish msg A, the remote broker receives A, A. If I publish B to another topic less frequently, the remote broker seems to receive the same B again with every publish to A's topic. This behaviour is a clear violation of the MQTT specification.
p.p.s: If MQTTv5 is used for the bridge connection, the mosquitto bridge sends duplicates infinitely. The other end closes the MQTT bridge connection with the error "Sent too many concurrent PUBLISH messages." - so mosquitto sends too fast in that case?
A possible explanation for this behaviour: Every time a message is published, the log records "Bridge local.etj-55555550.xxx-gw1-br doing local SUBSCRIBE on topic sensors/#" - so the bridge is subscribing to a topic. If these subscriptions are cumulative, then each accumulated subscription causes at least one copy of every retained msg to be sent once the connection is re-established.
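To make the hypothesis above concrete, here is a toy model in Python. It is not mosquitto's code: the function name `deliveries_on_reconnect` and its parameters are made up for illustration, and it only assumes that each accumulated subscription triggers one delivery of every matching retained msg.

```python
from collections import Counter

def deliveries_on_reconnect(retained, subscribe_calls, cumulative):
    """Toy model: count retained-msg deliveries per topic after a reconnect.

    retained        -- dict mapping topic -> retained payload
    subscribe_calls -- how often the bridge logged "doing local SUBSCRIBE"
    cumulative      -- True models the hypothesised bug (subscriptions pile up),
                       False models the expected behaviour (one subscription)
    """
    # With non-cumulative subscriptions, repeated SUBSCRIBEs collapse into one.
    active_subscriptions = subscribe_calls if cumulative else min(subscribe_calls, 1)
    sent = Counter()
    for _ in range(active_subscriptions):
        for topic in retained:
            sent[topic] += 1  # one copy of the retained msg per subscription
    return sent

retained = {"c32545676/sensors/air/92/wakeup_interval": "60"}  # payload is hypothetical
buggy = deliveries_on_reconnect(retained, subscribe_calls=45, cumulative=True)
expected = deliveries_on_reconnect(retained, subscribe_calls=45, cumulative=False)
print(buggy, expected)
```

Under this assumption the number of duplicates grows with the number of SUBSCRIBE log lines, which would match the observation that longer offline periods produce more retransmissions.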
Might be related to https://github.com/eclipse/mosquitto/issues/2165 and https://github.com/eclipse/mosquitto/issues/1467. So this issue seems to have been open since 2019...
When testing with Mosquitto v2.0.12 I noticed something interesting:
Test setup:
Mosquitto 2.0.12, Debian Buster
HiveMQ as the remote broker (Mosquitto sending out msgs to HiveMQ)
Procedure:
Shut down HiveMQ for ~1 minute
Count the number of outgoing retained messages on a topic where no publishes have occurred
With the Debian Buster installation package, there are ~8 publishes from Mosquitto -> HiveMQ on the test topic although there haven't been any msgs published recently on that topic.
When I compile mosquitto myself, running the same test, there is 1 publish from Mosquitto -> HiveMQ on the test topic which is perfectly fine because it is a topic with a retained message.
Conclusion: The bug only occurs with the Debian Buster package (or maybe with all releases) but not if compiled with different settings/libs.
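The counting step of the procedure above can be sketched in Python. This is a minimal sketch, assuming the received msgs were captured on the remote side as "topic payload" lines, which is the format `mosquitto_sub -v` prints. The function names `count_publishes` and `repeated_topics` are made up for this example, and topics containing spaces would not be split correctly.

```python
from collections import Counter

def count_publishes(lines):
    """Count received msgs per topic from 'topic payload' lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the capture
        topic = line.split(" ", 1)[0]  # first token is the topic
        counts[topic] += 1
    return counts

def repeated_topics(counts, expected=1):
    """Return the topics that were received more often than expected."""
    return {topic: n for topic, n in counts.items() if n > expected}

# Hypothetical capture: one idle retained topic received three times.
capture = [
    "c32545676/sensors/air/92/wakeup_interval 60",
    "c32545676/sensors/air/92/wakeup_interval 60",
    "c32545676/sensors/air/92/wakeup_interval 60",
    "c32545676/sensors/air/92/temperature 21.5",
]
print(repeated_topics(count_publishes(capture)))
```

For a topic with a retained msg and no recent publishes, a count of 1 after a reconnect is what I would expect; anything above that points at the duplicate transmissions described here.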
When compiling I run
make WITH_CJSON=no WITH_DOCS=no WITH_WEBSOCKETS=yes WITH_SYSTEMD=yes
On the test system I have installed:
libwebsockets12 (2.4.2)
libwebsockets-dev (2.4.2)
libwebsockets18 (4.2.1)
The mosquitto binary file from the debian package is ~180 KB, my self compiled binary is ~1.9 MB.
If I change the compile command to
make WITH_CJSON=no WITH_DOCS=no WITH_WEBSOCKETS=yes WITH_SYSTEMD=yes WITH_WRAP=yes WITH_STRIP=no
the bug still does not appear.
I think I've spotted the cause: it seems to be WITH_ADNS=yes. This option causes retransmits.
I see the same problem with mosquitto 2.0.11 on Windows, installed using the Windows installer.
It's a big problem for our system, because the server receiving the MQTT messages gets overloaded and we have to send a service technician to the site to restart it :(
Any solution for this?
I've described the solution already: Compile Mosquitto without the WITH_ADNS option. If the option is enabled, it breaks Mosquitto.
It looks like this option is turned off by default when building with CMake for Windows. Can anyone confirm that mosquitto-2.0.12-install-windows-x64.exe is built with WITH_ADNS=OFF?
We're experiencing the same issue at a fairly large scale. Using 2.0.14, compiled from source, with WITH_ADNS=yes.
We were scratching our heads big time about what's actually going on, but this issue describes the situation perfectly.
Here's the test procedure:
1) A few msgs are published locally to the mosquitto broker with the retain flag set, on 1-2 topics. The broker has a bridge configured in OUT mode. The client publishing the msgs has set cleansession=false (weeks ago it was "true").
2) When I pull the network cable from the Linux machine running the broker, the bridge starts to enqueue msgs. This is perfectly fine.
3) After a couple of minutes, after plugging in the network cable again, the bridge sends out msgs from its queue.

The problem is that it sends msgs from topics which have not been written to recently at all, and it sends them dozens of times. I have counted 180 transmissions, although there are only 4 topics and at most ~40 msgs could have been published in total while offline. The reconnect in the attached log file happens @ 1625652526. For example, the topic c32545676/sensors/air/92/wakeup_interval is not being actively used. As it has a retained msg, the msg should go out once on reconnect. But it is transmitted 45 times (!). The longer the offline time, the more often the msg is transmitted.
Tested on these mosquitto versions: 1.6.12, 2.0.11
Linux kernel: 4.19.59
excerpt from config file: