eclipse / paho.mqtt.golang

Other
2.73k stars 533 forks source link

Client not receiving message from other client #637

Closed Johnny2210 closed 1 year ago

Johnny2210 commented 1 year ago

Hello, I am currently facing a kind of weird issue with my MQTT setup.

It's used on an embedded device running multiple systemd services, which use mqtt clients to publish and receive information. These clients all connect to a mosquitto broker (version 1.6.10).

When deploying changes to the embedded device, these systemd services get restarted, which disconnects the clients and reconnects them (The mosquitto service is not restarted!). After a few subsequent deployments, one of the clients (let's call it A) stops receiving messages of another (B). The weird part is, if publishing manually on the same topic A subscribed to (with mosquitto_pub) , the message is received. Also when subscribing to the topic on which B is publishing (with mosquitto_sub), the messages are received as expected. So basically, client A is not receiving any messages client B is publishing, but from other (command line) clients.

The Go library version v1.4.2 Mosquitto version 1.6.10 Running on a RaspberryPi Cm3 The clients are initialized with following options:

opts.SetOrderMatters(false)
opts.SetCleanSession(false)
opts.SetKeepAlive(10 * time.Second)
opts.SetPingTimeout(1 * time.Second)
opts.SetAutoReconnect(true)
opts.SetConnectTimeout(15 * time.Second)

The logs of mosquitto show no disconnection of any of these clients. Something which might be useful information is that it seems like setting CleanSession to true for all clients helps, but unfortunately that is not an option, since it would break too much other stuff.

If you have any idea what might cause this, it would be much appreciated if you'll let me know! Thanks for any help in advance.

MattBrittan commented 1 year ago

Please refer to the readme for the best places to seek support (the issues section is intended for bugs with sufficient supporting info to be actionable - you will get a much wider audience at another venue). Unfortunately there is not enough info in your question to say much (this kind of issue can be very difficult to trace and you need the broker logs at a minimum). If I understand the symptoms you describe then this looks like a Mosquitto issue. MQTT messages are not tied to the sender so the source of the message should be irrelevant to A (so it's not receiving messages sent by B but is receiving them when sent by mosquitto_pub then this may be a broker issue). As I don't have your code this is a guess (your code may be receiving the message but failing to process it for whatever reason).
I'd start with the broker logs (upgrading to a newer version of Mosquitto may also be worthwhile - issues like this may be relevant).

Johnny2210 commented 1 year ago

Thank you for the quick response! I will post my issue on one of the provided sites as well, but just wanted to follow up real quick on your suggestions. Upgrading to mosquitto version 1.6.12 unfortunately did not resolve the issue, but I noticed some strange behavior in the logs. In some instances the broker publishes to the client before he even subscribed. Could this somehow block any further messages on the same topics? example-topic-1 is subscribed with QoS of 1.

1678786214: No will message specified.
1678786214: Sending CONNACK to telemetry (1, 0)
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m722, 'example-topic-1/a', ... (109 bytes))
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m723, 'example-topic-1/b', ... (118 bytes))
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m904, 'example-topic-1/a', ... (109 bytes))
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m1085, 'example-topic-1/a', ... (109 bytes))
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m1086, 'example-topic-1/b', ... (118 bytes))
1678786214: Sending PUBLISH to telemetry (d1, q1, r0, m1267, 'example-topic-1/a', ... (109 bytes))
...
...
1678786214: Sending PUBLISH to telemetry (d0, q1, r0, m2293, 'example-topic-1/c'1, ... (119 bytes))
1678786214: Received PUBLISH from telemetry (d0, q2, r1, m1, 'example-topic-2', ... (7 bytes))
1678786214: Sending PUBREC to telemetry (m1, rc0)
1678786214: Received PUBLISH from telemetry (d0, q2, r1, m2, 'example-topic-1/a', ... (109 bytes))
1678786214: Sending PUBREC to telemetry (m2, rc0)
1678786214: Received PUBREL from telemetry (Mid: 1)
1678786214: Sending PUBCOMP to telemetry (m1)
1678786214: Received PUBREL from telemetry (Mid: 2)
1678786214: Sending PUBCOMP to telemetry (m2)
1678786214: Received SUBSCRIBE from telemetry
1678786214:         shutdown (QoS 2)
1678786214: telemetry 2 shutdown
1678786214: Sending SUBACK to telemetry
1678786214: Received SUBSCRIBE from telemetry
1678786214:         example-topic-1/# (QoS 1)
1678786214: telemetry 1 example-topic-1/#
1678786214: Sending SUBACK to telemetry

From the pkg.go.dev documentation:

When QOS1+ subscriptions have been created previously and you connect with CleanSession set to false it is possible that the broker will deliver retained messages before Subscribe can be called. To process these messages either configure a handler with AddRoute or set a DefaultPublishHandler.

MattBrittan commented 1 year ago

In some instances the broker publishes to the client before he even subscribed.

This is normal (the subscriptions form part of the session and are resumed when you reconnect). As you note Ive added suggestions for how to process these messages in the docs.

Without a lot more info it's not possible to help (the issue may we'll be in your code - deadlocks are a common cause).

MattBrittan commented 1 year ago

Copying my answer to your stackoverflow question. I have added a note to the Readme highlighting this issue.


Thanks for the logs; the fact that Mosquitto is not receiving a PUBACK for the messages received immediately after the connection is established led me to the likely cause.

With Mosquitto v1.6.x max_inflight_messages defaults to 10; so after 10 unacknowledged messages Mosquitto will not send any more messages. This is why it stops sending to clientA.

paho.mqtt.golang will not acknowledge a message if there is no handler (if you enable logging it will output a warning when this happens). The rationale for this is lost in the mists of time (I added the warning) but I suspect it's because, without a handler, the message cannot be said to have been processed (so should not be acknowledged). Earlier versions of Mosquitto used to resend messages that had not been acknowledged, but this is no longer the case (and prohibited in the v5 spec), meaning that it's effectively a permanent block.

In your case these two factors are combining; you connect, receive 10 PUBLISH packets, and then subscribe (setting up the handlers) but by that point Mosquitto has 10 messages in flight and will not send any more.

The fix is to add:

opts.SetDefaultPublishHandler(func(mqtt.Client, mqtt.Message) {})

This will add a default publish handler (that ignores the messages); the fact that the handler exists means that PUBLISH packets will be acknowledged.

Does ClientA need messages received whilst offline (it's unsubscribing from telemetry/+ so will not get those anyway). If not then using opts.SetCleanSession(true) is another way of avoiding this issue.

If you do need to process the messages then use AddRoute to configure your message handlers before you connect (I generally have a catch-all DefaultPublishHandler that just logs the message so I can see that something has been missed).

Johnny2210 commented 1 year ago

Thank you very much for your help!