Read at paho.mqtt.golang@v1.4.2/packets/packets.go:267 is:
    func (fh *FixedHeader) pack() bytes.Buffer {
        var header bytes.Buffer
        header.WriteByte(fh.MessageType<<4 | boolToByte(fh.Dup)<<3 | fh.Qos<<1 | boolToByte(fh.Retain)) // Line 267
        header.Write(encodeLength(fh.RemainingLength))
        return header
    }
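
To make the read at line 267 concrete, here is a minimal standalone sketch of how the first byte of an MQTT fixed header is assembled, together with the variable-length "remaining length" encoding. It mirrors the quoted pack function but is not the library's code: firstByte and encodeRemainingLength are hypothetical names, and the length encoding assumes the 7-bits-per-byte scheme from the MQTT 3.1.1 specification.

```go
package main

import "fmt"

// boolToByte maps a flag to 0 or 1 so it can be shifted into place.
func boolToByte(b bool) byte {
	if b {
		return 1
	}
	return 0
}

// firstByte (hypothetical name) builds byte 1 of the fixed header:
// bits 7-4 message type, bit 3 DUP, bits 2-1 QoS, bit 0 RETAIN.
func firstByte(messageType byte, dup bool, qos byte, retain bool) byte {
	return messageType<<4 | boolToByte(dup)<<3 | qos<<1 | boolToByte(retain)
}

// encodeRemainingLength (hypothetical name) emits 7 bits per byte, least
// significant group first, setting the top bit when more bytes follow.
func encodeRemainingLength(length int) []byte {
	var out []byte
	for {
		digit := byte(length % 128)
		length /= 128
		if length > 0 {
			digit |= 0x80 // continuation bit
		}
		out = append(out, digit)
		if length == 0 {
			return out
		}
	}
}

func main() {
	const publish = 3 // PUBLISH control packet type

	// The only difference between a first send and a resend is bit 3 (DUP).
	fmt.Printf("%#x\n", firstByte(publish, false, 1, false)) // 0x32
	fmt.Printf("%#x\n", firstByte(publish, true, 1, false))  // 0x3a

	fmt.Printf("% x\n", encodeRemainingLength(321)) // c1 02
}
```

Because the Dup flag lands in that single byte, a write to Dup that overlaps with pack can change what goes on the wire, which is why the race detector flags the pair of accesses.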
Previous write at paho.mqtt.golang@v1.4.2/client.go:1087 is:
    if p.Qos != 0 { // spec: The DUP flag MUST be set to 0 for all QoS 0 messages
        p.Dup = true // Previous write HERE
    }
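
The two accesses above touch the same Dup field from different goroutines. One common way to avoid that kind of race, shown here only as a sketch and not as what the library does, is for the resend path to set the flag on a copy of the packet rather than on the shared one. The publishPacket type and the resendCopy function are hypothetical stand-ins, not part of paho.mqtt.golang.

```go
package main

import "fmt"

// publishPacket is a hypothetical, cut-down stand-in for a publish packet.
type publishPacket struct {
	Dup     bool
	Qos     byte
	Payload []byte
}

// resendCopy returns a copy of p that is marked as a duplicate delivery.
// The original packet is never written to, so another goroutine that is
// still reading it does not race with this function.
func resendCopy(p *publishPacket) *publishPacket {
	cp := *p // shallow copy; Payload's backing array is shared but not modified
	if cp.Qos != 0 { // the DUP flag must stay 0 for QoS 0 messages
		cp.Dup = true
	}
	return &cp
}

func main() {
	orig := &publishPacket{Qos: 1, Payload: []byte("hello")}
	again := resendCopy(orig)
	fmt.Println(orig.Dup, again.Dup) // false true
}
```

The trade-off is an extra allocation per resend; whether that is acceptable in the real client would need checking against how the store hands packets out.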
So it looks like this issue relates to the way the Dup flag on the publish packet is set when the packet is resent upon reconnection. I'm guessing that you are using the memory store (the default)?
What I think is happening (difficult to be sure without logs) is:
1. Publish is called; it checks c.status.ConnectionStatus() and sends the message to c.obound (which blocks because other messages are being processed).
2. The connection drops and is re-established; in resume we attempt to resend the publish packet.
3. The OutgoingComms goroutine attempts to send the Publish (this will fail because the old connection is down).
If I'm correct then this is not a big concern; the OutgoingComms routine remains running until it has emptied all of its input channels (this is deliberate, to avoid deadlocks) but it's attempting to send to a closed connection (so will just get an error every time). As such the packet with questionable integrity is effectively being thrown away anyway. Generally I'd expect the OutgoingComms goroutine to exit quickly when the connection has dropped (establishing a new connection should take longer); I'm not really sure why it's taking longer in your case. Potentially I could see it being due to:
- Calling Publish from within a message handler (this will block things up, leading to unpredictable results, subject to the options in use).
Without the ability to replicate this (and/or logs) it's going to be difficult to fix. For historical reasons the library spins up quite a few goroutines and the interactions between them are complex (so it's easy to create new issues when fixing something like this).
One option would be to modify Publish such that it fails more quickly when the connection drops. Unfortunately, as I'm not 100% sure that I'm right about the cause, this may or may not address the issue!
Thank you for your feedback.
The connection is unstable but I saw this message a couple of times. I'll add the log handlers and try to reproduce it.
I'm going to close this as it's been idle for quite some time (so I suspect the original issue has been found/resolved).
I'm using version 1.4.2 and sporadically see a data race reported when using the package. As it only involves functions internal to this library, I guess it isn't related to my code: