eclipse-paho / paho.golang

Disconnect method triggers OnClientError callback with net.ErrClosed error #254

Open robert197 opened 6 months ago

robert197 commented 6 months ago

When the Disconnect method is invoked with a specific reason code, the OnClientError callback is unexpectedly triggered with a net.ErrClosed error, even though the Disconnect operation itself returns nil. This suggests that the disconnection is being incorrectly treated as an error, potentially indicating an issue with how the library handles socket closures on client-initiated disconnects.

To reproduce

Configure the client with callbacks for server disconnections and client errors:

c := paho.NewClient(paho.ClientConfig{
    Router: router,
    Conn:   conn,
    OnServerDisconnect: func(d *paho.Disconnect) {
        fmt.Println("Server has been disconnected...")
    },
    OnClientError: func(err error) {
        fmt.Printf("Error: %s\n", err.Error())
    },
})

Disconnect the client using the Disconnect method with an administrative action reason code:

err := c.Disconnect(&paho.Disconnect{
    ReasonCode: byte(152), // Administrative Action
})

Expected behaviour:

The client should disconnect without invoking the OnClientError callback, because the disconnection is user-initiated and is not an error condition.

Software used:

MQTT Broker version: emqx:5.3.2
Client version: github.com/eclipse/paho.golang v0.10.1-0.20220826012857-d63b3b28d25f

Additional context

Currently, I am using a workaround that checks for net.ErrClosed in the OnClientError callback to differentiate between genuine errors and this disconnection handling issue. Here is the temporary code adjustment:

OnClientError: func(err error) {
    if errors.Is(err, net.ErrClosed) {
        fmt.Println("[MQTT] disconnection has been received")
    } else {
        panic(fmt.Sprintf("[MQTT] killing service because of unknown client error: %s", err.Error()))
    }
}

It seems there might be an issue with how the socket closure is managed during the disconnection process, perhaps attempting to close an already closed connection.

MattBrittan commented 6 months ago

Client version: github.com/eclipse/paho.golang v0.10.1-0.20220826012857-d63b3b28d25f

That version is two years old; please try with master (or, at a minimum, v0.21). I know I've made a range of changes to the error handling in the interim.

robert197 commented 6 months ago

@MattBrittan Thanks for the info, I have updated the version to v0.21 but the behavior is still the same.

MattBrittan commented 6 months ago

Thanks. The changes I was thinking of were in autopaho: see the errorHandler, which ensures that only a single error is passed through (paho sometimes raises multiple errors when the connection is lost).

paho is intended to be a fairly low-level library and commonly returns multiple errors (as handled by the autopaho code above). I'm not really sure whether to class this as a bug, as I believe the way this works was deliberate (the error handling in paho.mqtt.golang led to quite a few deadlocks over the years, and I'd guess this was a reaction to that). The documentation should definitely mention this...
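For anyone using paho directly, the "only a single error is passed through" idea can be approximated with sync.Once. The sketch below is not autopaho's errorHandler (which is not reproduced here); firstErrorOnly is a hypothetical helper name, and it assumes that a fresh wrapper is created for each connection.

```go
package main

import "sync"

// firstErrorOnly wraps an error callback so that only the first error it
// receives is forwarded; any later calls are silently dropped.
// Hypothetical helper; not autopaho's implementation.
func firstErrorOnly(handler func(error)) func(error) {
	var once sync.Once
	return func(err error) {
		once.Do(func() { handler(err) })
	}
}
```

The returned function has the func(error) shape that OnClientError expects, and sync.Once makes it safe to call from several goroutines. Because the Once never resets, a new wrapper is needed after every reconnect.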

MattBrittan commented 5 days ago

I have added comments in paho to warn users that OnClientError may be called multiple times (and may be called following Disconnect). Currently I feel that this is probably all that is needed; implementing another solution would add complexity to paho that is probably not warranted (autopaho demonstrates how this can be worked with).

Will leave this open for a while in case anyone has any suggestions on a better way of dealing with this (and feels it would be beneficial to do so).