mochi-mqtt / server

The fully compliant, embeddable high-performance Go MQTT v5 server for IoT, smarthome, and pubsub
MIT License
1.3k stars 223 forks source link

Client supplied ID is used to look up existing clients in inheritSession, while the store of existing actually uses the server assigned if one exists. #416

Closed SimonMacIntyre closed 4 months ago

SimonMacIntyre commented 4 months ago

I am still debugging and investigating this, so hopefully it is not human error. Figured I would post in case anyone else is experiencing it or able to easily reproduce it.

Happens to me on versions > 2.6.0, I did not test lower than that because my code is not compatible without some changes, so I decided to try to investigate the issue (or my misconfiguration).

I am using protocol version 5 clients (mqttjs, specifically).

Here is my server config:

server := mqtt.New(&mqtt.Options{
        InlineClient: true,
        Capabilities: &mqtt.Capabilities{
            MaximumClients:               math.MaxUint16,
            MaximumSessionExpiryInterval: 0,              // maximum number of seconds to keep disconnected sessions
            MaximumMessageExpiryInterval: 60 * 60 * 24,   // maximum message expiry if message expiry is 0 or over
            ReceiveMaximum:               1024,           // maximum number of concurrent qos messages per client
            MaximumQos:                   1,              // maxmimum qos value available to clients
            RetainAvailable:              1,              // retain messages is available
            MaximumPacketSize:            0,              // no maximum packet size
            TopicAliasMaximum:            math.MaxUint16, // maximum topic alias value
            WildcardSubAvailable:         1,              // wildcard subscriptions are available
            SubIDAvailable:               1,              // subscription identifiers are available
            SharedSubAvailable:           1,              // shared subscriptions are available
            MinimumProtocolVersion:       3,              // minimum supported mqtt version (3.0.0)
            MaximumClientWritesPending:   1024 * 8,
        },
    })

Here is my client settings:

 keepalive: 60,
  reconnectPeriod: 5000,
  clean: false, //I've tried true as well, and with no assigned client id below
  clientId: 'bob',
  debug: true,
  rejectUnauthorized: false,

Here is a screenshot of both my client simultaneously connected, and not being forcefully disconnected (also no client takeover occurred according to my logs.

image

Now if I publish a message, only one of them gets it (the latest to connect I think). Now this does seem like a client takeover.

So the 2 issues are:

Here is the code I am using to do that which used to work I think in older versions:

func (h *ConnectionStatusHook) OnDisconnect(client *mqtt.Client, err error, expire bool) {
    if client.StopCause() == packets.ErrSessionTakenOver {
        logErr := h.LogConnectionState(client.ID, CLIENT_TAKEOVER)
        if logErr != nil {
            sentry.CaptureException(err)
            h.Log.Error("Error logging client takeover", "error", err)
        }
        return
    }

On the metrics page I can see:

{
"version": "2.6.4",
"started": 1721337314,
"time": 1721337637,
"uptime": 323,
"bytes_received": 322,
"bytes_sent": 442,
"clients_connected": 2,
"clients_disconnected": 0,
"clients_maximum": 3,
"clients_total": 2,
"messages_received": 2,
"messages_sent": 3,
"messages_dropped": 0,
"retained": 0,
"inflight": 0,
"inflight_dropped": 0,
"subscriptions": 1,
"packets_received": 22,
"packets_sent": 22,
"memory_alloc": 3694592,
"threads": 15
}

Which is my 2 clients. Since its not being disconnected I guess is why the client stop isn't being detected. What's also interesting is the subscriptions count is actually only 1. Also when I ctrl+c the 2 terminals with the connections, I get the debug log of each disconnecting (and using the same client id).

So the root of the issue I think is the lack of the disconnect.

I'll try to see if I can find a fix or where the issue is!

SimonMacIntyre commented 4 months ago

I think I found the issue. Whether or not this is a 'bug', or a misunderstanding of assigning client IDs, yet to be determined 😅

The problem: The stored check for client ID in inherit session uses the client's ID they define. The one in the existing client store, uses the server-assigned client identifier.

So I think this happens:

mqtt-1  | time=2024-07-19T14:18:53.411Z level=DEBUG msg="client disconnected" error="websocket: close 1006 (abnormal closure): unexpected EOF" client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 remote=192.168.156.4:53692 listener=ws
mqtt-1  | time=2024-07-19T14:18:53.413Z level=INFO msg="client disconnected" hook=ConnectionStatusHook client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 expire=true error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1  | time=2024-07-19T14:18:53.426Z level=WARN msg="" listener=ws error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1  | time=2024-07-19T14:18:54.125Z level=DEBUG msg="client disconnected" error="websocket: close 1006 (abnormal closure): unexpected EOF" client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 remote=192.168.156.4:49364 listener=ws
mqtt-1  | time=2024-07-19T14:18:54.125Z level=INFO msg="client disconnected" hook=ConnectionStatusHook client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 expire=true error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1  | time=2024-07-19T14:18:54.129Z level=WARN msg="" listener=ws error="websocket: close 1006 (abnormal closure): unexpected EOF"
SimonMacIntyre commented 4 months ago

I will work on a PR that does this: If client has an assigned client identifier, use that to look up existing clients, instead of the client-defined client id.

In the meantime let me know if there is a more appropriate fix.

SimonMacIntyre commented 4 months ago

I have a confirmed fix by just using cl.ID which has the correct assigned, instead of the pk.Connect.ClientIdentifier. Better to detect the assigned identifier , not rely on cl.id

Will open a PR shortly.

SimonMacIntyre commented 4 months ago

I realized a few reasons why this may not have manifested to others before:

thedevop commented 4 months ago

@SimonMacIntyre:

  1. Is cl.Properties.Props.AssignedClientID only re-assigned when it is not empty (only occurs when client ID was empty in Connect)? If so, then #417 seems to be fine. Otherwise, it would be a protocol violation.
  2. Given cl.Properties.Props.AssignedClientID is manually re-assigned, cl.ID should also be changed as there are many references based on cl.ID.
SimonMacIntyre commented 4 months ago

@thedevop

  1. It is only assigned when pk.Connect.ClientIdentifier is empty yes, which satisfies MQTT-3.2.2-16!
  2. You're right, mochi is already doing that: image

But now you are making me wonder if I should swap the PR back to the original implementation before I changed it, which is just using cl.ID since it should be correct in all cases. AKA, it will be the client supplied normally, and the server assigned if client does not supply one. (Maybe this is what you were implying entirely in 2.!)

Rather than do the If check on the cl.Properties.Props.AssignedClientId, just reference cl.ID everytimg, I Think that makes sense!

Edit I updated https://github.com/mochi-mqtt/server/pull/417 and its description

thedevop commented 4 months ago

I feel using cl.ID in inheritClientSession is reasonable. However, this is mutable, perhaps that's the reason @mochi-co used the pk.Connect.ClientIdentifier instead. Let's hear from him if he's ok to change that to use cl.ID.

mochi-co commented 4 months ago

@thedevop @SimonMacIntyre I think this is a good move. Originally I used pk.Connect.ClientIdentifier because it was immutable, however I think it's safe to say our use case has changed significantly since then. Works for me 👍🏻