Closed SimonMacIntyre closed 4 months ago
I think I found the issue. Whether this is a bug or a misunderstanding on my part of how client IDs are assigned is yet to be determined 😅
The problem: the stored-client check in the inherit-session logic uses the client ID that the client itself defines, while the existing client store uses the server-assigned client identifier.
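A minimal sketch of that mismatch, using hypothetical stand-in types rather than the server's real ones. The names `client`, `ConnectID`, `store`, and `findExisting` are assumptions made for illustration, and the IDs are shortened placeholders:

```go
package main

import "fmt"

// client is a hypothetical stand-in for the server's client type.
type client struct {
	ID        string // effective ID; server-assigned when a hook overrides it
	ConnectID string // ID the client sent in its CONNECT packet
}

// store maps effective client IDs to connected clients.
type store map[string]*client

// findExisting looks up a previously connected client under the given key.
func (s store) findExisting(key string) (*client, bool) {
	c, ok := s[key]
	return c, ok
}

func main() {
	s := store{}

	// First connection: stored under the server-assigned ID.
	first := &client{ID: "admin/7776", ConnectID: "web-ui"}
	s[first.ID] = first

	// Second connection resolves to the same server-assigned ID.
	second := &client{ID: "admin/7776", ConnectID: "web-ui"}

	_, byPacketID := s.findExisting(second.ConnectID) // key never stored: miss
	_, byEffectiveID := s.findExisting(second.ID)     // matches the stored key: hit

	fmt.Println(byPacketID, byEffectiveID) // prints: false true
}
```

In this model the lookup by the CONNECT packet ID never finds the first client, so nothing triggers a takeover.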
So I think this happens:
mqtt-1 | time=2024-07-19T14:18:53.411Z level=DEBUG msg="client disconnected" error="websocket: close 1006 (abnormal closure): unexpected EOF" client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 remote=192.168.156.4:53692 listener=ws
mqtt-1 | time=2024-07-19T14:18:53.413Z level=INFO msg="client disconnected" hook=ConnectionStatusHook client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 expire=true error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1 | time=2024-07-19T14:18:53.426Z level=WARN msg="" listener=ws error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1 | time=2024-07-19T14:18:54.125Z level=DEBUG msg="client disconnected" error="websocket: close 1006 (abnormal closure): unexpected EOF" client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 remote=192.168.156.4:49364 listener=ws
mqtt-1 | time=2024-07-19T14:18:54.125Z level=INFO msg="client disconnected" hook=ConnectionStatusHook client=admin/77768965-05bf-4f1f-91ed-44cb83b915e7 expire=true error="websocket: close 1006 (abnormal closure): unexpected EOF"
mqtt-1 | time=2024-07-19T14:18:54.129Z level=WARN msg="" listener=ws error="websocket: close 1006 (abnormal closure): unexpected EOF"
I will work on a PR that does this: if the client has an assigned client identifier, use that to look up existing clients instead of the client-defined client ID.
In the meantime let me know if there is a more appropriate fix.
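Roughly, the key selection I have in mind looks like the sketch below. `lookupKey` is a hypothetical helper written for this comment, not a function in the server, and the IDs are placeholders:

```go
package main

import "fmt"

// lookupKey picks the ID used to find an existing client.
// assignedID is the server-assigned identifier ("" if none was assigned);
// connectID is the ID from the client's CONNECT packet.
func lookupKey(assignedID, connectID string) string {
	if assignedID != "" {
		return assignedID
	}
	return connectID
}

func main() {
	fmt.Println(lookupKey("admin/7776", "")) // prints: admin/7776
	fmt.Println(lookupKey("", "sensor-1"))   // prints: sensor-1
}
```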
I have a confirmed fix by just using cl.ID, which has the correct assigned identifier, instead of pk.Connect.ClientIdentifier.
Will open a PR shortly.
I realized a few reasons why this may not have manifested to others before: OnConnectAuthenticate hook, manual client assignment is likely not too widespread (but crucial for our workflow).
@thedevop pk.Connect.ClientIdentifier is empty, yes, which satisfies MQTT-3.2.2-16! But now you are making me wonder if I should swap the PR back to the original implementation before I changed it, which just uses cl.ID, since it should be correct in all cases. That is, it will be the client-supplied ID normally, and the server-assigned one if the client does not supply one. (Maybe this is what you were implying entirely in 2.!)
Rather than do the if check on cl.Properties.Props.AssignedClientId, just reference cl.ID every time. I think that makes sense!
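As a simplified sketch of what the takeover step looks like when it always keys on cl.ID (the `server` and `client` types and this `inheritSession` are hypothetical stand-ins, not the actual code in the PR):

```go
package main

import "fmt"

// client is a hypothetical stand-in; ID is the effective client ID
// (client-supplied normally, server-assigned otherwise).
type client struct {
	ID     string
	closed bool
}

// server holds connected clients keyed by their effective ID.
type server struct {
	clients map[string]*client
}

// inheritSession looks up an existing client by cl.ID, stops it if found,
// and registers the new client. It reports whether a takeover happened.
func (s *server) inheritSession(cl *client) bool {
	existing, ok := s.clients[cl.ID]
	if ok {
		existing.closed = true // the old connection is taken over
	}
	s.clients[cl.ID] = cl
	return ok
}

func main() {
	s := &server{clients: map[string]*client{}}
	first := &client{ID: "admin/7776"}
	second := &client{ID: "admin/7776"}

	fmt.Println(s.inheritSession(first))  // prints: false
	fmt.Println(s.inheritSession(second)) // prints: true
	fmt.Println(first.closed)             // prints: true
}
```

Because the store and the lookup use the same key, no separate check on the assigned identifier is needed in this model.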
Edit: I updated https://github.com/mochi-mqtt/server/pull/417 and its description.
I feel using cl.ID in inheritClientSession is reasonable. However, it is mutable; perhaps that's the reason @mochi-co used pk.Connect.ClientIdentifier instead. Let's hear from him whether he's OK with changing that to use cl.ID.
@thedevop @SimonMacIntyre I think this is a good move. Originally I used pk.Connect.ClientIdentifier because it was immutable; however, I think it's safe to say our use case has changed significantly since then. Works for me 👍🏻
I am still debugging and investigating this, so hopefully it is not human error. Figured I would post in case anyone else is experiencing it or able to easily reproduce it.
Happens to me on versions > 2.6.0. I did not test lower than that because my code is not compatible without some changes, so I decided to try to investigate the issue (or my misconfiguration).
I am using protocol version 5 clients (mqttjs, specifically).
Here is my server config:
Here is my client settings:
Here is a screenshot of both my clients simultaneously connected and not being forcefully disconnected (also, no client takeover occurred according to my logs).
Now if I publish a message, only one of them gets it (the latest to connect, I think). Now this does seem like a client takeover.
So the 2 issues are:
Here is the code I am using to do that which used to work I think in older versions:
On the metrics page I can see:
Which is my 2 clients. Since neither is being disconnected, I guess that is why the client stop isn't being detected. What's also interesting is that the subscriptions count is actually only 1. Also, when I ctrl+c the 2 terminals with the connections, I get the debug log of each disconnecting (and using the same client ID).
So the root of the issue, I think, is the lack of the disconnect.
I'll try to see if I can find a fix or where the issue is!