Closed wagner-rebello closed 3 years ago
OK, there are 2 issues here, the client has never taken any action if the broker doesn't respond to its ping request if the connection status is connected, it doesn't disconnect for example. However, just because no one has asked for this thus far it doesn't mean we can't do this, I can add a setting to to keep alive processing that will disconnect if pings are not responded to, that way users can select the behaviour they want.
The second issue, as in why your callbacks are called when you turn your devices on etc. is nothing to do with the client. The client uses the Dart runtime to interface to the network stack, if the runtime says a socket is connected then the client assumes it is, when the runtime says a socket is disconnected then again it takes appropriate action dependant on user settings. If you want to know why/how the Dart runtime decides these things then I suggest you bring this up with the Dart Google guys.
Using wireshark and filtering for TCP i can see that TCP retransmission keep happening until the device is turned on when the connection is reseted. A setting to disconnect if keep alive don't respond would be cool to solve that problem.
Yes, I'll update the client to allow the user to select whether a broker failing to respond to a ping will cause a disconnect or not and chose the time period they want to wait.
Client updated to add the ping request response fail functionality, package re published at version 9.4.0.
See the API docs and the example mqtt_server_client_keep_alive_disconnect.dart for usage.
Now at 9.4.2
When the connection is lost with the new keepalive disconnection option the MQTT status correctly enter into disconnected but the socket stay opened waiting for a RST packet from server.
Sometimes this leaves to a state that a new connection replaces the variable "client" in file \connectionhandling\mqtt_client_mqtt_connection_base.dart
leaving the old socket in a unknown state and closing the new created connection with RST packet from the old sock.
This sometimes put the connection in a loop where it creates a new connection overwrite the old client reference with a new reference, receive a RST or FIN from the old connection closes the new and go on.
I think that it's better to close the TCP socket when the keep alive fails with the option "enabled".
After some diggin' I found out that this chunk of code connectionhandling\server\mqtt_client_synchronous_mqtt_server_connection_handler.dart
Is leaving lost references to the old TCP socks, when the new connection is created the "MqttServerConnection" inheritances creates new references, if old socks listeners capture data (Like RST or FIN) the data is then redirected to the new object references.
The old socks stays de-referenced and got FIN (or RST because of TCP Retransmission) by the server by the lack of keep alive (or TCP Retransmission), the old listener put the state of connection in disconnected, my code call the connection method again that de-reference the actual connection and go on...
I hear what you are saying, the old socket events are somehow redirected to the closures(onDone, onError etc.) of the new socket. But what can the client actually do here? Dart is a GC based language, only when GC runs are unused objects destroyed, even doing client == null probably won't force a GC at that point. All the client can do is call methods on the old socket to destroy it somehow. What do you mean by ' with the option "enabled".
"What do you mean by ' with the option "enabled".
Set the property ("Option") disconnectOnNoResponsePeriod
>= 1
Ok, where is this in the Dart API? The SocketOption class only has tcpNoDelay, apologies if I'm missing something here.
Examples of the problem using wireshark. As you can see in the line Nº 287 there was a "ping req" (port 48420) from the smartphone without response (I intentionally removed my device (broker) from power). onDisconnect callback runs without problem informing my code that MQTT connection dropped. After 3 seconds my code try another connection (Nº 323)(device still offline and flutter sent retransmissions for the SYN packet).
When my device turned on again it ACKed the SYN packet and the TCP connection opened (48423 port).
After some time the old socket tryed another retransmission from the lost ping request (Nº 1461 from below image) (port 48420, first connection)
As the method _disconnect() from file \connectionhandling\mqtt_client_mqtt_connection_base.dart
never got called when there is a ping response fail, the old sock still exists (port 48420), including the old listeners.
The device RSTed the connection (Nº 1464) because it was powered off 'cause the connection don't exist, that RST call the onDisconnect callback and set the connection as disconnected also Ping Req from the new connection do not happen anymore (Set to 15s) when my device stays more than 15s without ping req it ends the connection as expected and put the smartphone in this eternal loop.
"What do you mean by ' with the option "enabled". Set the property ("Option")
disconnectOnNoResponsePeriod
>= 1
This is the new option that you created to "disconnect" when there is no ping respones after some time.
Ah OK, its not a socket option, so back to my original point, how do you suggest the client 'disposes' of old references to its sockets? I can't think of a way to do this.
BTW issue #302 shows the disconnectOnNoResponsePeriod opytion working for the use case the OP had, so it does work in some, hopefully most cases.
Ah OK, its not a socket option, so back to my original point, how do you suggest the client 'disposes' of old references to its sockets? I can't think of a way to do this.
Maybe some chunk of code that can be called on internalConnect
method that verify existance of connection variable before re-instantiation and some public method that destroy the socks.
I'm creating a clean project to test how the TCP listener behave after destroying sockets.
Yes, destroy may be an option here, depending on how it works, we dont want destroy to call one of the closures like onDone etc. Your work here is interesting, please post back your results, thanks.
After the tests i can confirm. Calling destroy on the old sock (client variable in your lib) stop all callbacks from listen. onData, onError listeners from the old sock are destroyed in conjunction. But the onDone is called when destroying the old client.
You can cancel the listener before destroying the socket, this way none of the callbacks run.
Yes I remember now, I actually use this mechanism in the browser ws comnnection -
...
closeEvents = client.onClose.listen((e) {
MqttLogger.log(
'MqttBrowserWsConnection::connect - websocket is closed');
closeEvents.cancel();
errorEvents.cancel();
return completer.complete(MqttClientConnectionStatus());
});
errorEvents = client.onError.listen((e) {
MqttLogger.log(
'MqttBrowserWsConnection::connect - websocket has erred');
closeEvents.cancel();
errorEvents.cancel();
return completer.complete(MqttClientConnectionStatus());
});
...
Why it never occurred to me to apply this to the rest of the connection handlers I don't know.
I'll update the package to add this functionality.
OK, I've updated the client with the fix above, I've not re published this yet as we may find more problems here.
If you would point your pubspec.yaml to the git repo and set the branch to 'issue299-1' and carry on with your testing that would be great.
I've found a small bug when using disconnect on no ping response with auto reconnect and fixed it, if you are currently testing with this could you please do a pub upgrade, thanks.
Sure, i'll test.
It keeps entering in loop when the connection is lost by keep alive.
I attached some wireshark packet capture.
I created a fork in my personal account and did some changes on 9.4.2, in my fork after receiving the RST packets from the OLD connection the new stays opened.
https://github.com/ZantsuRocks/mqtt_client/commit/8db8e3bc0866d2d0b3de4ee2f20cf5fb4e7cc589
I didn't really understand the structure of your code, so I didn't create a pull request. My code is not 100% implemented either, but it serves as a basis perhaps for a fix.
Ok, could you just confirm when you do a pub upgrade that the branch it prints is 'issue-299-1' with a commit of 'a42065814a8adf08e8b49b9443c1b06649b1df63'.
Ok, could you just confirm when you do a pub upgrade that the branch it prints is 'issue-299-1' with a commit of 'a42065814a8adf08e8b49b9443c1b06649b1df63'.
Yes
I'm trying to replicate the issue again using your branch to tell you the exact steps that create the problem. But now the problem isn't happening again.
Gotcha.
The problem is with the timing between the ping req
and the ping resp
Ping response delayed something about 120ms to arrive and the connection
died for your handlers.
This created a duplicate connection and the old still open.
OK, I'm not really following you, what do you mean by ' the connection died for your handlers'? What is causing this, do you have a log?
When you say 'This created a duplicate connection and the old still open.', we now cancel the listener and destroy the socket so how does this happen?
I'll put some more logging around the destroy method to see exactly when it is called.
Put some log inside _disconnect()
in the MqttConnectionBase
class.
You'll see that when keep alive fails (There is no Ping Response to a Ping Request) that method never runs.
At least for me this is happening.
I'm put some log in MqttConnectionBase
class in the method _disconnect()
(line 80)
https://gist.github.com/wagner-rebello/e906a6af8b9aded8ac3346907173ff9e
This is what happens to my log (the red arrow is the onDisconnect callback)
As you can see the log on MqttConnectionBase
never runs leaving the sock listener and connection "open" but turning your status (MqttClientConnectionStatus
) disconnected.
As the connection may still be open TCP Retransmission keep ocurring the old listeners still exists too, wich kills the new connection.
Maybe this log helps more
The Red Arrow is when i turned off the power from the broker device. (onDisconnect callback from ping without response)
Blue Arrow is the new connection (onConnect callback after calling again connect as the actual status (MqttClientConnectionStatus) is now MqttConnectionState.disconnected).
Green Arrow is the old socket getting a TCP RST (TCP keeps sending Ping Request retransmissions until it get a FIN or a RST from the broker).
That Green Arrow put the MqttClientConnectionStatus again in MqttConnectionState.disconnected even that the new socket still exists, and this is where the reconnection loop go on forever.
Pull Request #308 fixes the connection loop for me. The old socket stay alive when there is a ping fail but the listeners references at least will be destroyed.
Thanks for this, merged the change, please pub upgrade and re test, thanks.
I've added a small update, hopefully it should now close the socket before we re connect, please pub upgrade and see what you get now.
Sure, i'll test.
Changes were tested by 3 people together with me and we didn't find any more problems.
Thanks, I've merged the branch, finished my own testing and re published the package at version 9.5.0.
Thanks for the testing you have done here, I know how time consuming this can be, this helps you of course in your use case but the fixes identified in this issue help all the users which is great.
I'll close this issue here, its got rather lengthy, if you find anything else please raise a separate issue, we can always refer to this one if needed.
I'm using a list of connections to connect to many brokers in my local network (Each Broker is a device). When i unplug power from one device (to simulate a connection loss) none of my callbacks run and the connection status stays connected for that broker/device. Using wireshark i can see that ping never got a response. When i turn on the device again ALL the callbacks run at the same time that the connection is reestablished. (I have a timer that checks connection status, the status turn to disconnected at that moment)
As the numbers in the image below:
Codes for creating the connections:
Turning autoReconnect to true doesn't help.