eclipse / paho.mqtt.java

Eclipse Paho Java MQTT client library. Paho is an Eclipse IoT project.
https://eclipse.org/paho
Other
2.1k stars 880 forks source link

Client disconnects on date/time changes (paho dependent on monotonic time) #958

Open nunoOliveiraqwe opened 1 year ago

nunoOliveiraqwe commented 1 year ago

Client gets disconnected by the broker when a client system date/time change occurs. An exception is thrown by the client, with the following stack trace:

org.eclipse.paho.client.mqttv3.MqttException: Timed out waiting for a response from the server at org.eclipse.paho.client.mqttv3.internal.ExceptionHelper.createMqttException(ExceptionHelper.java:31) ~[org.eclipse.paho.client.mqttv3-1.2.5.jar:?] at org.eclipse.paho.client.mqttv3.internal.ClientState.checkForActivity(ClientState.java:747) ~[org.eclipse.paho.client.mqttv3-1.2.5.jar:?] at org.eclipse.paho.client.mqttv3.internal.ClientComms.checkForActivity(ClientComms.java:818) ~[org.eclipse.paho.client.mqttv3-1.2.5.jar:?] at org.eclipse.paho.client.mqttv3.internal.ClientComms.checkForActivity(ClientComms.java:804) ~[org.eclipse.paho.client.mqttv3-1.2.5.jar:?] at org.eclipse.paho.client.mqttv3.TimerPingSender$PingTask.run(TimerPingSender.java:79) ~[org.eclipse.paho.client.mqttv3-1.2.5.jar:?] at java.util.TimerThread.mainLoop(Unknown Source) ~[?:?] at java.util.TimerThread.run(Unknown Source) ~[?:?]

Looking at the offending code:

image

It's possible that when time changes the exception at line 747(checkActivity method) might be thrown if a ping from the broker is not received between the time change and the method call. This seems to be caused by the used of the HighResolutionTimer which is just a wrapper class around System.nanoTime(), which is not monotonic.

After the exception is thrown, the ping task will get cancelled, and the shutdown method will get called, if autoReconnect is set to true, it will try to reconnect, however, it will call checkActivity again which will throw the same exception again if the broker has not sent a PINGREQ yet (which would update with an lastOutboundActivity).

If this happens the broker will eventually drop the client due to a timeout, rightly so, since the client disconnect because if thought that the broker didn't send a PING REQ on time.

Logs from the broker (mosquitto):

1662476342: New client connected from ::1 as Eilinel991376029073 (c1, k60, u'bbb') 1662476342: No will message specified.
1662476342: Sending CONNACK to Eilinel991376029073 (0, 0)
.........................
1662476342: Sending SUBACK to Eilinel991376029073
..........
1662476342: Sending SUBACK to Eilinel991376029073 ...... 1662476402: Received PINGREQ from Eilinel991376029073 1662476402: Sending PINGRESP to Eilinel991376029073 ............. Eiline client System date time change (+1 hour) ........ 1662476462: Socket error on client Eilinel991376029073, disconnecting.

Aditional note: The HighResolutionTimer can be specific via constructor, but only for the async client, the sync one makes no provision for this.

If paho is dependent on timestamps, then the HighResolutionTimer should be a monotonic clock source.

haixiazhang commented 1 year ago

So... are there any fixing about this bug?