eclipse / paho.mqtt.java

Eclipse Paho Java MQTT client library. Paho is an Eclipse IoT project.
https://eclipse.org/paho
Other
2.12k stars 884 forks source link

Disconnected Message Buffer sometimes overruns the in flight window #582

Closed jpwsutton closed 5 years ago

jpwsutton commented 6 years ago

Please fill out the form below before submitting, thank you!

If this is a bug regarding the Android Service, please raise the bug here instead: https://github.com/eclipse/paho.mqtt.android/issues/new

After narrowing down in tests, I've identified the issue causing the OfflineBufferingTest class to intermittently fail in Travis and Jenkins.


20180910 082152.949 
20180910 082152.950 *************************************************************
20180910 082152.950 * OfflineBufferingTest.testManyMessageBufferAndDeliver
20180910 082152.950 *************************************************************
20180910 082152.963 [CPMS Proxy] - Enabling Proxy
20180910 082152.964 [CPMS Proxy] - Waiting for incoming connection..
20180910 082152.965 [CPMS Proxy] - Client Opened Connection to Proxy...
20180910 082152.965 Proxy: Proxy Connected to Server
20180910 082152.971 First Connection isConnected: true
20180910 082152.971 [CPMS Proxy] - Disabling Proxy
20180910 082152.971 [CPMS Proxy] - killOpenSockets Called.
20180910 082152.971 [CPMS Proxy] - IOException in server to client stream: Socket closed
20180910 082152.971 [CPMS Proxy] - IOException in client to server stream: Socket closed
20180910 082152.971 [CPMS Proxy] - 
20180910 082153.071 Proxy Disconnect isConnected: false
20180910 082153.083 [CPMS Proxy] - Enabling Proxy
20180910 082153.083 [CPMS Proxy] - Waiting for incoming connection..
20180910 082153.975 [CPMS Proxy] - Client Opened Connection to Proxy...
20180910 082153.976 Proxy: Proxy Connected to Server
20180910 082154.087 Proxy Re-Enabled isConnected: true
20180910 082154.207 null: Error occurred attempting to publish buffered message due to disconnect. Exception: 32,202:Too many publishes in progress. <---- Important Bit!
org.eclipse.paho.client.mqttv3.test.utilities.MqttV3Receiver:146  No message received in waitMilliseconds=60000
20180910 082259.111 [CPMS Proxy] - Disabling Proxy
20180910 082259.112 [CPMS Proxy] - killOpenSockets Called.
20180910 082259.113 [CPMS Proxy] - IOException in client to server stream: Socket closed
20180910 082259.113 [CPMS Proxy] - IOException in server to client stream: Socket closed
20180910 082259.114 [CPMS Proxy] - 

It looks like the Disconnected Message Buffer accidentally its the In flight message limit when publishing messages after reconnecting, this is in spite of the check that exists in ClientComms.ReconnectDisconnectedBufferCallback.publishBufferedMessage where there is a specific that will then Yield to other threads if we have hit the window.

The problem is that even with this check, the callback still manages to publish a message that exceeds the max inflight limit.

jpwsutton commented 6 years ago

Fixing the test to have an max inflight window of 100 is a temporary fix to this issue. As this issue hasn't been picked up by anyone in their own applications as far as I can see, I'm going to put this on the backlog for the next release. Once this has been fixed though, the testManyMessageBufferAndDeliver test needs to have the max inflight setting removed.

jpwsutton commented 6 years ago

I've added a possible fix in the v5 client as well, this will prevent publishes being sent until there are 4 spaces in the inflight window. This seemed to work for me, we could dynamically size this based on the qos of the message being sent?

icraggs commented 5 years ago

That change doesn't fix the issue in the V5 client, I assume that setting maxInflight in the test will avoid the issue, but not fix it.

icraggs commented 5 years ago

I've added a fix for this in the V5 client - a similar approach should work for V3 too. Basically it retries the publish when this condition occurs.

icraggs commented 5 years ago

Released in 1.2.1