fusesource / mqtt-client

A Java MQTT Client
http://mqtt-client.fusesource.org/
Apache License 2.0

Client does not recover from lost network connection #28

Open mfeingold opened 10 years ago

mfeingold commented 10 years ago

Steps to replicate:

  1. Establish the connection:

         MQTT mqtt = new MQTT();
         mqtt.setHost(host, port);
         mqtt.setUserName(user);
         mqtt.setPassword(password);
         mqtt.setClientId("TestClient_1");
         mqtt.setCleanSession(false); // durable topic
         CallbackConnection connection = mqtt.callbackConnection();
         etc...

  2. Run the program and let it establish a connection. You can watch pings coming and going.
  3. Pull the wire and let the connection time out on both ends.
  4. Re-establish the network connection.

At this point the client tries to reconnect. In the tracer I can see the connecting message and the 'login accepted' message, and it even receives the messages that were waiting in the queue. But then, out of nowhere, comes another connect message: it tries to log in again, and this time the login is rejected, obviously because the connection uses the same client ID.

I ran this scenario on multiple computers and the effects differ. Sometimes it just works.

Also, if instead of pulling the wire I close or restart the broker, it works fine.

Guys, this seems to be a showstopper. I like your client, but I will have to switch to Paho if I cannot resolve this promptly.

behrad commented 9 years ago

@mfeingold what happened with your case? Did you switch to Paho, or to a newer version of mqtt-client?

mfeingold commented 9 years ago

I switched to Paho.

behrad commented 9 years ago

We are using mqtt-client in beta production on Android devices and have not seen a problem yet when switching wifi/data on and off (no wire to pull out!).

I like the auto-reconnection feature, which Paho lacks.

mfeingold commented 9 years ago

We have used Paho in prod for over a year now. We went through some trouble, but we are in good shape now: we have had 1500 Android devices connected since last fall and plan to deploy up to 20000 by the end of the year.

As to the reconnect logic: yes, it is not there. I wrote the reconnect logic myself, as well as temporary storage for the messages sent out while the connection is down.
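
For readers wondering what such hand-written logic might look like, here is a minimal sketch. It is not the code discussed in this thread and does not use the Paho or mqtt-client APIs. `BufferingSender` and `Transport` are hypothetical names made up for illustration, and the retry delays and buffer limit are arbitrary assumptions. The idea is to queue outgoing messages while the link is down, retry the connect with a growing delay, and flush the queue once the connect succeeds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: buffers outgoing messages while disconnected and
// flushes them, in order, after a successful reconnect.
class BufferingSender {

    // Hypothetical transport; a real one would wrap an MQTT client connection.
    interface Transport {
        boolean isConnected();
        void connect() throws Exception;
        void send(String topic, String payload) throws Exception;
    }

    private final Transport transport;
    private final Deque<String[]> outbox = new ArrayDeque<>();
    private final int maxBuffered;

    BufferingSender(Transport transport, int maxBuffered) {
        this.transport = transport;
        this.maxBuffered = maxBuffered;
    }

    // Queue the message, dropping the oldest one if the buffer is full,
    // then try to send whatever is queued.
    synchronized void publish(String topic, String payload) {
        if (outbox.size() >= maxBuffered) {
            outbox.pollFirst();
        }
        outbox.addLast(new String[] {topic, payload});
        flush();
    }

    // Retry connect() up to maxAttempts times, doubling the delay after each
    // failure (capped at 30 s). Returns true once connected and flushed.
    boolean reconnect(int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                transport.connect();
                flush();
                return true;
            } catch (Exception e) {
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
                delay = Math.min(delay * 2, 30_000L);
            }
        }
        return false;
    }

    // Send queued messages oldest first; a message leaves the queue only
    // after send() returns without throwing.
    private synchronized void flush() {
        while (transport.isConnected() && !outbox.isEmpty()) {
            String[] message = outbox.peekFirst();
            try {
                transport.send(message[0], message[1]);
                outbox.pollFirst();
            } catch (Exception e) {
                return; // keep the message for the next flush
            }
        }
    }

    synchronized int pending() {
        return outbox.size();
    }
}
```

In this sketch the buffer lives in memory only, so queued messages are lost if the process dies. A production version would likely persist the outbox and call `reconnect` from whatever callback reports the lost connection.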

behrad commented 9 years ago

Are your contributions open source, @mfeingold?

mfeingold commented 9 years ago

I just made an older version of my code public here. Our current version is too tightly coupled with the customer application. I keep hoping I will find time to port the changes back. It might be that your request will finally push me over :).

Also, the documentation is sorely lacking, so feel free to ask questions.

behrad commented 9 years ago

:+1: @mfeingold waiting to hear more about it.