pusher / pusher-websocket-java

Pusher Channels client library for Java targeting general Java and Android

Pusher Reconnection issue #197

Open Salim1993 opened 5 years ago

Salim1993 commented 5 years ago

What is the issue?

Pusher no longer reconnects after long periods of time. It stays disconnected even after I listen for the status change and call pusher reconnect. I'm not too sure what to do at this point, since we need Pusher to be reliable for long stretches, let's say 8 hours. From what I've read in the tickets, it seems there is a max retry count, so Pusher stops trying to connect after 6 attempts. Is this true, and is it maybe what's causing my issues? Any solutions or tips would be greatly appreciated.

This is the code that I use to listen for disconnect statuses and then reconnect.

https://gist.github.com/Salim1993/acd765e1049bf0a74293a2798c850492

I had to use Gist because I couldn't get the code formatting to work properly. Let me know if you can't access the gist.

Is it a crash report? Submit stack traces or anything that you think would help

Not a crash, Pusher just stops responding to events after a while. After watching the Pusher dashboard for events for a long time, I realized that the Pusher socket disconnects but never reconnects again. ...

CC @pusher/mobile

Salim1993 commented 5 years ago

Actually, maybe it's the same as this guy's problem:

https://github.com/pusher/pusher-websocket-java/issues/186

Salim1993 commented 5 years ago

Also, I think calling this code over and over again is causing my connections to build up. I need to figure out a better way to make sure Pusher stays connected.

kn100 commented 5 years ago

Hi @Salim1993, I've investigated this in the following way:

I built a test app that just tries to connect to one of our clusters making use of this library. I blocked my machine from being able to talk to this cluster via the hosts file.

After starting the app, I watched the reconnection attempts build up.

[153] Connection state changed from [DISCONNECTED] to [CONNECTING]
[5185] Connection state changed from [CONNECTING] to [RECONNECTING]
[6192] Connection state changed from [RECONNECTING] to [CONNECTING]
[11199] Connection state changed from [CONNECTING] to [RECONNECTING]
[15200] Connection state changed from [RECONNECTING] to [CONNECTING]
[20204] Connection state changed from [CONNECTING] to [RECONNECTING]
[29205] Connection state changed from [RECONNECTING] to [CONNECTING]
[34214] Connection state changed from [CONNECTING] to [RECONNECTING]
[50214] Connection state changed from [RECONNECTING] to [CONNECTING]
[55224] Connection state changed from [CONNECTING] to [RECONNECTING]
[80225] Connection state changed from [RECONNECTING] to [CONNECTING]
[85236] Connection state changed from [CONNECTING] to [RECONNECTING]
[115236] Connection state changed from [RECONNECTING] to [CONNECTING]
[120249] Connection state changed from [CONNECTING] to [DISCONNECTING]
[120252] Connection state changed from [DISCONNECTING] to [DISCONNECTED]

As we can see here, we attempt to connect numerous times before giving up and moving to the DISCONNECTED state.
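Reading the timestamps in the log, the waits between each RECONNECTING state and the next CONNECTING state are roughly 1, 4, 9, 16, 25 and 30 seconds. That is consistent with a wait of attempt squared seconds, capped at 30 seconds, over 6 retries. The sketch below only reproduces that schedule; the formula is an inference from the log, not something confirmed from the library source, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSketch {
    /**
     * Wait before retry number `attempt` (1-based).
     * Assumption inferred from the log: attempt squared, capped at maxGapSeconds.
     */
    public static long delaySeconds(int attempt, long maxGapSeconds) {
        return Math.min(maxGapSeconds, (long) attempt * attempt);
    }

    /** Waits for retries 1..maxAttempts, in order. */
    public static List<Long> schedule(int maxAttempts, long maxGapSeconds) {
        List<Long> delays = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            delays.add(delaySeconds(attempt, maxGapSeconds));
        }
        return delays;
    }

    public static void main(String[] args) {
        // 6 retries with a 30 second cap, matching the gaps seen in the log.
        System.out.println(schedule(6, 30)); // prints [1, 4, 9, 16, 25, 30]
    }
}
```

Under this assumption, the sixth wait would be 36 seconds uncapped, so the 30 second gap in the log is the cap taking effect.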

Next, I added some code to the onConnectionStateChange handler, which will be triggered when the connection state changes to DISCONNECTED. This code will call pusher.connect() once the state moves to disconnected - in effect making this retry loop permanent.

I left it running for around half an hour, at which point we were still (unsurprisingly) attempting to reconnect.
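A handler like this can fire more than once, and the concern raised earlier in the thread was that repeated reconnect calls make connections build up. One possible way to keep the loop down to a single connect request per drop is sketched below. It deliberately does not use the library: `Reconnectable` and `ReconnectGuard` are hypothetical stand-ins, and the `State` enum only mirrors the state names from the log (plus CONNECTED), so the wiring into a real listener is left as an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ReconnectGuardSketch {
    /** Hypothetical stand-in for the one client call this sketch needs. */
    public interface Reconnectable {
        void connect();
    }

    /** Mirrors the state names seen in the log; not the library's own type. */
    public enum State { CONNECTING, CONNECTED, RECONNECTING, DISCONNECTING, DISCONNECTED }

    public static final class ReconnectGuard {
        private final Reconnectable client;
        private final AtomicBoolean reconnectRequested = new AtomicBoolean(false);

        public ReconnectGuard(Reconnectable client) {
            this.client = client;
        }

        /** Call this from the state-change handler with the new state. */
        public void onStateChange(State current) {
            if (current == State.DISCONNECTED) {
                // Only the first DISCONNECTED notification triggers connect().
                if (reconnectRequested.compareAndSet(false, true)) {
                    client.connect();
                }
            } else if (current == State.CONNECTING || current == State.CONNECTED) {
                // The request was picked up; allow a new one after the next drop.
                reconnectRequested.set(false);
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ReconnectGuard guard = new ReconnectGuard(() -> calls[0]++);
        guard.onStateChange(State.DISCONNECTED);
        guard.onStateChange(State.DISCONNECTED); // duplicate, ignored
        guard.onStateChange(State.CONNECTING);
        guard.onStateChange(State.DISCONNECTED); // new drop, reconnect again
        System.out.println(calls[0]); // prints 2
    }
}
```

The guard reuses one client object instead of creating a new one per retry, which is the design choice meant to avoid piling up connections.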

I've noticed two issues though.

  1. The state machine behaves inconsistently. Initially, it cycles between reconnecting and connecting. After the first set of retries, once we move to disconnected, it instead cycles between Disconnecting > Disconnected > Connecting. I suspect this is due to the retry counter not being reset or similar. I am going to investigate this next. This will hopefully address #189 and #186. I've made a new mega issue #199.
  2. I think the retry attempts should be configurable. I will investigate making it so. I've made mega issue #200, and this work will hopefully address your issue.

Let me know if you've noticed anything else, I'm actively investigating this today.

kn100 commented 5 years ago

Hi,

I've just discovered that we already support configuring the reconnection attempts! I am going to document this in the README to make it more obvious, but for now, on your PusherOptions object, call these two methods: pusherOptions.setMaxReconnectionAttempts(2).setMaxReconnectGapInSeconds(30);

setMaxReconnectionAttempts() sets the maximum number of times a given connection will be retried. setMaxReconnectGapInSeconds() sets the upper bound for our exponential backoff connection strategy. This is the maximum number of seconds the library will wait before retrying a new connection. I recommend setting this to a reasonably high value; by default it is 30 seconds.

There is a bug in the library relating to reconnection logic which means it only works once. I have addressed this in this PR: https://github.com/pusher/pusher-websocket-java/pull/201 and added more documentation in this PR: https://github.com/pusher/pusher-websocket-java/pull/202

I will try to get these merged in asap.

Once again, thanks for your report.

Salim1993 commented 5 years ago

@kn100 Thanks for the info. Let me know when the merge happens, as I would be glad to check it out. Also, I believe my problem has happened a lot less since upgrading to the latest version of the Pusher library. Still, I would like to see the newly merged changes, as they would allow for more stability in my app.