taoensso / sente

Realtime web comms library for Clojure/Script
https://www.taoensso.com/sente
Eclipse Public License 1.0

How to detect loss of network connection client-side in Chrome #259

Closed mbech closed 1 year ago

mbech commented 8 years ago

I'm having difficulty identifying when the websocket connection is down in Chrome. I've looked through the docs and closed issues (some similarities to #230), but wasn't able to find a solution or insights into this particular issue. I hope to confirm that I'm not overlooking an existing Sente feature/option that could help.

Scenario:

Server deployed on Heroku, client logs in, websocket established/first-open, successfully sending messages back/forth. Manually disable wifi on client, cutting the connection.

Expected behavior (both Safari and Firefox):

Within a few seconds (~2-5) the chsk-state updates to reflect that the connection is no longer :open?. My add-watch on chsk-state picks up the change and renders a notification to the user. Sente begins trying to reconnect every few seconds (longer and longer pauses between attempts). Once the network connection is back up, Sente reestablishes a websocket on next retry attempt.

Unexpected actual behavior (Chrome):

No change to chsk-state occurs, at least in the 5 minutes I've let it sit after disabling wifi. No event-msgs or errors/warnings appear to be thrown. Once the network is back up, websocket is reestablished and works, but I haven't been able to find anything to watch/capture to inform the client when it's disconnected.

Using latest Chrome Version 52.0.2743.116 (64-bit) on OSX El Capitan 10.11.6

Notes:

After researching, it seems this may be due to Chrome not throwing an "onClose" event when the connection dies, as opposed to Firefox/Safari which do throw that event (which I think Sente captures and reflects in the chsk-state)?

To get around it, I've considered setting up a ping/pong loop from client to server that fires every few seconds, and assuming the connection is down if the client misses a few pongs in a row. But I figured it was worth asking whether I'd overlooked something before building out that workaround, especially since keeping an eye on chsk-state works so well in Firefox/Safari.

ptaoussanis commented 8 years ago

Hi there. Thanks for the detailed, clear report; that was very welcome.

Just have my hands full today so haven't confirmed, but think you've identified a bug here related to #230.

Basically: #230 addressed the problem on the server side, and looks like we'll need an equivalent strategy on the client end to reliably identify these kinds of sudden disconnects across all browsers/devices. Didn't realise Chrome behaved this way.

Since we already have the server broadcasting pings at the necessary interval, we just need to mod the client with a simple timer to check if it's received any comms from the server w/in a timeout window.
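A minimal sketch of the timer idea described above, written in TypeScript purely for illustration. This is not Sente code: the `ReceiveWatchdog` class and its method names are hypothetical, and the clock is passed in explicitly so the logic can be checked without real timers.

```typescript
// Hypothetical helper, not part of Sente: remembers when the client last
// heard anything from the server and reports whether the link looks dead.
class ReceiveWatchdog {
  private readonly timeoutMs: number;
  private lastReceivedAt: number;

  constructor(timeoutMs: number, nowMs: number) {
    this.timeoutMs = timeoutMs;
    this.lastReceivedAt = nowMs;
  }

  // Call for every inbound message, including server keep-alive pings.
  noteReceived(nowMs: number): void {
    this.lastReceivedAt = nowMs;
  }

  // True once nothing has arrived for longer than the timeout window.
  isStale(nowMs: number): boolean {
    return nowMs - this.lastReceivedAt > this.timeoutMs;
  }
}
```

In a browser, one would presumably call `noteReceived(Date.now())` from the socket's message handler and poll `isStale(Date.now())` from a `setInterval` callback. The timeout needs to be comfortably longer than the server's ping interval, otherwise a healthy but quiet connection would be flagged as down.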

Should be a simple fix, just fully occupied for the next week or two. Will try get to this asap.

> I've considered setting up a ping/pong loop from client to server that fires every few seconds, and assume the connection is down if the client misses a few pongs in a row.

That sounds reasonable as a workaround in the meantime.

Cheers!

ptaoussanis commented 8 years ago

Quick note, if you're not against running a fork in the meantime: it might be easier for you to just use the built-in pings. You'll need to fork to make them available to your app:

https://github.com/ptaoussanis/sente/blob/ab45ef8839b0df17acd88cc2fcc992963e57b1c1/src/taoensso/sente.cljc#L1044

mbech commented 8 years ago

Sounds good. Thanks for confirming that I wasn't overlooking an existing config or other solution for this particular situation.

I'll read through the current Sente ws-ping keep-alive code more closely. If adding a "pong" server response there makes sense I'll create a fork, otherwise it should be straightforward to add to my existing app as a standard chsk-send!/event-msg-handler set on a few second interval.

Thank you again for the timely response. This is the first project I've used Sente in and it's been a great experience (especially with the help of the awesome readme and all the examples).

ptaoussanis commented 8 years ago

Please note that a pong response isn't what you want since when the connection's down, the client won't receive the ping (pong response).

Had a free moment yesterday so quickly sketched out one approach at https://github.com/ptaoussanis/sente/commit/7ef597103b2c4005155428aef17c9553ef8cbb9f

Entirely untested, but may give you a starting point if you want to do your own pings. Basically, just need to send a client->server ping every x seconds if there haven't been any other messages sent or received in that time. The server can ignore the ping, the point would be for the client to attempt a send that will help it identify a broken connection.
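The "ping only when idle" rule described above can be sketched as follows. This is an illustrative TypeScript outline, not the code from the linked commit: `IdlePinger`, `noteActivity`, and `tick` are hypothetical names, and `sendPing` stands in for whatever actually writes to the socket.

```typescript
// Hypothetical sketch, not Sente's implementation: send a client->server
// ping only when nothing has been sent or received for idleMs.
class IdlePinger {
  private readonly idleMs: number;
  private readonly sendPing: () => void;
  private lastActivityAt: number;

  constructor(idleMs: number, nowMs: number, sendPing: () => void) {
    this.idleMs = idleMs;
    this.sendPing = sendPing;
    this.lastActivityAt = nowMs;
  }

  // Call whenever any message is sent or received on the connection.
  noteActivity(nowMs: number): void {
    this.lastActivityAt = nowMs;
  }

  // Call periodically from a timer. Sends a ping only if the connection
  // has been idle for at least idleMs; returns whether a ping was sent.
  tick(nowMs: number): boolean {
    if (nowMs - this.lastActivityAt < this.idleMs) return false;
    this.sendPing();
    this.lastActivityAt = nowMs; // the ping itself counts as activity
    return true;
  }
}
```

The server does not need to answer. As noted above, the point is only that the client attempts a send, which gives the browser a chance to notice that the connection is broken.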

ptaoussanis commented 8 years ago

Have pushed [com.taoensso/sente "1.11.0-alpha3"] to Clojars which adds a client-side :ws-kalive-ms option (defaults to 20s).

Haven't had an opportunity to test yet, please let me know if this addresses your issue? Thanks!

ptaoussanis commented 8 years ago

Any update on this? May I close?

ptaoussanis commented 8 years ago

Will assume that this is resolved.

rafd commented 7 years ago

In response to the request for feedback (https://github.com/ptaoussanis/sente/issues/259#issuecomment-245492379), here is a report on some tests I've done:

Summary:

If the intent of :ws-kalive-ms is to provide an upper-bound on "time to detect disconnect", then the implementation will need to be updated.

Given the behaviour of Chrome, one way to implement such a bound in sente would be for the client to ping every X seconds, and then trigger a disconnect if no pong response is detected within Y seconds.
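To make the X/Y proposal concrete, here is a hedged TypeScript sketch of the bookkeeping it implies. It does not reflect how Sente implements anything: `PingPongMonitor`, `PingAction`, and the method names are hypothetical, and time is injected so the behaviour is deterministic.

```typescript
// Hypothetical sketch of "ping every X, disconnect if no pong within Y".
type PingAction = "none" | "send-ping" | "disconnect";

class PingPongMonitor {
  private readonly pingEveryMs: number; // X
  private readonly pongTimeoutMs: number; // Y
  private lastPingAt: number;
  private awaitingPongSince: number | null = null;

  constructor(pingEveryMs: number, pongTimeoutMs: number, nowMs: number) {
    this.pingEveryMs = pingEveryMs;
    this.pongTimeoutMs = pongTimeoutMs;
    this.lastPingAt = nowMs;
  }

  // Call when a pong (or any reply to the outstanding ping) arrives.
  notePong(): void {
    this.awaitingPongSince = null;
  }

  // Call periodically; the caller acts on the returned action.
  tick(nowMs: number): PingAction {
    if (this.awaitingPongSince !== null) {
      // A ping is outstanding: give up once Y has elapsed without a pong.
      return nowMs - this.awaitingPongSince >= this.pongTimeoutMs
        ? "disconnect"
        : "none";
    }
    if (nowMs - this.lastPingAt >= this.pingEveryMs) {
      this.lastPingAt = nowMs;
      this.awaitingPongSince = nowMs;
      return "send-ping";
    }
    return "none";
  }
}
```

Under this scheme the time to detect a dead connection is bounded by roughly X + Y plus the granularity of the timer that calls `tick`, independent of when the browser itself fires a close event.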

[Edit: removed some preliminary data, added more rigorous data in a comment below]

danielcompton commented 7 years ago

Thanks for the testing, where did you test it? On localhost, or over the internet? The reason I ask is that I could imagine Safari might have different behaviour over the internet than on localhost if it's listening to the OS network adapter.

rafd commented 7 years ago

@danielcompton I tested over the internet (both deployed to a VPS, and also running in dev on a different machine on the same network).

I will do some tests in a few hours where the server's network connection drops (vs. my previous tests, which were all for the client).

ptaoussanis commented 7 years ago

Hey Rafal, thanks a lot for the detailed info! Don't have an opportunity to look into this right away, but will try make it a priority next time I'm doing a batch of open-source work.

Any additional tests/details/conclusions you (or others) can put together in the meantime would of course be a huge help!

Cheers :-)

rafd commented 7 years ago

I did some more testing on disconnect-detection with the network being disabled on the server-side and client-side. Here are my results:

Time for Client to Notice an Abnormal Client-side Network Disconnect

| Browser | w/ keep-alive | w/o keep-alive | After next message |
| --- | --- | --- | --- |
| Chrome | 45 sec | infinite? (> 4 min) | 0 sec |
| Firefox | 10 sec | 10 sec | n/a |
| Safari | 0 sec | 0 sec | n/a |

Time for Client to Notice an Abnormal Server-side Network Disconnect

| Browser | w/ keep-alive | w/o keep-alive | After next message |
| --- | --- | --- | --- |
| Chrome | 45-120 sec | infinite? (> 4 min) | 1-3 min |
| Firefox | 45-60 sec | infinite? (> 4 min) | 2-3 min |
| Safari | 25 sec | infinite? (> 4 min) | 55 sec |

Comments:

Raw data: https://docs.google.com/spreadsheets/d/1OVXbNPN2-TNRBQvnmhKK8IXM_vQ0NiD7ptUEijpQWeo/edit?usp=sharing

[1] https://bugs.chromium.org/p/chromium/issues/detail?id=197841
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=76358

rafd commented 7 years ago

I've updated my post above w/ new data.

With the client-side :ws-kalive-ms keep-alive, each browser will eventually notice a disconnect (which wasn't the case before).

However, the time to detect varies greatly between browsers (and even for the same browser), so it would be nice for Sente users to have a way to optionally set their own, more aggressive and standardized threshold. Since that is a further extension, perhaps it deserves its own issue?

ptaoussanis commented 7 years ago

Sorry for the delay handling this, have been swamped lately. Will make this my first priority next time I've got some time to batch work on Sente!

ptaoussanis commented 1 year ago

This should hopefully be addressed by the forthcoming v1.18 release.

Will ping when the first public alpha is out, and keep this issue open until there's general consensus that the issue is adequately resolved.

Apologies for the long delay on this!

ptaoussanis commented 1 year ago

Think it's safe to now close this.