Hi there. Thanks for the detailed, clear report- that was very welcome.
Just have my hands full today so haven't confirmed, but think you've identified a bug here related to #230.
Basically: #230 addressed the problem on the server side, and looks like we'll need an equivalent strategy on the client end to reliably identify these kinds of sudden disconnects across all browsers/devices. Didn't realise Chrome behaved this way.
Since we already have the server broadcasting pings at the necessary interval, we just need to mod the client with a simple timer to check if it's received any comms from the server w/in a timeout window.
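A minimal sketch of that timer check, not Sente's actual implementation. It assumes the client records a timestamp whenever anything arrives from the server; the names `last-recv-ms`, `note-recv!` and `server-silent?` are hypothetical.

```clojure
;; Hypothetical sketch: none of these names are part of Sente's API.
(def last-recv-ms
  "Timestamp (ms) of the most recent message from the server, or nil."
  (atom nil))

(defn note-recv!
  "Call from the client's receive handler for every server message, pings included."
  [now-ms]
  (reset! last-recv-ms now-ms))

(defn server-silent?
  "True when nothing has arrived from the server within `timeout-ms`."
  [last-ms now-ms timeout-ms]
  (and (some? last-ms)
       (> (- now-ms last-ms) timeout-ms)))
```

A periodic timer on the client would call `server-silent?` with the current time and treat a `true` result as a dropped connection.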
Should be a simple fix, just fully occupied for the next week or two. Will try get to this asap.
> I've considered setting up a ping/pong loop from client to server that fires every few seconds, and assume the connection is down if the client misses a few pongs in a row.
That sounds reasonable as a workaround so long.
Cheers!
Quick note if you're not against running a fork so long- it might be easier for you to just use the built-in pings. You'll need to fork to make them available to your app:
Sounds good. Thanks for confirming that I wasn't overlooking an existing config or other solution for this particular situation.
I'll read through the current Sente `ws-ping` keep-alive code more closely. If adding a "pong" server response there makes sense I'll create a fork, otherwise it should be straightforward to add to my existing app as a standard `chsk-send!`/`event-msg-handler` set on a few second interval.
Thank you again for the timely response. This is the first project I've used Sente in and it's been a great experience (especially with the help of the awesome readme and all the examples).
Please note that a pong response isn't what you want since when the connection's down, the client won't receive the ping (pong response).
Had a free moment yesterday so quickly sketched out one approach at https://github.com/ptaoussanis/sente/commit/7ef597103b2c4005155428aef17c9553ef8cbb9f
Entirely untested, but may give you a starting point if you want to do your own pings. Basically, just need to send a client->server ping every x seconds if there haven't been any other messages sent or received in that time. The server can ignore the ping, the point would be for the client to attempt a send that will help it identify a broken connection.
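The idle check described above could look roughly like this. It is a sketch under assumptions, not the code in the linked commit: `ping-due?` and `maybe-ping!` are hypothetical names, and `send-ping!` stands for whatever function the app uses to send a message to the server.

```clojure
;; Hypothetical helper names; all timestamps are in milliseconds.
(defn ping-due?
  "True when no message has been sent or received for at least `kalive-ms`."
  [{:keys [last-sent-ms last-recv-ms]} now-ms kalive-ms]
  (let [last-activity (max (or last-sent-ms 0) (or last-recv-ms 0))]
    (>= (- now-ms last-activity) kalive-ms)))

(defn maybe-ping!
  "Calls `send-ping!` (a caller-supplied fn) only when the connection has
  been idle. Returns true if a ping was sent, false otherwise."
  [activity now-ms kalive-ms send-ping!]
  (if (ping-due? activity now-ms kalive-ms)
    (do (send-ping!) true)
    false))
```

The server does not need to answer; the attempted send is what gives the browser a chance to notice the broken connection.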
Have pushed `[com.taoensso/sente "1.11.0-alpha3"]` to Clojars, which adds a client-side `:ws-kalive-ms` option (defaults to 20s).
Haven't had an opportunity to test yet, please let me know if this addresses your issue? Thanks!
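For anyone trying a shorter interval, the option would go in the client options map. This is only a sketch: `:ws-kalive-ms` is the option named above, while `:type :auto` and the 5000 ms value are illustrative assumptions, and the call is left as a comment because the argument list of `make-channel-socket-client!` differs between Sente versions.

```clojure
;; Client options map. `:ws-kalive-ms` comes from the release above;
;; `:type :auto` and the 5000 ms value are illustrative assumptions.
(def client-opts
  {:type         :auto
   :ws-kalive-ms 5000})

;; Assumed usage; check the argument list for your Sente version:
;; (sente/make-channel-socket-client! "/chsk" client-opts)
```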
Any update on this? May I close?
Will assume that this is resolved.
In response to the request for feedback (https://github.com/ptaoussanis/sente/issues/259#issuecomment-245492379), here is a report on some tests I've done:
Summary:
- Adding `:ws-kalive-ms` back to the client (https://github.com/ptaoussanis/sente/commit/d925b66a7adae600ca2dc84cededc4302a010f11) has made it possible to detect abnormal disconnects in Chrome, but the feature is currently limited: it takes ~50s to detect a disconnect (even with `:ws-kalive-ms` set to 5s).
- [edit: `:ws-kalive-ms` has no effect on Firefox or Safari, which trigger a websocket disconnect event independent of messages sent; however, `:ws-kalive-ms` is still necessary for detecting server-side disconnects, see my comment further below]
- Chrome seems to trigger a disconnect only on the first message sent that is ~45s after the network actually disconnects.
- If the intent of `:ws-kalive-ms` is to provide an upper bound on "time to detect disconnect", then the implementation will need to be updated.
Given the behaviour of Chrome, one way to implement such a bound in sente would be for the client to `ping` every X seconds, and then trigger a disconnect if no `pong` response is detected within Y seconds.
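A rough sketch of that X/Y scheme as pure functions over a state map. This is an illustration of the proposal, not anything Sente provides; `on-ping-sent`, `on-pong`, `timed-out?` and the `:pending-ping-ms` key are hypothetical names.

```clojure
;; Hypothetical sketch; state is a plain map, times are in milliseconds.
(defn on-ping-sent
  "Records when the outstanding ping was sent, unless one is already pending."
  [state now-ms]
  (if (:pending-ping-ms state)
    state
    (assoc state :pending-ping-ms now-ms)))

(defn on-pong
  "Clears the pending ping when the server's reply arrives."
  [state]
  (dissoc state :pending-ping-ms))

(defn timed-out?
  "True when a ping has waited for its pong longer than `pong-timeout-ms` (Y)."
  [state now-ms pong-timeout-ms]
  (if-let [sent-ms (:pending-ping-ms state)]
    (> (- now-ms sent-ms) pong-timeout-ms)
    false))
```

With a ping sent every X ms and `timed-out?` polled on a timer, the worst-case detection time would be roughly X + Y, independent of the browser.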
[Edit: removed some preliminary data, added more rigorous data in a comment below]
Thanks for the testing, where did you test it? On localhost, or over the internet? The reason I ask is that I could imagine Safari might have different behaviour over the internet than on localhost if it's listening to the OS network adapter.
@danielcompton I tested over the internet (both deployed to a VPS, and also running in dev on a different machine on the same network).
I will do some tests in a few hours where the server's network connection drops (vs. my previous tests, which were all for the client).
Hey Rafal, thanks a lot for the detailed info! Don't have an opportunity to look into this right away, but will try make it a priority next time I'm doing a batch of open-source work.
Any additional tests/details/conclusions you (or others) can put together in the meantime would of course be a huge help!
Cheers :-)
I did some more testing on disconnect-detection with the network being disabled on the server-side and client-side. Here are my results:
Time to detect disconnect, client-side network disabled:

browser | w/ keep-alive | w/o keep-alive | after next message |
---|---|---|---|
Chrome | 45 sec | infinite? (> 4 min) | 0 sec |
Firefox | 10 sec | 10 sec | n/a |
Safari | 0 sec | 0 sec | n/a |

Time to detect disconnect, server-side network disabled:

browser | w/ keep-alive | w/o keep-alive | after next message |
---|---|---|---|
Chrome | 45-120 sec | infinite? (> 4 min) | 1 - 3 min |
Firefox | 45-60 sec | infinite? (> 4 min) | 2 - 3 min |
Safari | 25 sec | infinite? (> 4 min) | 55 sec |

Comments:
- Firefox sends its own `ping` message immediately when the client network is disabled
- about 50% of the time on Firefox, instead of gracefully disconnecting, a javascript error occurs:
  Error: Invariant violation in `taoensso.timbre:?` [pred-form, val]:
  [(string? ?msg-fmt), [Exception... "Unexpected error" nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)" location: "JS frame :: http://192.168.255.24:5555/js/desktop/out/taoensso/sente.js :: taoensso.sente.ChWebSocket.prototype.taoensso$sente$IChSocket$_chsk_send_BANG_$arity$3 :: line 3823" data: no]]
- keep-alive is necessary to detect abnormal server-side network disconnects on all browsers
- keep-alive is necessary to detect abnormal client-side network disconnects on Chrome
- time-to-detect-disconnection varies greatly (both for the same browser, and between browsers)
- the browsers are using combinations of different strategies to trigger websocket disconnection:
Raw data: https://docs.google.com/spreadsheets/d/1OVXbNPN2-TNRBQvnmhKK8IXM_vQ0NiD7ptUEijpQWeo/edit?usp=sharing
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=197841
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=76358
I've updated my post above w/ new data.
With the client-side `:ws-kalive-ms`, each browser will eventually notice a disconnect (which wasn't the case before).
However, the time to detect varies greatly between browsers (and even for the same browser), so it would be nice for sente users to have a way to optionally set their own, more aggressive and standardized threshold. This is a further extension though, so perhaps it deserves its own issue?
Sorry for the delay handling this, have been swamped lately. Will make this my first priority next time I've got some time to batch work on Sente!
This should hopefully be addressed by the forthcoming v1.18 release.
Will ping when the first public alpha is out, and keep this issue open until there's general consensus that the issue is adequately resolved.
Apologies for the long delay on this!
Think it's safe to now close this.
I'm having difficulty identifying when the websocket connection is down in Chrome. I've looked through the docs and closed issues (some similarities to #230), but wasn't able to find a solution or insights into this particular issue. I hope to confirm that I'm not overlooking an existing Sente feature/option that could help.
Scenario:
Server deployed on Heroku, client logs in, websocket established/first-open, successfully sending messages back/forth. Manually disable wifi on client, cutting the connection.
Expected behavior (both Safari and Firefox):
Within a few seconds (~2-5) the `chsk-state` updates to reflect that the connection is no longer `:open?`. My `add-watch` on `chsk-state` picks up the change and renders a notification to the user. Sente begins trying to reconnect every few seconds (longer and longer pauses between attempts). Once the network connection is back up, Sente reestablishes a websocket on next retry attempt.

Unexpected actual behavior (Chrome):
No change to `chsk-state` occurs, at least in the 5 minutes I've let it sit after disabling wifi. No event-msgs or errors/warnings appear to be thrown. Once the network is back up, websocket is reestablished and works, but I haven't been able to find anything to watch/capture to inform the client when it's disconnected.

Using latest Chrome `Version 52.0.2743.116 (64-bit)` on OSX El Capitan 10.11.6

Notes:
After researching, it seems this may be due to Chrome not throwing an "onClose" event when the connection dies, as opposed to Firefox/Safari which do throw that event (which I think Sente captures and reflects in the `chsk-state`)?

To get around it, I've considered setting up a ping/pong loop from client to server that fires every few seconds, and assume the connection is down if the client misses a few pongs in a row. But I figured it was worth asking if I overlooked something before building out that workaround, especially since keeping an eye on `chsk-state` works so great in Firefox/Safari.
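
For reference, the kind of `add-watch` on `chsk-state` described in this report can be sketched as follows. To keep the example self-contained, `chsk-state` here is a plain atom standing in for the state atom Sente provides, and `notifications` is a hypothetical stand-in for rendering a message to the user.

```clojure
;; `chsk-state` is a plain atom standing in for the state atom Sente provides.
(def chsk-state (atom {:open? true}))

;; Hypothetical stand-in for whatever UI notification the app renders.
(def notifications (atom []))

(add-watch chsk-state :connection-watch
  (fn [_key _ref old-state new-state]
    ;; Fire only on the transition from open to not open.
    (when (and (:open? old-state) (not (:open? new-state)))
      (swap! notifications conj :disconnected))))
```

The watch only fires when the state atom actually changes, which is why it works in Firefox/Safari but never triggers in the Chrome scenario above.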