taoensso / sente

Realtime web comms library for Clojure/Script
https://www.taoensso.com/sente
Eclipse Public License 1.0
1.74k stars 196 forks source link

Add way to handle WebSocket.onerror events #214

Closed drhops closed 8 years ago

drhops commented 8 years ago

If during websocket initialization a server is down or returns 403 due to invalid auth, we'll hit the onerror handler here:

(aset "onerror"
  (fn [ws-ev] (errorf "WebSocket error: %s" ws-ev)))

https://github.com/ptaoussanis/sente/blob/2db8e6e43dcb1b55583fd5a4ce6a661583b1878d/src/taoensso/sente.cljx#L820

One could override onerror like the following, but it's hacky and outside of the context of sente's other retry logic:

(doto  (deref (:socket_ chsk))
  (aset "onerror" (fn [ws-ev] (...))))

Should there be a way to override that behavior (e.g. retry) more straightforwardly?

ptaoussanis commented 8 years ago

Hi Daniel,

Could you possibly provide some other examples of logic that you'd want to add to the error handler? Retries are already done automatically.

drhops commented 8 years ago

Hi Peter,

Appreciate the quick reply and good point retry is already covered. I'd like to also change the log level and show different UI to the user upon error. Is there a good way to handle those? Also, thank you for the great library!

ptaoussanis commented 8 years ago

You're very welcome, no problem.

I'd like to also change the log level and show different UI to the user upon error. Is there a good way to handle those?

Could you be a little more concrete + describe the kind of error you'd like to show the user? There's no hook to onerror since it's an implementation detail + generally isn't something you want the user to know anything about.

If you're looking for general connection status/change info, that's all published properly to the chsk channel. Does that make sense?

yogthos commented 8 years ago

I'm actually trying to do something similar as well here. In my case the user enters data on the client, then on blur I fire an event to the server. So in case where the connection is interrupted I need to pop up a modal to tell the user that their connection to the server is gone and data won't be persisted.

I was trying to use :error-handler key, but that doesn't seem to get fired in case where I stop the server:

(sente/start-chsk-router!
                   (:ch-recv @connection)
                   (event-msg-handler
                     {:message   message-handler
                      :state     handshake-handler
                      :handshake state-handler})
                   {:error-handler (fn [e event-msg] (println "error:" e))})

Sente will log the error:

WebSocket connection to 'ws://localhost:3001/ws?client-id=551ca9aa-c464-493d-bfdf-40780ad7ee32' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED
core.cljs:181 ERROR [taoensso.sente:820] - WebSocket error: [object Event]taoensso.timbre.appenders.core.console_appender.cljs$core$IFn$_invoke$arity$variadic @ core.cljs:181(anonymous function) @ timbre.cljs:458(anonymous function) @ core.cljs:6154cljs.core.PersistentArrayMap.cljs$core$IKVReduce$_kv_reduce$arity$3 @ core.cljs:6154cljs$core$_kv_reduce @ core.cljs:594cljs$core$reduce_kv @ core.cljs:2294taoensso$timbre$_log_BANG_ @ timbre.cljs:415(anonymous function) @ sente.cljs:820

What the recommended way to hook into that and provide an additional custom error handler for that event.

ptaoussanis commented 8 years ago

Hi Dmitri,

Is there a reason the standard chsk state change events don't cover your use case? If you watch your channel socket for :chsk/state events, you'll see events published for things like connection drops, etc.

ptaoussanis commented 8 years ago

So in case where the connection is interrupted I need to pop up a modal to tell the user that their connection to the server is gone and data won't be persisted.

Also, just to clarify: if you're sending a synchronous request to the server, the standard way of knowing if something went wrong is to check the callback value. The callback value will indicate if there was a broken connection, request timeout, etc. Does that help / make sense?

drhops commented 8 years ago

Hi Peter,

Thank you for your responses. A couple of concrete use cases that would be helpful to support:

I don't believe chsk-state has the granularity to support these yet, and am thinking that extending it to have an error state or exposing onerror could. Do you think either of these would be a good addition, or are there alternate approaches you're aware of?

yogthos commented 8 years ago

Hi Peter,

Thanks for the quick response, that definitely helps. I think I just missed the state event for dropped connection.

ptaoussanis commented 8 years ago

@yogthos

No problem. Reason you'd want the state event rather than trying to hook directly into WebSocket handlers is that the state events basically provides a consistent, dependable user-level API that can pave over underlying implementation issues like protocol differences, browser issues, etc.

For example, you wouldn't want to have to handle onerror messages for each failed retry attempt - so the state channel just publishes the state deltas that would actually be useful at a user level. If that makes sense.

ptaoussanis commented 8 years ago

@drhops

Show a UI bar telling the user the app is unusable when the socket can't connect

You can use the :chsk/state events for this, unless I'm misunderstanding?

If using token auth around the /chsk endpoint and an error is encountered, check and refresh the token.

/chsk or auth request issues are something that you can + should usu. be handling at your auth request - again, unless I'm misunderstanding.

For example: if you have an auth procedure that establishes a session through an Ajax auth request (as in the reference example) - then you can deal with auth problems directly when you receive the auth response? Does that make sense? Would suggest checking out the ref example if you haven't since that might help clarify.

Cheers :-)

drhops commented 8 years ago

@ptaoussanis

For the UI bar, which :chsk/state event would i check? (In the case of a server down, the socket never connects, so there's no dropped connection event?)

For the auth case, assume the client already successfully received a long-lived token (not session based), and the /chsk endpoint is simply checking that token to make sure it hasn't expired and returning 403 if so.

I could easily be missing something, but don't believe these are covered currently. Appreciate your thoughts and thanks again!

ptaoussanis commented 8 years ago

For the UI bar, which :chsk/state event would i check?

First successful connection, you'll see [:chsk/state {:first-open? true}]. Each event will contain an :open? flag that indicates if the channel socket connection is currently good. If :open? ever becomes false, you can update your UI.

Please see the ref example project for more info on the :chsk/state events and what info they expose.

and the /chsk endpoint is simply checking that token to make sure it hasn't expired and returning 403 if so.

Not sure exactly what you mean by this exactly, would need a more concrete example. Generally speaking- if you want to do auth via an expiring token, you'd need+want to handle that at an API call level.

I.e. at your event handlers check for your token state, etc. If the token's expired, respond to the request with a notification that the client can identify (something like :my-app/token-has-expired). Then the client can throw up a UI to request reauth, etc. Does that help / make sense? Basically, if it's an adhoc/ app-level auth concern, I'd suggest implementing via standard event handling rather than trying to shoehorn into protocol-level auth.

You can wrap server or client side event handlers in middleware easily enough to get whatever behaviour you like. Trying to tie into the low-level WebSocket onerror handler here would be, I think, a bad idea. Actually, onerror wouldn't fire for the kinds of auth problems you're talking about.

Please note that there's a summary of auth approaches at https://github.com/ptaoussanis/sente/issues/118#issuecomment-87378277

Sorry if this was a bit terse, written in a hurry since I'm just handling some urgent stuff atm.

yogthos commented 8 years ago

Yeah this approach makes a lot of sense. The state events allow watching for changes in state and reacting to those. I definitely agree that it provides a good abstraction for this.

drhops commented 8 years ago

First successful connection, you'll see [:chsk/state {:first-open? true}]. Each event will contain an :open? flag that indicates if the channel socket connection is currently good. If :open? ever becomes false, you can update your UI.

I'm asking about the scenario in which the socket never connects in the first place.
1) User opens the client app 2) Client app attempts to initialize socket via /chsk, but the server is down and returns 404 3) onerror is called, :first-open? remains false, :open? remains false. The socket was never successfully opened in the first place, so none of the state events are fired. 4) Client app should inform the user, but the error event isn't exposed?

Not sure exactly what you mean by this exactly, would need a more concrete example. Generally speaking- if you want to do auth via an expiring token, you'd need+want to handle that at an API call level.

Auth scenario: 1) User opens the client app 2) Client app calls an auth API and successfully receives a valid token, which is persisted on the client 3) User closes the client app 4) Time passes and the token expires 5) User opens the client app 6) Client app attempts to initialize socket via /chsk, but the token is expired and server returns 403 7) onerror is called, :first-open? remains false, :open? remains false. The socket was never successfully opened in the first place, so none of the state events are fired. 8) Client app should check the token and refresh, but the error event isn't exposed?

I think the summary of auth approaches in #118 assume a web context with session-based auth, whereas this scenario is for a native context with token-based auth.

Do these scenarios make sense? Thanks again and appreciate your time!!

ptaoussanis commented 8 years ago

Alternative (recommended) scenario #1:

1) User opens the client app 2) Client app attempts to initialize socket via /chsk, but the server is down and returns 404 3) :first-open? remains false, :open? remains false. The socket was never successfully opened in the first place, so none of the state events are fired. 4) [Behind the scenes, reconnection attempts are automatically being made] 5) In the meantime, client app can see that :first-open? is still false so inform users that there's no connection.

So the recommended approach is: start with a UI that informs the user that the app is disconnected/connecting. Iff the chsk state is open, remove that message.

Auth scenario:

6) Client app attempts to initialize socket via /chsk, but the token is expired and server returns 403

How is the server returning 403? That's not a standard part of the /chsk implementation. I recommend that you do application-level auth via an application-level auth req+resp. If your application-level auth token has expired, then your application-level API calls should fail with an appropriate application-level error code/message.

Trying to tie application-level auth into the implementation-level socket might work in some cases, but is a pattern I'd like to discourage since it's encumbering w/o (as far as I can see so far) any compelling advantages.

Can you articulate why you're not fond of the solution I suggested (event handling middleware)?

Just a heads-up: unfortunately might not be able to continue this discussion for a few days while I have some urgent work I need to concentrate on, sorry.

drhops commented 8 years ago

Hi Peter,

A couple of thoughts:

So the recommended approach is: start with a UI that informs the user that the app is disconnected/connecting. Iff the chsk state is open, remove that message.

I think "connecting/disconnected" is semantically different from "error" because in the former case, there's an expectation that the socket will soon connect, whereas in the latter it is known that it won't. It is good to be able to inform the user when there's an error with the service so that he doesn't keep trying and get frustrated with our apps.

Can you articulate why you're not fond of the solution I suggested (event handling middleware)?

I agree with you event-level auth is a valid approach, but if we can't handle 403s from the /chsk endpoint, then we have to leave the endpoint unsecured until after socket initialization (even session-based auth via ring middleware can return 403s), which I believe could open up a DOS attack vector. Authenticating the client during the initial websocket handshake is recommended by heroku for example: https://devcenter.heroku.com/articles/websocket-security#authentication-authorization

Just a heads-up: unfortunately might not be able to continue this discussion for a few days while I have some urgent work I need to concentrate on, sorry.

Hope these points make sense and please take your time responding. Appreciate the discussion!

theasp commented 8 years ago

May I suggest using a timeout to notify the user after a few seconds "It's taking a long time to connect to the server"? Connecting to a webserver that is actually down (as in you have a route to it's IP, but nothing is answering) can take a very long time.

drhops commented 8 years ago

Hi @theasp, while that would handle server timeouts specifically, I think it's cleaner for Sente to set its own timeouts and then simply expose errors via an onerror-fn or an :error chsk-state, which has the advantage of exposing all types of server errors (e.g. 503s) as soon as they're known. What do you think about that solution?

ptaoussanis commented 8 years ago

A couple of thoughts:

Hey Daniel, I appreciate your thoughts and open-mindedness on this. I don't disagree with anything you've proposed, exactly - but do have in mind some tradeoffs that motivate the current design and that might be worth explaining.

Just in the middle of a project launch atm, but promise I'll try get back to you on this when I have an opportunity.

In the meantime, would suggest either following @theasp's suggestion, or perhaps running from your fork in the meantime until we get a chance to talk?

Thanks again for the patience, cheers! :-)

ptaoussanis commented 8 years ago

Quick update: haven't reached any firm conclusions on this yet. But till then, happy to keep the commit at https://github.com/ptaoussanis/sente/commit/f0785fd8b892c49061fd92d232026f116707dd0c in as experimental so that I can get a sense of how folks end up using this.

Thanks again for the PR + all the input on this.

ptaoussanis commented 8 years ago

@drhops Hi Daniel, would like to come back to this for a moment since you mentioned that you've been satisfied with the new [:chsk/ws-error <onerror-ev>] event.

Had reason to just look into this a little more, and was surprised to see that the underlying onerror event actually doesn't seem to include any useful information, Ref. https://www.w3.org/TR/websockets/#concept-websocket-close-fail (e.g. cause, etc.).

So this has me curious: in what way is this event actually useful to you? How are you actually using it? All the info I'd consider useful for debugging seems to be available only in the onclose event.

Am I missing something obvious?

danielcompton commented 8 years ago

Will get back to you on this next week.

ptaoussanis commented 8 years ago

Gave this some more thought and have decided to drop the :chsk/ws-error event.

Instead, will now attach :last-ws-error and :last-ws-close keys to the standard state atom.

If you're interested in this info, you can either watch the standard state atom for changes - or handle the standard :chsk/state event, which is now of the form [:chsk/state [<old-state-map> <new-state-map>]].

This new approach provides more info, provides an atomic snapshot of all info, is a lot more flexible, and doesn't need the introduction of a new :chsk/ws-error event type.

Sente's own auto websocket->ajax fallback logic is now implemented entirely using this same (state snapshot) info.

See https://github.com/ptaoussanis/sente/releases/tag/v1.9.0-beta3 for full release info and https://github.com/ptaoussanis/sente/commit/9a7e172c02b7e0c4c7035b8f099153ee6d83b75d for the relevant commit.

Feel free to reopen if anyone still has any other feedback/requests. Cheers!

luposlip commented 8 years ago

Nice approach!