igrigorik / em-websocket

EventMachine based WebSocket server
http://www.igvita.com/2009/12/22/ruby-websockets-tcp-for-the-browser/
MIT License
1.69k stars 187 forks source link

unacked close behavior needs minor improvement #122

Open sethcall opened 10 years ago

sethcall commented 10 years ago

One of two suggestions:

I prefer the former, based on my own experience building client/server applications. It never make sense, in my experience, to raise exceptions after a close.

If you disagree, OK, that's fine, but to illustrate my problem, I'll need to make my onerror handler a little bit better. Currently I do this:

client.onerror { |error|
  if error.kind_of?(EM::WebSocket::WebSocketError)
    @log.error "websockets error: #{error}"
  else
    @log.error "generic error: #{error} #{error.backtrace}"
  end
  cleanup_client(client)

  client.close # XXX: this causes a error loop; in exactly 60 seconds, I get an onerror message again... and then I'll go through this loop and get the same error again, indefinitely
end

So either I (and every consumer) need to guard against trying to call client.close on a disconnected client (which I don't see how to do yet), or you can help us on this.

Thanks much for your hard work, Seth Call

sethcall commented 10 years ago

By the way, I've encountered this scenario at least 20-30 times since this was released, so it's not too rare in practice.

mloughran commented 10 years ago

Hi Seth. Thanks for writing this up - I hadn't considered this.

The reason I haven't hit into this issue is that I do not call close in my onerror handler. There is no reason to do this - em-websocket takes care of closing the connection in every error case. This should really be noted in the readme since it's non-obvious.

That said, there are still race conditions where close could be closed multiple times, which would result in the onerror callback being called multiple times or erroneously. I therefore agree that close should be changed to be a no-op in the case that close is in-progress.

Whether the lack of close ack should be exposed to the application code at all is a valid question; however other protocol errors (e.g. invalid UTF8) do feed through to onerror, and in my experience it can be useful to discover clients/connections that are not behaving according to spec - so I'd rather err on the side of too much information rather than too little.

sethcall commented 10 years ago

Thanks for the response.

Since I don't need to call close in onerror, that will simplify my logic and remove my current 'un-acked' loop.

In looking through code, it seems onerror may be the only message you get in the case of receive_data throwing an exception. (you won't get an onclose). So I'll need to make sure my code can safely try to cleanup a client in both onclose, and onerror, and deal with the case it's called > 1 time (because I'm pretty sure there are cases where you can get both).

Thanks for your help, Seth

mloughran commented 10 years ago

Not quite - in all cases unbind will be called on the Connection object when the tcp connection closes (whether that was done by the client or server), which causes on_close to be called. So you can safely put all your cleanup logic in on_close; I should also add this to the readme :)

sethcall commented 10 years ago

oh! Thank you for clarifying that; much appreciated.

Kenterfie commented 9 years ago

In this subject i have another problem with onclose. I use a redis counter to count the connected clients. Increment on onopen und decrement on onclose. What i have noticed is, that onclose ist called many times more then onopen and after some hours i have a negative count. in my opinion should every onopen call followed by a onclose call, but it isn't. is this a desired behavior or is it a bug?