jxcore / jxm

Incredibly fast messaging backend for Node.JS / JXcore
http://jxm.io
Other
226 stars 30 forks source link

Disconnect event on the server #17

Open brunobg opened 9 years ago

brunobg commented 9 years ago

Is there anyway that the server may know a client disconnected? Not only a regular close(), but a "the dog chew my network cable" too?

ktrzeciaknubisa commented 9 years ago

There is no internal server's event exposed for this, however one solution comes to my mind: you could call periodically server.sendToAll("ping"), and then each of the clients would send back a "pong" with a call() method for example. If the "pong" does not arrive for a given amount of time, than the server may consider the dog scenario :)

brunobg commented 9 years ago

Ahnnn, this is not a nice solution to the problem. It scales horribly. And this is a rather important feature, since applications often have to do perform some action in that case. Isn't it possible to expose the event for this? Can you point me where that event is internally? May I suggest a "disconnect" event that is called in this case?

obastemur commented 9 years ago

@brunobg In my experience, there is no other 100% working / scalable approach.

Client A connects server B, renewing poll goes to server C (in case the connection was Ajax). Is this user disconnected ? hell no. If we try to follow that user on each server, it would kill the scalability for real this time. OR the same client reloads the page for any reason. Is this user disconnected? no.. BTW; clicking a link (even if it's backed by a script) may raise an unload page event on some older browsers. Is this user disconnected ? no.

JXM is light enough to handle ping-pongs without a problem though. It's all about the interval you might set. For example 5 secs should be reasonable in terms of scalability.

There are many ways for a TCP etc. server that is not able to know if the client connection is dropped.

brunobg commented 9 years ago

I understand that you can't make it work in all cases. Even your ping/pong also has its share of problems. The inverse argument is valid, as well: there are many cases that can be handled well. If you are using websockets, for example you should get an even when the socket is closed. I'm not sure if you already implement a keepalive, but if you do you can detect disconnections more or less well -- let's say, "as well as possible". Other libraries support it.

Either way, it is a very relevant case to many applications and this is what I'm suggesting here. JXM is a very nice and efficient library, with a clean API. So it makes a lot of sense to get disconnect events out of the box, even if with caveats.

obastemur commented 9 years ago

Actually, as far as I remember we have keep alive in place for long polls. Would you like to contribute on this feature?

brunobg commented 9 years ago

Yes, I'd like to help, but I'd appreciate a few pointers if possible.

First I suppose it would be better if you guys designed the API -- I suggest a "disconnect" event, very similar to the "subscribe" event.

Second, let us design how we can discover that a disconnection happened as well as possible: websockets should generate a disconnect event themselves, AJAX will return error or timeout, long polls have this keepalive you mentioned.

Finally if you could give me a general idea of where I should look and change the code, to make sure I don't ignore some relevant part.

Also, on the client, is onError already called if there's a network problem?

obastemur commented 9 years ago

@ktrzeciaknubisa can you create structure for disconnect on the master? also please pin point the relevant parts from the code ?

Also, on the client, is onError already called if there's a network problem?

try listening both onclose and onerror, they should definitely help.

ktrzeciaknubisa commented 9 years ago

I'm looking at the code, but it doesn't seem like we have something that we can use for 'disconnect' event. Also the client does not react on connection drop, neither for websocket nor xdomain/xmlhttp requests.

I'm taking it on my desk, hopefully I'll come out soon with some proposal.

brunobg commented 9 years ago

Actually I tested here and the client noticed the disconnection when I kill the server (OnError/OnClose are called), using websockets.

I'll wait for your ideas, ready to help.

ktrzeciaknubisa commented 9 years ago

Yeah, I saw that, and that's immediate for websocket connections. But not all of the browsers can use that (try to use some IE < 10 or stock browser on android. You can also temporarily disable it by setting jxcore.SocketDisabled = true on _jx_browserclient.txt#533 - or in the end of the file)

Also stopping the server apparently is not equal to the "chewing dog" thing. Clients on another machine did not react immediately (depends from browser to browser) when I dropped the connection while server was still working. They try to reconnect until some point so as @obastemur mentioned onclose and onerror might be of use here.

I'm still testing different browsers/different protocols.

brunobg commented 9 years ago

Ah, this is something that I don't quite understand: when do clients try to reconnect? I know there's a flag passed to OnClose/OnError, but in my "kill the server" test clients never try to reconnect (again, websockets in that case). Is there a way to force reconnections? We're implementing with a setInterval() but there may be a better way.

UPDATE: it actually tries to reconnect 5 times then stops trying, and apparently never tries again, correct?

ktrzeciaknubisa commented 9 years ago

I'm stopping/starting a server and websocket client is able to reconnect successfully, that is if disconnection time is not too long. For the browser client you can take a look on jxcore.ReConnect() method. Once OnClose(true) event was called, jxcore makes 5 reconnection attempts, each between 3000 ms, so basically if connection comes back within 15 secs - it should reconnect just fine.

brunobg commented 9 years ago

I saw jxcore.ReConnect(), thanks. Is this limit of 5 attempts every 3000ms configurable externally? This would be useful. We have an app here that should reconnect with the server whenever it comes back to life, and I'd like to set it to "infinite times" every 30 seconds.

ktrzeciaknubisa commented 9 years ago

In current shape it is not really configurable (although you can modify your sources - it's in jxcore.ReConnect()). However once reconnection attempts give up after 5, the OnClose() is called one more time, but this time with a false / undefined value. Then you can call jxcore.ReConnect(), like:

        jxcore.OnClose = function (reconnecting) {
            if (!reconnecting) {
                jxcore.ReConnect();
                return;
            }
        };

You should also apply your 30 sec logic.


We are careful with adding new features (like configurable values) since it would require to apply this for all jxm client platforms (js, java, node). API should stay consistent for all of them. Not to mention, that it involves much more testing. Thus as far as disconnect event for the client side is concerned - I suggest to stay with OnClose but to improve it for network drop as much possible. For the server-side however, it is more logical to add onClientDisconnect or similar, so we'll try to approach this.

brunobg commented 9 years ago

Oh, I understand that. I think the client is fine, I was just asking if it was possible to change but undocumented. Thanks :)

ktrzeciaknubisa commented 9 years ago

After a bunch of tests and researches I've found myself walking a big round circle and coming back to square one. I haven't discover any method more reliable than ping-pong approach. You can listen to any socket's or xmlhttp object's event you want (onerror, onclose) but often they just don't fire soon enough. For browser's websocket - the interval between connection drop and event fired varied between 2 secs to 20. You can disable network connection in your OS and yes, this is recognized quite fine, but whenever you try to unplug the net - you can do nothing else but wait for the event. God knows when it happens.

Having that said, I made myself a quick ping-pong implementation to jxm and it's very reliable, yet still simple. I propose it this way:

  1. Client would probe connection once in a while (by sending internal call to the server).
  2. The interval value (we can call it checkAliveInterval or something) would be configurable on the server side (config file) and sent to client just once - when the connection is established for the first time.
  3. Thus, the server would know, how often it should expect a message from client. If it takes longer, server may raise 'disconnect' event per client.
  4. Client on the other hand, if did not receive callback from server, may call OnClose(false) leaving to the developer the decision whether to reconnect or not .
  5. If the interval value is not set on server's config - jxm works just as before.

As I've mentioned I already have working prototype for browser clients and it doesn't even interfere much with the code. It will also be easy to apply to all client platforms and will the same reliable.

What do you guys think about this?

brunobg commented 9 years ago

You guys know jxm much better than I do. All I can contribute with is a theoretical consideration and my personal opinion.

First, I understood you already had a keepalive, which would make another ping-pong redundant no?

It would worry me if I had something like >10k clients and a `checkAliveInterval' of a few seconds (I'd consider 10 seconds a reasonable timeout as an application developer for an "average application", and users probably would prefer less than that). I also would worry about the "garbage collection" on the server: I suppose you'd have to foreach() all connections every now and then to see which ones didn't get a ping. This could add a hiccup, as GCs always do, and in my experience 10k JS timers would not be a better solution. Even though your benchmarks are great, this algorithm is linear with the number of clients.

Everything depends on the application, also. In any application that is not real time and uses disconnect essentially for cleanup, if I had a choice between waiting for a while or adding a keep alive that would increase networking and possibly add GC pauses I'd rather wait, even if it took a longer time. In real time applications I want to know as soon as possible that something is wrong and treat it immediately. In applications that I need to be sure that information was transferred I suppose that neither would be much better: I'd need to reconnect and use the application logic to know if all data was synchronized anyway.

Application logic could also do your idea. For example, if I expect my clients not to be idle for too long I could store their last message timestamp somewhere and implement a "free ping-pong". In fact, I wonder if you couldn't get that information somehow from free in SendToGroup: don't clients return anything? Because AJAX polling would work fine since it's always pinging, websockets would handle a disconnection since TCP has that awful but existing alive (even if it takes an unpredictable time), long polling requires a reconnection every minute or so, so if a connection is not closed and reopened you can assume your client died. Am I forgetting some other protocol?

My two cents are: your proposal is quite useful but it may hurt performance significantly, and people using it may be exactly the ones who are looking for a fast library that does not have large fluctuations in delivery time (that is me hehehe). If some information is available, however imperfect it may be (e.g. websockets would work but timeout is unpredictable) I'd expose that too. I can see plenty of scenarios that I'd rather have a "eventually this will timeout but no extra traffic".