SocketCluster / socketcluster-client

JavaScript client for SocketCluster
MIT License

Replace the underlying socket on a connected client #20

Closed by hedgepigdaniel 6 years ago

hedgepigdaniel commented 8 years ago

We run a cluster of servers running SocketCluster, and clients are randomly allocated to a particular server on page load. We release changes to our code quite frequently. When this happens, some of our servers go down and are taken off the HTTP load balancers to be upgraded before going back online. The load balancers allow open connections to end gracefully, but this isn't an option for WebSocket connections, which can stay open for a long time.

My idea for dealing with this problem is for a message to be published to a channel (say 'server-status') when a server is about to go offline. The message would also contain the address of a new server to connect to. The client could then copy all its subscriptions and callbacks to the new socket before disconnecting the old one (and ignoring duplicate messages in the meantime). This way, a server can go down without the user missing out on delivery of any published messages.

This seems to be something best implemented inside socketcluster, since it's necessary to transfer all the registered callbacks on subscriptions as well as just the list of subscriptions. Perhaps a new function could be added to the client socket object, something along the lines of socket.transferConnection({host: 'blerg.com', port: 1337})
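A minimal sketch of what such a transfer could look like, for illustration only. Nothing here is part of socketcluster-client: `transferConnection`, `createFakeSocket`, and the `registry` map of channel names to handlers are all hypothetical, and the stand-in socket models only `subscribe(name)` returning an object with `watch(handler)`, plus `disconnect()`.

```javascript
// Hypothetical helper, NOT part of socketcluster-client.
// `registry` maps channel name -> array of handlers the app registered.
function transferConnection(oldSocket, createSocket, target, registry) {
  const newSocket = createSocket(target);
  for (const [channelName, handlers] of registry) {
    const channel = newSocket.subscribe(channelName);
    for (const handler of handlers) {
      channel.watch(handler);
    }
  }
  // Drop the old connection only after the new one has every subscription.
  oldSocket.disconnect();
  return newSocket;
}

// In-memory stand-in for a client socket, used only to exercise the helper.
function createFakeSocket(target) {
  const channels = new Map();
  return {
    target,
    connected: true,
    subscribe(name) {
      if (!channels.has(name)) channels.set(name, []);
      const handlers = channels.get(name);
      return { watch(handler) { handlers.push(handler); } };
    },
    // Simulates the server publishing `data` on channel `name`.
    deliver(name, data) {
      (channels.get(name) || []).forEach((handler) => handler(data));
    },
    disconnect() { this.connected = false; },
  };
}

const received = [];
const registry = new Map([['chat', [(msg) => received.push(msg)]]]);
const oldSocket = createFakeSocket({ host: 'old.example.com', port: 8000 });
const newSocket = transferConnection(
  oldSocket,
  createFakeSocket,
  { host: 'blerg.com', port: 1337 },
  registry
);
newSocket.deliver('chat', 'hello');
```

While both sockets are connected, the same published message can arrive twice, so a real implementation would also need the duplicate filtering mentioned above.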

Interested to hear thoughts.

jondubois commented 8 years ago

@hedgepigdaniel Yes I like this idea of transferring the full connection/state to a different host.

The current solution is to keep the reboot time as short as possible. If you send a SIGUSR2 signal to the SC master process, it should kill all workers, and they should all reboot with a fresh version of the code (if you don't have much logic in your SC workers, it should take less than a second to reboot). But if you're also waiting on other external stuff to update, then I can see how the downtime could be much longer.

Regarding missed messages, if you want to make sure that your users don't miss anything, you may want to store all messages somewhere on the backend and fetch a fresh 'snapshot' of the message log whenever the socket connects or reconnects (the 'connect' event covers both cases). This should account for any messages that were missed while the socket was offline. This approach is quite popular; I know Gitter.im is doing something like this.
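A rough sketch of the snapshot-on-connect pattern, under stated assumptions: every message carries a monotonically increasing numeric `id`, and the backend answers a hypothetical `getMessagesSince` event with the messages newer than a given id. The fake socket below stands in for a real client and models only `on`, `emit` with an `(err, data)` callback, and a `trigger` helper for the demo.

```javascript
// Hypothetical helper: fetches missed messages on every (re)connect.
// 'getMessagesSince' is an illustrative event name, not a SocketCluster API.
function attachSnapshotSync(socket, onMessage) {
  let lastSeenId = 0;
  const deliver = (message) => {
    if (message.id <= lastSeenId) return; // already handled
    lastSeenId = message.id;
    onMessage(message);
  };
  socket.on('connect', () => {
    socket.emit('getMessagesSince', lastSeenId, (err, missed) => {
      if (err) return; // a real client would retry or surface the error
      missed.forEach(deliver);
    });
  });
  return deliver; // route live messages through this too
}

// In-memory stand-in for a socket backed by a server-side message log.
function createFakeSocket(log) {
  const listeners = {};
  return {
    on(event, fn) {
      (listeners[event] = listeners[event] || []).push(fn);
    },
    trigger(event) {
      (listeners[event] || []).forEach((fn) => fn());
    },
    emit(event, sinceId, callback) {
      callback(null, log.filter((m) => m.id > sinceId));
    },
  };
}

const log = [{ id: 1, text: 'a' }, { id: 2, text: 'b' }];
const seen = [];
const socket = createFakeSocket(log);
attachSnapshotSync(socket, (m) => seen.push(m.text));
socket.trigger('connect');       // initial connect: fetches a and b
log.push({ id: 3, text: 'c' });  // published while the client was offline
socket.trigger('connect');       // reconnect: fetches only c
```

Tracking the last seen id on the client keeps each snapshot small, since only the messages missed during the outage are requested.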

If the reboot time is long though, the user will still see a delay before receiving the message log snapshot, and they won't be able to interact with the system (e.g. send new messages) while the reboot is happening: messages waiting in the client-side buffer will time out if the disconnection lasts too long. So yes, a socket.transferConnection(...) method or similar would be really useful for those cases :) Definitely a TODO.

jondubois commented 8 years ago

Note that if the reboot duration is shorter than your ackTimeout, the SC client will buffer any messages emitted from the frontend and send them all at once to the server when the connection is back up. So I think that, in theory, you can get a seamless experience if you can keep the reboot time short (although this is not always possible in practice).
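To make the timing argument concrete, here is a toy model of that buffering behaviour. It is not the client's actual implementation: `OfflineBuffer` is a hypothetical class, and timestamps are passed in explicitly (in milliseconds) instead of using real timers so the example stays deterministic.

```javascript
// Toy model: messages emitted while offline wait in a buffer and are
// dropped if the outage outlasts ackTimeout. Hypothetical, for illustration.
class OfflineBuffer {
  constructor(ackTimeout) {
    this.ackTimeout = ackTimeout;
    this.pending = [];
  }

  // Queue a message emitted at time `now` while the socket is down.
  emit(event, data, now) {
    this.pending.push({ event, data, expiresAt: now + this.ackTimeout });
  }

  // On reconnect at time `now`, send what is still valid.
  // Returns the number of messages that timed out.
  flush(send, now) {
    const live = this.pending.filter((m) => m.expiresAt > now);
    const timedOut = this.pending.length - live.length;
    this.pending = [];
    live.forEach((m) => send(m.event, m.data));
    return timedOut;
  }
}

const buffer = new OfflineBuffer(10000); // ackTimeout of 10 seconds
buffer.emit('chat', 'first', 0);
buffer.emit('chat', 'second', 8000);

const sent = [];
// The connection comes back 12 seconds after the first emit.
const timedOut = buffer.flush((event, data) => sent.push(data), 12000);
```

In this run the first message has waited longer than the timeout and is dropped, while the second is still delivered, which is why a short reboot looks seamless and a long one does not.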

jondubois commented 8 years ago

I'm thinking that the solution to this may require an additional layer of abstraction above the SCSocket object. SCSocket itself is an abstraction on top of plain WebSocket connections.

Maybe we should introduce a new object called SCClient (or similar) which behaves like a single SCSocket (and exposes the same methods and properties) but which could manage one or more underlying SCSocket objects at the same time (such as when doing a transfer between two hosts).

There may be use cases where someone wants to always have two or more underlying SCSocket objects connected to different servers at any given time for extra resilience (at the expense of higher bandwidth usage).

SCClient would have to de-duplicate the messages arriving from multiple underlying SCSockets so that the same message is not handled twice. We also don't want to accidentally publish the same message twice through two different underlying sockets.
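One possible shape for the duplicate filtering, as a sketch only. It assumes every published message carries a unique `id` field, which the application would have to add itself. `createDeduplicator` and `maxRemembered` are hypothetical names.

```javascript
// Hypothetical filter shared by all underlying sockets of one client.
// Remembers recently seen message ids and drops repeats.
function createDeduplicator(onMessage, maxRemembered = 1000) {
  const seen = new Set();
  return function handle(message) {
    if (seen.has(message.id)) return false; // duplicate from another socket
    seen.add(message.id);
    if (seen.size > maxRemembered) {
      // Sets iterate in insertion order, so this evicts the oldest id.
      seen.delete(seen.values().next().value);
    }
    onMessage(message);
    return true;
  };
}

const delivered = [];
const handle = createDeduplicator((m) => delivered.push(m.data));
handle({ id: 'a1', data: 'hello' }); // arrives via socket A
handle({ id: 'a1', data: 'hello' }); // same message via socket B, dropped
handle({ id: 'a2', data: 'world' });
```

Bounding the set keeps memory use constant, at the cost of possibly re-delivering a duplicate that arrives after its id has been evicted.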

This feels like it might need to be a new project above the current one (such that socketcluster-client would be a dependency). I think it's important not to add too much complexity to socketcluster-client.

jondubois commented 6 years ago

This was already implemented a long time ago ;p

hedgepigdaniel commented 6 years ago

Ah, good to know! Which API method can I use to transfer the connection to another socket?