strongloop / strong-cluster-socket.io-store

Implementation of socket.io store using node's native cluster messaging

Partial fix for failing build #4

Closed · bajtos closed this 11 years ago

bajtos commented 11 years ago

Fixed a bug in the ClusterStore implementation and added debug logs to help troubleshoot the problem.

@sam-github Please review.

The build is still failing due to a race condition caused by socket.io's internal design. See lib/manager.js L805 to L808.

Details:

I don't see how this problem can be fixed with the current internal design, as the pub/sub concept does not provide a means for the publisher to wait until all subscribers have received the notification.


Thinking about the problem from a high-level perspective, a perfect RR load balancer might not be the most effective solution for servers under heavy load, as it adds the extra overhead of synchronizing (session) state between worker processes. IMO there should be an option in Node.js core to configure client or session affinity, so that a set of related requests is always handled by the same worker process.

This change should bring two performance-related improvements:

IMO the impact will be even more significant once there is an option to run cluster workers on multiple machines.

I can imagine a Node.js HTTP server using a special cookie to track the affinity. I.e. when an HTTP response is returned to a client, a cookie is set with the ID of the worker that handled the request. When the load balancer sees this cookie in an incoming request, it selects the worker using the ID in the cookie and skips the RR algorithm.
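For concreteness, here is a minimal sketch of the worker-side half of that idea. The cookie name (WORKER_AFFINITY) is made up for illustration, and the balancer-side routing is exactly the piece Node core would need to provide:

```js
// Each cluster worker tags its responses with its own worker id; a
// cookie-aware balancer in front of (or inside) the cluster master could
// then route follow-up requests carrying this cookie to the same worker
// instead of applying round-robin.
var cluster = require('cluster');
var http = require('http');

if (cluster.isMaster) {
  for (var i = 0; i < 2; i++) cluster.fork();
} else {
  http.createServer(function(req, res) {
    // Hypothetical cookie name; any stable worker identifier would do.
    res.setHeader('Set-Cookie',
      'WORKER_AFFINITY=' + cluster.worker.id + '; Path=/');
    res.end('handled by worker ' + cluster.worker.id + '\n');
  }).listen(8000);
}
```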

What are your thoughts on this, @bnoordhuis @piscisaureus @stdarg? If this idea is not dismissed straight away, we should move the discussion somewhere else (I guess a node or nu-js issue would be the best place).

EDIT: SockJS has some useful info on load balancing too.

Sticky sessions

If you plan to deploy more than one SockJS server, you must make sure that all HTTP requests for a single session hit the same server. SockJS has two mechanisms that can help achieve that:

  • URLs are prefixed with server and session id numbers, like /resource/<server_number>/<session_id>/transport. This is useful for load balancers that support prefix-based affinity (HAProxy does).
  • A JSESSIONID cookie is set by SockJS-node. Many load balancers turn on sticky sessions if that cookie is set. This technique is derived from Java applications, where sticky sessions are often necessary. HAProxy supports this method, as do some hosting providers, for example CloudFoundry. To enable this method on the client side, supply a cookie: true option to the SockJS constructor (see the sketch after this list).
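For reference, the client-side enabling step looks roughly like this (assuming the SockJS-client 0.3.x constructor, which takes an options object as its third argument):

```js
// Ask SockJS-node to set the JSESSIONID cookie so a cookie-based load
// balancer can pin all of this session's HTTP requests to one server.
var sock = new SockJS('http://example.com/echo', null, { cookie: true });
sock.onopen = function() {
  sock.send('hello from a sticky session');
};
```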
bajtos commented 11 years ago

A few more thoughts:

bnoordhuis commented 11 years ago

I can imagine a Node.js HTTP server using a special cookie to track the affinity. I.e. when an HTTP response is returned to a client, a cookie is set with the ID of the worker that handled the request. When the load balancer sees this cookie in an incoming request, it selects the worker using the ID in the cookie and skips the RR algorithm.

It's not an unreasonable suggestion, but one potential problem is that the master has to parse and buffer the HTTP headers (in order for the worker to reuse them).

Parsing is reasonably CPU-intensive, though in node.js it's often dwarfed by other things, so it might be a wash.

Buffering is more problematic; an evil client could send massive amounts of HTTP headers and trigger a denial of service that way. It's not an intractable problem but there are some caveats to consider.
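For concreteness, the kind of bounded header peek the master would need might look like the sketch below; the cookie name and the size cap are hypothetical, and anything beyond the cap simply falls back to round-robin rather than being buffered further:

```js
// Hypothetical guard: scan at most MAX_PEEK bytes of the raw request head
// for the affinity cookie. If the cookie is missing (or the headers exceed
// the cap), return null and let the caller fall back to plain round-robin,
// so a client cannot force unbounded buffering in the master.
var MAX_PEEK = 8 * 1024;

function affinityWorkerId(firstChunk) {
  var head = firstChunk.toString('ascii', 0,
    Math.min(firstChunk.length, MAX_PEEK));
  var match = /(?:^|\r\n)Cookie:[^\r\n]*\bWORKER_AFFINITY=(\d+)/i.exec(head);
  return match ? match[1] : null;
}
```

The harder part, as noted above, is not this string scan but the fact that whatever the master reads here still has to be handed over to the chosen worker so the request is not corrupted.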

It also means tighter coupling between the cluster and http modules. That's not the worst thing in the world but it might make future changes more difficult.

sam-github commented 11 years ago

It surprises me that this isn't a more commonly seen problem with socket.io; do most load balancers implement affinity?

Because with socket.io's built-in redis store, propagation would also trigger this bug, causing random auth failures, particularly under load.

I agree that socket.io seems to be losing momentum in the face of other projects. It might have already dropped off the back of the wave.

I can think of some hack-arounds, such as retrying for some time in the case of an auth failure. This would only slow down the client when they are not authorized, which doesn't seem so bad. It's a fix that should go in socket.io, but we could put it in the cluster socket.io store for now, because we have control.

What do you think? Do a few-second loop to see if the auth data will eventually be readable from the store, or received via MQ? (Not sure if it's transmitted as a key/value or as a message.)

It's hacky, but random failures in deployment might be worse.
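A rough sketch of that retry loop, assuming the get(key, callback) style used by socket.io 0.9 store clients; the helper name, key, polling interval, and timeout are made up:

```js
// Hypothetical helper: keep re-reading a key from the store until it shows
// up or a deadline passes. The auth lookup that currently fails when the
// pub/sub notification has not yet reached this worker could be wrapped in
// something like this instead of failing immediately.
function waitForKey(client, key, timeoutMs, cb) {
  var deadline = Date.now() + timeoutMs;
  (function attempt() {
    client.get(key, function(err, value) {
      if (err) return cb(err);
      if (value != null) return cb(null, value);
      if (Date.now() >= deadline)
        return cb(new Error('timed out waiting for ' + key));
      setTimeout(attempt, 50); // poll every 50 ms until the data propagates
    });
  })();
}
```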

sam-github commented 11 years ago

LGTM