calzoneman / sync

Node.JS Server and JavaScript/HTML Client for synchronizing online media

Multithreading/multi-core support for larger sites #325

Open Kirtaner opened 10 years ago

Kirtaner commented 10 years ago

We've started hitting the upper limit of a single core at approximately 2000 simultaneous channel connections. The server process uses nearly an entire core on a Xeon E5520.

A configuration option for spawning X child processes, to go along with multithreading support, would be superb. I've started looking into how much work this would entail. The main thing to figure out is the best way to handle interprocess communication, since child processes currently can't share channel or chat state.

nuclearace commented 10 years ago

Can node.js even do multithreading?

Kirtaner commented 10 years ago

Yes, http://nodejs.org/docs/latest/api/cluster.html

Kirtaner commented 10 years ago

One idea: use Redis or MongoDB to hold chat/channel state and broadcast it to all child workers. This could probably be extended to multi-server sharding.

calzoneman commented 10 years ago

@nuclearace No, node doesn't support multithreading. If it did, it would be pretty easy to make the server multithreaded (as opposed to the cluster module which is based on process forking).

This is a cool idea, and I have actually debated pursuing an IRC-like model in the past, but I simply don't have the time to investigate a change this large at the moment. Recently my cytube efforts have been focused on the 3.0 release, which involves a lot of UI changes as well as some server-side refactoring and restructuring. It might be easier to build off of that release in the future, but between that, other projects, school, and research I can't really dedicate time to such a large and fundamental change in the overall design.

Of course, if you have someone up to the task, they are more than welcome to take a whack at it (although node's page for the cluster module lists its stability level as "1 - Experimental", so tread carefully).
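
For reference, the process-forking model of the cluster module discussed above can be sketched roughly as follows. This is a minimal sketch, not the project's code: `resolveWorkerCount` and `startWorkers` are hypothetical names, and the idea of a configured worker count is taken from the request at the top of this issue. Each forked worker is a separate process with its own memory, which is why channel and chat state would not be shared automatically.

```javascript
// Sketch only: resolveWorkerCount and startWorkers are hypothetical helpers,
// not part of the project.
const cluster = require('cluster');
const os = require('os');

// Clamp a configured worker count to the range [1, number of CPU cores].
// A missing or non-integer setting falls back to one worker per core.
function resolveWorkerCount(configured, cpuCount = os.cpus().length) {
  const n = Number.isInteger(configured) ? configured : cpuCount;
  return Math.max(1, Math.min(n, cpuCount));
}

// In the primary process: fork `count` workers and replace any that exit.
// In a worker process: run `runWorker` (for example, start the server).
function startWorkers(count, runWorker) {
  // cluster.isPrimary exists on newer Node versions; older ones use isMaster.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (!isPrimary) {
    runWorker();
    return;
  }
  for (let i = 0; i < count; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker, code, signal) => {
    console.error(
      `worker ${worker.process.pid} exited (${signal || code}); restarting`
    );
    cluster.fork();
  });
}

// Possible usage (not executed here):
// startWorkers(resolveWorkerCount(4), () => { /* start the server */ });
```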

Kirtaner commented 10 years ago

I think the first step would be enabling a NoSQL server in place of the current channel flat-file dumps. I'll take a stab at it.

Is 3.0 functional enough to set up a working development server?

calzoneman commented 10 years ago

I think that's the easy part; the hard part is communicating between processes and maintaining the central state without running into concurrent modification, deadlocks, etc.

3.0 mostly works, but there are enough things I still intend to change that I wouldn't really advise using it unless you're willing to deal with future merge conflicts.

Kirtaner commented 10 years ago

Actually, concurrency would be handled via Redis pub/sub: http://redis.io/topics/pubsub
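
The pub/sub idea can be illustrated with a small sketch. To keep it runnable without a Redis server, an in-process `EventEmitter` stands in for the Redis connection; `FakeBroker`, `WorkerChannelState`, and the `chat:` topic prefix are all hypothetical names invented for this example. The point is only that every worker subscribes to a channel's topic and applies every published message, so each worker's local copy converges on the same log.

```javascript
const { EventEmitter } = require('events');

// In-process stand-in for a Redis pub/sub connection (assumption for
// illustration only; a real deployment would use a Redis client instead).
class FakeBroker {
  constructor() {
    this.bus = new EventEmitter();
  }
  publish(topic, message) {
    this.bus.emit(topic, message);
  }
  subscribe(topic, handler) {
    this.bus.on(topic, handler);
  }
}

// Hypothetical per-worker view of a single channel's chat log.
class WorkerChannelState {
  constructor(workerId, broker, channelName) {
    this.workerId = workerId;
    this.broker = broker;
    this.topic = `chat:${channelName}`;
    this.messages = [];
    // Each worker applies every message on the topic, including its own,
    // so all workers see messages in the order the broker delivers them.
    broker.subscribe(this.topic, (raw) => {
      this.messages.push(JSON.parse(raw));
    });
  }

  // Publish instead of mutating local state directly.
  say(user, text) {
    const payload = JSON.stringify({ user, text, from: this.workerId });
    this.broker.publish(this.topic, payload);
  }
}

// Two "workers" sharing one channel through the broker.
const broker = new FakeBroker();
const workerA = new WorkerChannelState(1, broker, 'lobby');
const workerB = new WorkerChannelState(2, broker, 'lobby');
workerA.say('alice', 'hello');
workerB.say('bob', 'hi');
```

Note that pub/sub only fans messages out. It does not by itself provide locking or persistence, so the concurrent-modification concern raised earlier would still need separate handling.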

calzoneman commented 10 years ago

In the meantime, if you need a short term way to reduce processor usage, you might try profiling it with https://github.com/sidorares/node-tick and trying to cut cycles in hot code paths.

calzoneman commented 10 years ago

I've been thinking recently about how to easily shard the work across processes and across servers. This isn't a dead feature; it will just require some time for me to continue thinking about the easiest and best way to implement certain things.

Mewte commented 10 years ago

I've done it with the redis store and the cluster module, but after a week I undid all my changes. I experienced a lot of memory leaks with the redis store and only a minimal performance boost. I'll be attempting it again soon, but after I switch from socket.io to something more flexible like primus.io so I'm not locked in to just one module. The key aspect would be splitting your application logic into smaller scalable pieces. I'm finding that to be the most difficult part. Redis pub/sub should also be useful for maintaining system state among all the separate processes and servers.

Xaekai commented 8 years ago

This is almost ripe for closing. We just need to document the distribution config.

calzoneman commented 8 years ago

I wouldn't quite say so. There's still a bit of work I'd like to do for partitioning before I add any more shards, and that still only resolves the issue of many channels. There is still the project of splitting the socket.io termination from the channel backend (which is getting a bit dusty on the shelf, but I may pick it up again soon).
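
One common way to partition channels across shards is to hash the channel name to a shard index. The sketch below shows that general technique only; it is not a description of how the project actually assigns channels, and `shardForChannel` is a hypothetical name.

```javascript
const crypto = require('crypto');

// Hypothetical helper: map a channel name to a shard index in
// [0, shardCount). Names are lowercased so lookups are case-insensitive.
function shardForChannel(channelName, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  const digest = crypto
    .createHash('sha256')
    .update(channelName.toLowerCase())
    .digest();
  // Use the first 4 bytes of the digest as an unsigned 32-bit integer.
  return digest.readUInt32BE(0) % shardCount;
}
```

With plain modulo hashing, changing the shard count remaps most channels, which is one reason partitioning work might need to be settled before adding shards. A consistent-hashing scheme would reduce that churn at the cost of more complexity.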