wandenberg / nginx-push-stream-module

A pure stream http push technology for your Nginx setup. Comet made easy and really scalable.

Instructions on scaling #197

Closed msurguy closed 8 years ago

msurguy commented 9 years ago

Hi! This is a really great module and I appreciate your work on this project. I will be using it for a live blogging system soon.

Are there any instructions on how to scale this module to more than one server, or any information on how to optimize the module's performance and which problems might arise?

I would really appreciate any pointers that could be added to the project's Readme.

Thanks!

misiek08 commented 8 years ago

What exactly will you be using this module for? Chat, adding posts, something else?

msurguy commented 8 years ago

@misiek08 ideally it will be used for adding micro blog posts. The problem is that there could be quite a few live events running on the platform at once, potentially hitting 100,000 users online.

misiek08 commented 8 years ago

2-3 light servers running the module should be enough. For hosting the websites themselves you need more servers, of course. Handle sending posts via AJAX calls, so the backend can be written in PHP, Ruby, Python or Node.js; it then just publishes each message to those 2-3 servers running the module. If you don't believe this approach works, read the Disqus use case on HighScalability. I did it the same way a few months before Disqus and it worked better than you can imagine.

msurguy commented 8 years ago

@misiek08 I totally believe in what you have said; my only question was how you handle the distribution on the client side. Do you just generate a random number in JS (say, an index into an array that has all 2-3 servers' domain names / IPs) and connect to the randomly chosen push server? Or is that process somehow masked by another server, so that the client always connects to one IP/domain and the routing happens on the server side?

misiek08 commented 8 years ago

One way is to just connect to a random server from the array (with failover retry using that same array). The second way is a method where the backend serving the HTML provides an array of push servers sorted by load (the first being the least loaded), and the client connects to the first; if that fails, it tries the second, and so on.
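The first approach (random start, then failover through the array) can be sketched in a few lines of client-side JS. `tryConnect` is a hypothetical stand-in for whatever actually opens the EventSource/WebSocket/long-poll connection and reports success:

```javascript
// Pick a random starting index in the server array, then walk the array
// with wrap-around until one connection attempt succeeds. Returns the
// server we connected to, or null if every server failed.
function connectWithFailover(servers, tryConnect) {
  const start = Math.floor(Math.random() * servers.length);
  for (let i = 0; i < servers.length; i++) {
    const server = servers[(start + i) % servers.length];
    if (tryConnect(server)) {
      return server; // connected: remember which server we got
    }
  }
  return null; // all servers in the array failed
}
```

Because every client starts at an independent random index, load spreads roughly evenly across the array without any coordination, which is what makes this the simpler of the two approaches.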

msurguy commented 8 years ago

@misiek08 the second alternative is interesting; I'd appreciate it if you could elaborate on that. So I'd need to build a service that somehow checks the server load and returns the IP/domain of the least busy server? Are there any tools you would recommend for checking server load?

Thanks so much for keeping this discussion going!

misiek08 commented 8 years ago

Sorry, I only just got the email notification for this topic.

I just keep counters of the users connected to each server, broken down by connection type. I try to connect each type of client separately, so WebSockets and long-polling don't mix; that is efficient in my use case. I have a Redis database for the counters: every server has a separate key and a hash with information about its load. Using this data I'm able to return a different endpoint to each client. I'm now testing a solution with a home-made proxy written in C which is connected to Redis and proxies each user's connection to the least loaded system. By now I've made some optimizations and am able to handle all my clients' traffic on a single server, but for testing and as a proof of concept I'm building this proxy and other load balancing pieces (mostly in Lua on OpenResty).
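The selection logic behind those counters can be sketched as follows. This is an illustration, not the poster's actual Redis schema: assume the backend has already read the per-server connection counters (from the Redis hashes mentioned above) into a plain object, and just needs to hand the client a load-sorted endpoint list:

```javascript
// Sort hostnames ascending by connection count, so the client tries the
// first (least loaded) entry and fails over down the list. The counter
// object shape is an assumption standing in for the Redis hashes.
function serversByLoad(counters) {
  return Object.keys(counters).sort((a, b) => counters[a] - counters[b]);
}

function pickLeastLoaded(counters) {
  const best = serversByLoad(counters)[0];
  counters[best] += 1; // stand-in for an atomic HINCRBY against Redis
  return best;
}
```

Incrementing the winner's counter on every pick (an `HINCRBY` when the counters live in Redis) matters: without it, a burst of simultaneous requests would all be sent to the same "least loaded" server before its real connection count catches up.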