Concerns about recommending `cluster` over existing solutions

jonjamz commented 9 years ago

I think this is a great concept, especially because it's so easy to implement. But I'm wondering if it should be recommended for serious production use. If you would make that recommendation, it would be nice to see a comparison between cluster and the other solutions in the docs. Specifically:

1) The docs mention that using cluster is as easy as adding a Meteor package. So I'm guessing it puts the request handling and load balancing inside the Meteor application itself and uses the application's database? If not, it's a bit more complicated.
2) There are good existing solutions that auto-scale load balancers, and I know you're aware of them--what are they, and why is cluster better for Meteor apps?
3) If DDP connections are dropped while a load balancer is manually scaled (for the very short time it takes to restart the load balancer), is that a problem as long as there are sticky sessions? What happens?
4) The load balancers in question, like HAProxy, are battle-tested in production and have been under development for years. HAProxy supports SSL, HTTP compression, health checks, redispatching, a very useful ACL feature, and complex redirects, among other features. It's written in C and can handle upwards of 40K hits per second. At what point does someone decide to use cluster instead of HAProxy, or vice versa?

These points aren't meant to criticize your package--as I said, it's really cool. And of course, running a real-time Meteor app is very different from running an HTTP request/response-style app.

andrewreedy commented 9 years ago

Also, with HAProxy and many other tested cluster/cloud management solutions there are ways of adding nodes with zero downtime. Is this the main problem Cluster is trying to solve?

MaazAli commented 9 years ago

Additionally, will this work on a multi-core server? For example, by launching multiple Node instances and load balancing across them?

arunoda commented 9 years ago

Alright. First I need to give an overview of what this does. I'm not against any of the other services; this is a DDP load balancer that does more than load balancing. I haven't done a performance test yet, but I'll do one soon.

Let me address a few of your points.

1) It uses the database as the service discovery solution, filling the role that Consul or etcd would. In this case we did it with MongoDB for ease of use, but it's easy to implement with etcd and so on. So the database is used to announce new servers to the cluster. There is a simple interface for that, and it can be implemented for other systems as well.

2) There are a few solutions, and they are based on etcd or Consul for service discovery. I've seen a couple of them use HAProxy, and some use Go-based solutions. Basically, these discovery services help rewrite the routing table. But this is an evolving area and there is no perfect solution (especially for Meteor, since we're talking about WebSockets with sticky sessions).

3) It's not a problem if it only happens once. But if you simply add a route to any of the load balancers (nginx and HAProxy), they need to reset the WebSocket connections. Zero downtime is for HTTP requests (there are some complex modes, but they cause stability problems with WebSockets, since those are long-lasting connections). So once the restart happens, all of your clients will reconnect and start to subscribe again, and that's a DDoS against your own app :) Have you seen high CPU spikes when you deploy an app? This is the same reason.

4) I agree. We use node http-proxy behind the scenes, and it's battle-tested as well. Since we're talking about realtime connections, this is a very different take on load balancing.
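To make the "simple interface" from point 1 a bit more concrete, here is a minimal sketch of what a pluggable discovery backend could look like. This is not cluster's actual interface, which isn't shown in this thread; `InMemoryDiscovery`, `register`, `unregister`, and `pickEndpoint` are hypothetical names, and the store is an in-memory object standing in for MongoDB or etcd.

```javascript
// Hypothetical discovery backend, for illustration only (not cluster's API).
// A MongoDB- or etcd-backed version would expose the same three operations.
function InMemoryDiscovery() {
  this.services = {}; // service name -> array of endpoint URLs
}

// Announce an endpoint for a service; duplicate registrations are ignored.
InMemoryDiscovery.prototype.register = function (name, endpoint) {
  var list = this.services[name] || (this.services[name] = []);
  if (list.indexOf(endpoint) === -1) list.push(endpoint);
};

// Remove an endpoint, e.g. when a server shuts down.
InMemoryDiscovery.prototype.unregister = function (name, endpoint) {
  var list = this.services[name] || [];
  this.services[name] = list.filter(function (e) { return e !== endpoint; });
};

// Return one registered endpoint at random, or null if none exist.
InMemoryDiscovery.prototype.pickEndpoint = function (name) {
  var list = this.services[name] || [];
  if (list.length === 0) return null;
  return list[Math.floor(Math.random() * list.length)];
};
```

Because callers only see these three operations, swapping the storage layer would not change the code that asks for an endpoint.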

Now, let me explain how cluster is different and why you might choose it over the others.

1) Cluster makes any node in your cluster a load balancer. That means multi-master mode (just like Cassandra). Additionally, you can add separate load balancers to process DDP as well. We call them balancers. You can add or remove balancers without affecting other parts of the cluster.

2) This does very low-level DDP proxying. In big apps, scaling is not about tools; it's about techniques. Cluster gives those techniques to Meteor. For example, once you load the app, cluster will ask the browser to connect to a different node in your cluster for DDP. That means we distribute the DDP load over all the instances. This is a technique big apps use to scale. It's about multiple entry points.

3) With cluster, you can do resource-based distribution. That means you can distribute load based on CPU usage, memory usage, or throughput. Because of that, you can add servers from different cloud services and use them together. This is something you can't do easily with other tools. (This is not yet implemented, but you can implement it with basic JavaScript.)

4) You can run A/B tests very easily. You can change the routing rules dynamically, which is very hard with other services. (This is something you can implement manually.)

5) You can route users to the server nearest to them. (Again, something you can implement manually.)

6) You can run services like rate limiting on the DDP level.
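Since point 3 says resource-based distribution isn't implemented yet but can be done in basic JavaScript, here is a rough sketch of the selection step only. It assumes each node reports a CPU load figure somewhere the balancer can read it; `pickLeastLoaded` and the `cpu` field are hypothetical and not part of cluster.

```javascript
// Hypothetical helper: choose the node reporting the lowest CPU load.
// `nodes` is an array like [{ endpoint: '...', cpu: 0.42 }, ...].
// Neither the function nor the field names come from cluster itself.
function pickLeastLoaded(nodes) {
  if (nodes.length === 0) return null;
  return nodes.reduce(function (best, node) {
    // Keep the earlier node on ties so the result is deterministic.
    return node.cpu < best.cpu ? node : best;
  });
}
```

The same shape would work for memory or throughput by comparing a different field.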

MicroServices and Service Discovery

Everything above was about load balancing. But the biggest feature here is service discovery. Let me talk a bit about it. In the future, we will write our apps as a collection of microservices. That means we will have separate services for:

search
web
aggregations
background jobs
handling traffic from other APIs

We can use multiple Meteor apps (not necessarily built with the current Meteor; they could be written in Go, Java, etc.). Each of them communicates via DDP. So basically, once these services start, they can register themselves with the cluster.

For example, the search app does this:

Cluster.register('search')

And now, from any part of your cluster, you can get a DDP connection to the above service with:

var conn = Cluster.discoverConnection('search')

You don't need to worry about IP addresses and host names. You can have multiple instances of search nodes, so you can scale the resources you need most. Cluster will take care of the communication between them.

You can discover DDP connections on both the client and the server using the discoverConnection API above. And yes, we have authentication support.
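As a rough sketch of how the discovered connection might then be used: a Meteor DDP connection exposes `call`, so a caller can invoke a method on whichever search node it was handed. The helper name `searchVia` and the method name `'search.query'` are hypothetical; the cluster object is passed in as a parameter so the function can also be exercised outside a Meteor app.

```javascript
// Runs a search through whichever 'search' node the cluster hands back.
// `cluster` is expected to be the Cluster object from the package.
// 'search.query' is a made-up method name; the search app would define its own.
function searchVia(cluster, term, callback) {
  var conn = cluster.discoverConnection('search');
  // Standard DDP method call: method name, arguments, then callback(err, result).
  conn.call('search.query', term, callback);
}
```

Inside an app this would be called as `searchVia(Cluster, 'meteor', function (err, results) { ... })`, with no host names or IP addresses in the calling code.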

Using with existing tools

One thing we don't do is SSL support (and we can't; Node is bad at SSL). We can use nginx, stud, or bud for that.

So basically, this is a DDP clustering solution built for microservices, not just a load balancer :) I will reveal a lot more in the webinar :)

jonjamz commented 9 years ago

:+1:

fix commented 9 years ago

:point_up_2:

Sewdn commented 9 years ago

great response! very elaborate and helpful... You rock!

arunoda commented 9 years ago

Thanks.

On Sun Feb 15 2015 at 4:02:42 AM Pieter Soudan notifications@github.com wrote:

great response! very elaborate and helpful... You rock!


meteorhacks / cluster