moby / swarmkit

A toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.
Apache License 2.0

Proposal: elastic autoscaling app & dynamic routing standard #1663

Open bhajian opened 7 years ago

bhajian commented 7 years ago

I have been working with Docker Swarm lately and noticed that it lacks features such as elastic scaling and a dynamic routing standard. The gap is that the Swarm manager does not automatically scale the app up or down based on stats such as CPU, heap, or traffic. Another issue is that there is no standard way to configure a load balancer (Nginx, Apache, HAProxy) with the latest information about the worker nodes. My proposal/suggestion is:

stevvooe commented 7 years ago

@bhajian I'm not sure what you mean by "dynamic routing standard". There is already L3/4 load balancing built in that tracks task location to route connections. You can use the service name as a dns name to leverage this in your application.

In general, we are working to collect metrics for swarm engine nodes and possibly for applications as well. These could be used as feedback for auto-scaling decisions, but making those decisions usually requires application-specific logic. It's likely that we'd build in support for hooks to scale based on user logic, rather than build in auto-scaling directly.
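As a rough illustration of the kind of user-supplied logic such a hook might run, here is a minimal sketch in Python. The function name `desired_replicas`, the proportional rule, and every parameter are assumptions made for illustration; none of them are part of swarmkit.

```python
import math


def desired_replicas(current, observed, target, min_replicas=1, max_replicas=10):
    """Suggest a replica count that brings the per-replica metric near `target`.

    Hypothetical policy: scale the replica count by the ratio of the observed
    metric (for example average CPU percent per replica) to the target value,
    then clamp the result to [min_replicas, max_replicas].
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(current * observed / target)
    return max(min_replicas, min(max_replicas, wanted))
```

With 4 replicas at 90% CPU and a 60% target, this rule asks for 6 replicas. A real policy would likely add cooldown periods and smoothing so that the service does not flap between sizes.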

For an example of an L7 load balancer, see https://github.com/stevvooe/sillyproxy. It demonstrates scalable L7 load balancing on top of swarm primitives.

All in all, this doesn't seem complete enough to be called a "Proposal". Do you mind if I mark this as a feature request?

bhajian commented 7 years ago

@stevvooe Thanks for your prompt response. Yes, please feel free to mark it as a feature request. I agree that the embedded load balancer is very useful in many cases. However, there are some cases where we need an external load balancer such as Nginx. By L3/4, do you mean using the overlay network for load balancing? For example, we may want to monitor the health of each application (rather than the container) with a heartbeat that sends a request to the web app. If the response is good, we assume the app is healthy; otherwise, we update the Nginx routing table. I agree this is application-specific and may not belong in Swarm itself. Autoscaling would bring elasticity to the cluster and could be tagged as future work for SwarmKit.
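The application-level heartbeat described here could look roughly like the sketch below. The `/health` path, the function names, and the `host:port` backend format are assumptions for illustration; the probe is injectable so the filtering logic can be exercised without a network.

```python
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the app answers the heartbeat URL with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and 4xx/5xx responses count as unhealthy.
        return False


def healthy_backends(backends, path="/health", probe=http_probe):
    """Keep only the backends whose application-level heartbeat succeeds.

    `backends` is a list of "host:port" strings (a hypothetical format).
    """
    return [b for b in backends if probe(f"http://{b}{path}")]
```

The surviving list is what an external load balancer's routing table would be regenerated from whenever it changes.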

stevvooe commented 7 years ago

@bhajian An L3/4 load balancer operates on TCP/UDP flows, as opposed to the application layer (L7). In practice, this means it balances TCP connections rather than individual HTTP requests.
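The practical difference shows up with long-lived connections. The toy model below is a simplified sketch that assumes plain round-robin in both cases, which is not how any particular balancer is guaranteed to behave. It counts how many requests each backend receives when the choice is made once per connection versus once per request.

```python
from collections import Counter
from itertools import cycle


def balance_l4(connections, backends):
    """Pick a backend once per connection; all its requests go to that backend."""
    rr = cycle(backends)
    counts = Counter()
    for requests_on_connection in connections:
        counts[next(rr)] += requests_on_connection
    return counts


def balance_l7(connections, backends):
    """Pick a backend for every individual request, regardless of connection."""
    rr = cycle(backends)
    counts = Counter()
    for requests_on_connection in connections:
        for _ in range(requests_on_connection):
            counts[next(rr)] += 1
    return counts
```

Here `connections` is a list of request counts, one entry per connection. With one busy keep-alive connection and three short ones, `[100, 1, 1, 1]`, across two backends, the per-connection model sends 101 requests to one backend and 2 to the other, while the per-request model splits them 52 and 51.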

You can seed the nginx upstream via DNS by resolving tasks.<service>, with the service in the DNSRR endpoint mode.
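A minimal sketch of that seeding step, assuming it runs in a container attached to the same overlay network as the service: resolve the task hostname to its A records, then render an nginx `upstream` block. The function names and the idea of regenerating the block from a script are assumptions for illustration; the hostname is passed in rather than hard-coded.

```python
import socket


def resolve_tasks(hostname, port, resolver=socket.getaddrinfo):
    """Resolve a hostname to a sorted list of unique IPv4 addresses."""
    infos = resolver(hostname, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def render_upstream(name, ips, port):
    """Render an nginx upstream block with one server line per task IP."""
    lines = [f"upstream {name} {{"]
    lines += [f"    server {ip}:{port};" for ip in ips]
    lines.append("}")
    return "\n".join(lines)
```

For a service called `web`, something like `render_upstream("web", resolve_tasks("tasks.web", 80), 80)` would produce the block to write into the nginx configuration before reloading. Tasks come and go, so the lookup has to be repeated periodically.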

bhajian commented 7 years ago

@stevvooe Thanks Steve, Yeah, I kind of did the same thing.

stevvooe commented 7 years ago

@bhajian Also, we are working on more direct access primitives with https://github.com/docker/swarmkit/pull/1645 and DNS mapping in https://github.com/docker/swarmkit/issues/1242.

markvr commented 7 years ago

https://github.com/docker/dockercloud-haproxy is an L7 load balancer that automatically reconfigures as services change. Given that this is written by Docker, I'm surprised they don't promote it more. Swarm would be unusable for me without it. Despite the name (i.e. dockercloud), it works fine with swarm as well.

endeepak commented 6 years ago

We needed autoscaling based on metrics stored in Prometheus. We built a service that uses the Docker API to scale based on those metrics: https://github.com/sahajsoft/docker-swarm-service-autoscaler
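For readers who want to see the general shape of such a loop, here is a minimal sketch. It is not the linked project's implementation. It reads one value from the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and, as a simplification, shells out to the `docker service scale` CLI instead of calling the Docker API directly. The function names and the base URL argument are assumptions for illustration.

```python
import json
import subprocess
import urllib.parse
import urllib.request


def query_prometheus(base_url, promql, timeout=5.0):
    """Run an instant query against Prometheus and return the parsed JSON body."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def first_sample(payload):
    """Extract the first sample value from an instant-vector response."""
    if payload.get("status") != "success":
        raise RuntimeError("Prometheus query failed")
    result = payload["data"]["result"]
    if not result:
        raise LookupError("query returned no samples")
    # Each sample is [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


def scale_command(service, replicas):
    """Build the docker CLI invocation that sets the replica count."""
    return ["docker", "service", "scale", f"{service}={replicas}"]


def scale_service(service, replicas):
    """Apply the new replica count; must run where the docker CLI reaches a manager."""
    subprocess.run(scale_command(service, replicas), check=True)
```

A periodic job would call `first_sample(query_prometheus(...))`, decide on a replica count with whatever policy fits the application, and pass the result to `scale_service`.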