osterman opened 6 years ago
@Nuru can you help me understand the kinds of autoscaling you have in mind?
For example:
Kubernetes provides the ability to autoscale certain kinds of resources (e.g. the number of Pods in a ReplicaSet) as well as the ability to scale out the number of Nodes in the cluster. Kubernetes supports both; however, scaling out Nodes is generally a slower operation, while scaling out Pods can be nearly instantaneous. Out of the box, Kubernetes can scale on the standard kinds of metrics (CPU, memory), but it's possible to do custom instrumentation.
Autoscaling Pods is the easiest way. Basically, a small app is written and deployed as a controller in the cluster; this can be as simple as a bash script or as complicated as a Go app. Essentially, it automates the following command (which can also be run manually):

```shell
kubectl scale -n $namespace --replicas=$desiredPods deployment/$deployment
```
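A minimal sketch of such a bash controller, assuming a per-pod socket budget and a hard-coded metric value standing in for a real metric source; the `kubectl scale` call is only echoed so the sketch is safe to run outside a cluster:

```shell
#!/usr/bin/env bash
# Hypothetical controller sketch: scale a deployment so no pod exceeds
# a per-pod socket budget. Names and numbers are assumptions, not from
# the thread.
namespace="default"
deployment="my-app"   # hypothetical deployment name
max_per_pod=10000     # assumed per-pod socket limit

# Ceiling division: smallest replica count that keeps each pod under budget.
desired_replicas() {
  local total=$1
  echo $(( (total + max_per_pod - 1) / max_per_pod ))
}

# A real controller would loop, reading the metric each iteration;
# shown once here with a hard-coded metric value.
total_sockets=23000
replicas=$(desired_replicas "$total_sockets")
echo kubectl scale -n "$namespace" --replicas="$replicas" "deployment/$deployment"
```

In a real deployment the loop would also need a cooldown and min/max replica bounds so it doesn't fight other scaling signals, which is exactly the coordination question raised below.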
@osterman The above is a great start on the documentation I am asking for.
I am asking specifically how we would connect a metric we generate to our Kubernetes control plane to scale pods and nodes up and down in a way that coordinates with other scaling signals (rather than overriding them).
For example, let's say we decide to have a pod that holds long-lived TCP connections to users, and we want to limit the number of open sockets in the pod to 10,000 and the number of open sockets on the node to 50,000. Assume we have a shell script `sockcount` that returns the number of open sockets as reported by the kernel it is running on.
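One assumption about what `sockcount` could look like on Linux — the thread only says it reports what the kernel sees, and reading `/proc/net/sockstat` is one way to do that (kernel-wide, not per-pod):

```shell
#!/usr/bin/env bash
# Hypothetical sockcount sketch: the "sockets: used N" line in
# /proc/net/sockstat is the kernel's count of sockets in use.
sockcount() {
  awk '/^sockets: used/ {print $3}' /proc/net/sockstat
}

sockcount
```

Note that counting per pod rather than per node would require running this inside the pod's network namespace, so where the script executes matters for the two limits above.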
Questions I have include:
- For `<path-to-kubeconfig>` and `<ip-address-of-apiserver>`, where do we find the values to substitute in?
- what
See #20 (asked by @Nuru)