Closed rsignell-usgs closed 5 years ago
@jreadey , there is a Helm chart for autoscaling k8s on AWS: https://github.com/kubernetes/charts/tree/master/stable/cluster-autoscaler
How has the scheduled scaling been working for you?
It seems quite wasteful. Are there reasons not to implement cluster autoscaling?
@jreadey , have you tried following the instructions here: https://akomljen.com/kubernetes-cluster-autoscaling-on-aws/
@jacobtomlinson, do those look reasonable?
I tried, and didn't even get out of the starting gate:
(IOOS3) rsignell@gamone:~> kops edit ig nodes
Using cluster from kubectl context: kubernetes
State Store: Required value: Please set the --state flag or export KOPS_STATE_STORE.
A valid value follows the format s3://<bucket>.
A s3 bucket is required to store cluster state information.
@jreadey, do you know the S3 location of the kops state store?
Yup looks totally reasonable to me!
@jreadey , nevermind. I found it. https://s3-us-west-2.amazonaws.com/hdflab-kubernetes-k8sstack-sflx-clusterinfobucket-11famro67p10g/cluster-info.yaml
@jreadey , hmm, maybe not:
rsignell@gamone:~> export KOPS_STATE_STORE=s3://hdflab-kubernetes-k8sstack-sflx-clusterinfobucket-1
1famro67p10g
rsignell@gamone:~> kops get clusters
error reading state store: Could not retrieve location for AWS bucket hdflab-kubernetes-k8sstack-sflx-clusterinfobucket-11famro67p10g
Now that the kops cluster is working, I followed these instructions: https://akomljen.com/kubernetes-cluster-autoscaling-on-aws/ with these settings:
helm install --name autoscaler \
--namespace kube-system \
--set image.tag=v1.2.1 \
--set autoDiscovery.clusterName=kopscluster.k8s.local \
--set extraArgs.balance-similar-node-groups=false \
--set extraArgs.expander=random \
--set rbac.create=true \
--set rbac.pspEnabled=true \
--set awsRegion=us-east-1 \
--set nodeSelector."node-role\.kubernetes\.io/master"="" \
--set tolerations[0].effect=NoSchedule \
--set tolerations[0].key=node-role.kubernetes.io/master \
--set cloudProvider=aws \
stable/cluster-autoscaler
Seems like it's running anyway:
$ kubectl get pods -l "app=aws-cluster-autoscaler" -n kube-system
NAME READY STATUS RESTARTS AGE
autoscaler-aws-cluster-autoscaler-7c494d8cbb-nvp58 1/1 Running 0 3m
Autoscaling is working!!!
Right now our Kubernetes nodes are always running, regardless if anyone is using them.
We really need to implement the kubernetes Cluster Autoscaler on AWS
I think we should try to implement this ASAP.
@jreadey, does this make sense to you?
Or do we need to try to get help?