Australian-Imaging-Service / charts

Apache License 2.0

Horizontal Autoscaling Doesn't work #41

Closed exxa-tech closed 3 years ago

exxa-tech commented 3 years ago

I did a LOT of testing to track down the issue here. There are a lot of moving parts:

  1. Metrics Server needs to be enabled.
  2. cluster-autoscaling needs to be enabled at the Cluster level to autoscale EC2 nodes. https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html
  3. The StorageClass needs to be changed, both for the default (gp2) and, in my case, for the custom class I used for ObjectiveFS to mount, from volumeBindingMode: Immediate to volumeBindingMode: WaitForFirstConsumer.

This is to allow multizone autoscaling for multi-AZ deployments (pretty essential) - read more: https://aws.amazon.com/blogs/containers/amazon-eks-cluster-multi-zone-auto-scaling-groups
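
For reference, a StorageClass with delayed binding might look roughly like the sketch below. The name gp2 comes from the list above; the provisioner, the type parameter, the default-class annotation, and the reclaim policy are assumptions, not values taken from the cluster in this issue. Note that volumeBindingMode generally cannot be edited on an existing StorageClass, so the class usually has to be deleted and recreated with the new setting.

```yaml
# Sketch only: every value except volumeBindingMode is an assumption.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs   # in-tree EBS provisioner (assumed)
parameters:
  type: gp2
reclaimPolicy: Delete
# Delay volume binding until a pod is scheduled, so the volume is
# created in the same availability zone as the node running the pod.
volumeBindingMode: WaitForFirstConsumer
```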

  4. Resources need to be specified in values file (example - will tweak):

    resources:
      limits:
        cpu: 200m
        memory: 1024Mi
      requests:
        cpu: 200m
        memory: 1024Mi

  5. HPA needs to be specified in values file:

    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 100
      targetCPUUtilizationPercentage: 50
      targetMemoryUtilizationPercentage: 80

...And it doesn't work.

    horizontalpodautoscaler.autoscaling/xnat-xnat-web   Deployment/xnat-xnat-web   <unknown>/80%, <unknown>/50%   1   100   0   88m

    k -nxnat get deployment xnat-xnat-web
    Error from server (NotFound): deployments.apps "xnat-xnat-web" not found

After looking at hpa.yaml in the templates directory for xnat-web, it becomes clear. The scale target specifies:

    kind: Deployment

There is no Deployment called xnat-xnat-web; the workload is a StatefulSet, so the HPA has nothing to scale.

Therefore this needs to be changed to:

    kind: StatefulSet

I have tested by creating an hpa.yaml and applying it directly and it works - both at the pod (HPA) and node (cluster-autoscale) level.
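
A standalone manifest for this kind of test could look like the sketch below. It is not the actual file used here. The apiVersion is an assumption (autoscaling/v2 is available from Kubernetes 1.23; older clusters would need autoscaling/v2beta2), while the name, namespace, and thresholds are taken from the values and command output quoted above.

```yaml
# Sketch only: apiVersion is assumed, other values mirror the issue text.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: xnat-xnat-web
  namespace: xnat
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet          # was Deployment in the chart template
    name: xnat-xnat-web
  minReplicas: 2
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

Once applied, the REFERENCE column of the HPA listing should read StatefulSet/xnat-xnat-web, and the targets should show real percentages as soon as Metrics Server has data for the pods.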

Dean - please give me the go ahead to change this one line of code. Thanks!

dean-taylor commented 3 years ago

Alastair, happy for this change to be committed to the main repo. Please make sure the patch number for the umbrella chart has been bumped so that a new release is created.

exxa-tech commented 3 years ago

Done and done. Chart release 0.4.4, Helm chart successfully pushed. Closing issue.