dandi / dandi-hub

Infrastructure and code for the dandihub
https://hub.dandiarchive.org

Minimum node count in EKS clusters in production does not reflect Terraform value set in do_eks setup #160

Closed by aaronkanzer 1 month ago

aaronkanzer commented 3 months ago

Cc @asmacdo @kabilar @satra

In our Terraform configuration for do_eks, the default minimum node count for the node group is set to 1 (see here: https://github.com/dandi/dandi-hub/blob/do-eks/main.tf#L103-L105). In practice, however, the cluster seems to stay at 4 nodes:

NAME                                           STATUS   ROLES    AGE     VERSION
ip-100-64-108-217.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-113-247.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-132-251.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-224-60.us-east-2.compute.internal    Ready    <none>   2d23h   v1.27.12-eks-ae9a62a

The purpose of this ticket is to (1) investigate why, and (2) possibly lower the minimum node count to reduce cost.
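
For the investigation step, a minimal sketch of comparing the number of running nodes against the configured minimum is shown here. It assumes the node listing above is captured as plain text (it has the column layout of `kubectl get nodes`); the helper names `count_ready_nodes` and `excess_over_minimum` are hypothetical and not part of the repository.

```python
# Sketch only: parse a node listing like the one above and compare the
# number of Ready nodes with the minimum configured in Terraform.

SAMPLE = """\
NAME                                           STATUS   ROLES    AGE     VERSION
ip-100-64-108-217.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-113-247.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-132-251.us-east-2.compute.internal   Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
ip-100-64-224-60.us-east-2.compute.internal    Ready    <none>   2d23h   v1.27.12-eks-ae9a62a
"""


def count_ready_nodes(listing: str) -> int:
    """Count rows whose STATUS column is exactly 'Ready'.

    Rows such as 'Ready,SchedulingDisabled' are deliberately not counted.
    """
    lines = [ln for ln in listing.strip().splitlines() if ln.strip()]
    rows = [ln.split() for ln in lines[1:]]  # skip the header row
    return sum(1 for cols in rows if len(cols) >= 2 and cols[1] == "Ready")


def excess_over_minimum(ready_nodes: int, configured_min: int) -> int:
    """Number of nodes running beyond the configured minimum (never negative)."""
    return max(0, ready_nodes - configured_min)


if __name__ == "__main__":
    ready = count_ready_nodes(SAMPLE)
    # configured_min=1 is the value described in this issue for do_eks.
    print(ready, excess_over_minimum(ready, configured_min=1))  # prints "4 3"
```

An excess above zero does not by itself show that the minimum is being ignored, since running workloads can also keep nodes alive; it only flags that the question is worth investigating.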

kabilar commented 3 months ago

https://github.com/awslabs/data-on-eks/issues/556

asmacdo commented 1 month ago

From the upstream issue, it looks like the desired node count doesn't have much of an effect. Setting the minimum node count after the fact didn't seem to help much either, but the new LINC deployment is down to a single node.
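
One way to check whether a minimum set after the fact actually reached AWS is to read the live scaling configuration and compare it with the Terraform value. The sketch below assumes the nodes belong to an EKS managed node group (this would not apply to nodes provisioned some other way) and uses the boto3 `describe_nodegroup` call; the cluster and node group names are placeholders, and the values in the usage example are invented for illustration, not taken from the deployment.

```python
from typing import Dict, List


def scaling_mismatches(live: Dict[str, int], expected: Dict[str, int]) -> List[str]:
    """Return a description of every expected key whose live value differs."""
    problems = []
    for key, want in expected.items():
        got = live.get(key)
        if got != want:
            problems.append(f"{key}: live={got}, expected={want}")
    return problems


def fetch_scaling_config(cluster_name: str, nodegroup_name: str, region: str) -> Dict[str, int]:
    """Read minSize/maxSize/desiredSize for a managed node group.

    Requires boto3 and AWS credentials; imported lazily so the comparison
    helper above can be used without either.
    """
    import boto3

    eks = boto3.client("eks", region_name=region)
    resp = eks.describe_nodegroup(clusterName=cluster_name, nodegroupName=nodegroup_name)
    return resp["nodegroup"]["scalingConfig"]


if __name__ == "__main__":
    # Illustrative values only. In practice, something like:
    #   live = fetch_scaling_config("my-cluster", "my-nodegroup", "us-east-2")
    # where both names are placeholders.
    live = {"minSize": 4, "maxSize": 8, "desiredSize": 4}
    print(scaling_mismatches(live, {"minSize": 1}))
```

If the live `minSize` already matches Terraform while the node count stays higher, the cause is more likely the desired size or the workloads keeping nodes busy than the minimum itself.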

asmacdo commented 1 month ago

Estimated savings: $72/month per deployment
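
As a rough check on this estimate, the arithmetic can be sketched as follows. The thread does not state the instance type or how many nodes were removed, so this only converts the $72/month figure into an implied hourly rate and an annual total; the 730 hours per month and the deployment count are assumptions.

```python
HOURS_PER_MONTH = 730  # assumption: 8760 hours per year / 12 months


def implied_hourly_saving(monthly_saving_usd: float) -> float:
    """Hourly compute cost implied by a monthly saving."""
    return monthly_saving_usd / HOURS_PER_MONTH


def annual_saving(monthly_saving_usd: float, deployments: int = 1) -> float:
    """Yearly saving across a number of deployments."""
    return monthly_saving_usd * 12 * deployments


if __name__ == "__main__":
    print(f"{implied_hourly_saving(72):.4f} USD/hour")  # roughly 0.0986
    print(annual_saving(72), "USD/year for one deployment")  # 864
```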