GSA / data.gov

Main repository for the data.gov service
https://data.gov

Setup auto-scaling for Managed Node Groups (MNGs) #3669

mogul closed 2 years ago

mogul commented 2 years ago

User Story

In order to avoid running out of compute resources underlying brokered SOLR instances, data.gov wants EC2 instances to be added on an as-needed basis to handle EKS workloads.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

Background

AWS Managed Node Groups have autoscaling capabilities built in by default. However, for Kubernetes to utilize the AWS capability, we need to deploy an autoscaling mechanism in the cluster that is aware of the AWS infrastructure. (The EKS module we're using also has an example of configuring cluster autoscaling.)
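To make the idea concrete, here is a minimal sketch of the kind of decision an AWS-aware autoscaler makes: given pods that cannot be scheduled, work out how many more nodes to request, without leaving the node group's min/max bounds. This is not the Cluster Autoscaler's or Karpenter's actual algorithm (real autoscalers simulate scheduling and bin-packing); the `NodeGroup` class, the `desired_size_for_pending` function, and the CPU-only sizing are all hypothetical simplifications for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeGroup:
    """Hypothetical stand-in for a managed node group's scaling settings."""
    min_size: int
    max_size: int
    desired_size: int
    cpu_per_node_millicores: int  # assumed allocatable CPU per node


def desired_size_for_pending(group, pending_pod_cpu_millicores):
    """Return a new desired size that would fit the pending pods.

    Simplification: only CPU requests are summed, and pods are assumed
    to pack perfectly onto nodes. Real autoscalers do far more.
    """
    if not pending_pod_cpu_millicores:
        # Nothing is unschedulable, so leave the group as it is.
        return group.desired_size

    needed_cpu = sum(pending_pod_cpu_millicores)
    extra_nodes = math.ceil(needed_cpu / group.cpu_per_node_millicores)
    target = group.desired_size + extra_nodes

    # Never go outside the bounds configured on the node group.
    return max(group.min_size, min(group.max_size, target))


if __name__ == "__main__":
    group = NodeGroup(min_size=1, max_size=5, desired_size=2,
                      cpu_per_node_millicores=2000)
    print(desired_size_for_pending(group, [1500, 1500, 500]))
```

In a real setup the resulting number would be applied by the autoscaler to the Auto Scaling group behind the node group, which is why it needs AWS permissions in addition to its view of pending pods.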

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!] None.

Sketch

mogul commented 2 years ago

I stumbled on a recent blog post describing how to use Karpenter with spot instances.

nickumia-reisys commented 2 years ago

It's working!! A node randomly appears!

image image

nickumia-reisys commented 2 years ago

This seems exceedingly unhelpful 😞 It started another node, but the first node never became ready... Lots of people were experiencing the issue here, but alas, no resolution. Will keep looking.

image

nickumia-reisys commented 2 years ago

So happy 😭

image

nickumia-reisys commented 2 years ago

🥲 image

nickumia-reisys commented 2 years ago

This is a problem that we hit. Not sure of a workaround right now, but it might not be super important.