aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Discuss recommending Fargate as the default option for running Karpenter #1812

Closed DWSR closed 2 years ago

DWSR commented 2 years ago

Is an existing page relevant? No

What karpenter features are relevant? Running Karpenter. ;)

How should the docs be improved? Running Karpenter on Fargate neatly solves the "chicken and egg" problem with running Karpenter in a cluster. It avoids the need for a dedicated node group specifically for Karpenter to run on and removes the overhead of managing the lifecycle of that node group.

Because of this, I'd like to discuss making Fargate the default/recommended way to run Karpenter in EKS, rather than the current approach of a dedicated node group.

Thanks!

FernandoMiguel commented 2 years ago

We have Karpenter running on Fargate. It's the patching of CoreDNS that is our issue.

stevehipwell commented 2 years ago

Wouldn't it be better if Karpenter was running in the control plane?

FernandoMiguel commented 2 years ago

Wouldn't it be better if Karpenter was running in the control plane?

@stevehipwell what do you mean?

DWSR commented 2 years ago

@stevehipwell Yes, it would be better if Karpenter ran on the AWS-managed control plane, similar to how GKE hosts Cluster-Autoscaler. However, that involves a lot of internal AWS politics/project planning that the community cannot help with. In the meantime, folks are able to install Karpenter on Fargate into existing clusters now.

Edit: Speelling.

stevehipwell commented 2 years ago

It could run on Fargate today, but then you'd be responsible for aggregating your logs and metrics, which isn't trivial if you're not all in on CloudWatch. A statically sized MNG is probably the safest way to go today.

DWSR commented 2 years ago

@stevehipwell The triviality of observability is highly dependent on your operating context. For example, we're using Datadog, so for us pulling in Karpenter's logs and metrics was straightforward and low effort. If you're using a Prometheus stack, it will probably be more work. However, if you're monitoring the EKS control plane at all, you probably have most/all of the mechanisms in place to monitor Fargate workloads as well.

Prominently calling out the observability requirements for Fargate is a good idea if this suggestion is adopted.

I disagree that a statically sized MNG is the safest way to go. By using an MNG, you need to:

In contrast, by using a Fargate profile, you need to:

stevehipwell commented 2 years ago

@DWSR we're going to have to agree to disagree on this one. Fargate has a much greater delta from Karpenter than MNGs do. It's also not right sizing for Karpenter, you just run your initial workload on the MNG and Karpenter fills the gap. Unless you're 100% batch I'd expect you to have some static workload defined in combination with Karpenter.

As I stated previously, if you're all in on CloudWatch you might be able to adopt Fargate with minimal extra effort. But if you're actually cloud native you're going to have to do extra work to bring all of your data into a central location with Fargate due to the lack of DaemonSets.

RE MNG day 2 operations, they're currently more advanced than for Karpenter but watch this space for bringing them in line. If you can't do this now I'd suggest moving your workload to ECS.

RE MNG sizing, they're cattle so create a new one if needed; but in reality changing the size of one is a pretty great experience if you really can't make a new one.

github-actions[bot] commented 2 years ago

Labeled for closure due to inactivity in 10 days.

dewjam commented 2 years ago

As mentioned by @chrisnegus, we now have documentation for how to deploy Karpenter to Fargate included in our eksctl getting started guide. It's still in preview if you haven't seen it: https://karpenter.sh/preview/getting-started/getting-started-with-eksctl/

My preference would be to give the option to deploy on MNG or Fargate rather than recommend a specific method of deployment. It seems like there are valid reasons where MNG might be better than Fargate (and vice-versa), so allowing the user to choose seems like the best approach.
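
For readers unfamiliar with the Fargate option discussed in this thread, the following is a minimal sketch (not taken from the linked guide) of an eksctl ClusterConfig that schedules pods in one namespace onto Fargate. The function name, the cluster name `example-cluster`, the region, and the assumption that Karpenter is installed into a namespace called `karpenter` are all illustrative; check the getting started guide for the values it actually uses.

```python
import json


def fargate_cluster_config(cluster_name: str, region: str,
                           namespace: str = "karpenter") -> dict:
    """Build an eksctl ClusterConfig dict with one Fargate profile.

    The profile selects every pod in `namespace`, so pods created there
    (assumed here to be Karpenter's) run on Fargate rather than on a
    managed node group.
    """
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": cluster_name, "region": region},
        "fargateProfiles": [
            {
                # Hypothetical profile name; any unique name should work.
                "name": namespace,
                "selectors": [{"namespace": namespace}],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder cluster name and region, for illustration only.
    config = fargate_cluster_config("example-cluster", "us-west-2")
    print(json.dumps(config, indent=2))
```

The printed JSON can be converted to YAML and passed to `eksctl create cluster -f <file>`. Note that this only covers scheduling; the logging and metrics concerns raised earlier in the thread still need to be handled separately.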

DWSR commented 2 years ago

👍 Cool, I'll close the issue then!