eksctl-io / eksctl

The official CLI for Amazon EKS
https://eksctl.io

eksctl apply: cluster reconciliation #2774

Open michaelbeaumont opened 3 years ago

michaelbeaumont commented 3 years ago

Why do you want this feature? This is the umbrella issue concerning eksctl apply, which would support eksctl apply -f config.yaml to reconcile the current cluster with the given config (initially partially). Also sometimes known as eksctl update.

Discussion

Give us your ideas and use cases in the GitHub discussion!

Related issues:

#1497

#462 (previous umbrella issue, closed because the history is confusing)

#20

https://github.com/weaveworks/eksctl/issues/583 (discussion around storing metadata)

Callisto13 commented 3 years ago

🎉

This issue is going to track the delivery of a proposal for how we will implement gitops-style reconciliation with eksctl.

Deliverable is a proposal in docs/ which details:

  1. The end state of eksctl apply (i.e. ignoring everything else which currently exists, what would it look like if we could apply today?)
  2. What the rest of the eksctl UX looks like in a world with apply (i.e. which flags, subcommands, etc. would survive?)
  3. An implementation strategy: how we adapt the code to get from now to apply (design diagrams may be handy)
  4. A consideration of the risks and anything which may stop us from truly representing a cluster in config

Doing gitops reconciliation has been the goal of this project for a very long time, but the rollout was so gradual that the Plan was eventually forgotten as people moved off the project. We need to move incrementally, but let's aim not to be so incremental that we forget the big picture.

Please add comments on what you would like to see in a proposal doc for this work, which questions you would like answered, etc

Note: no need to start discussing ideas right here right now, save it for the proposal. This is just to sign off on what we want covered in a proposal doc

aclevername commented 3 years ago

Looks good to me. I think something that might be worth adding to the proposal is how to handle existing clusters that have been created from a config file, but then updated imperatively through commands like eksctl create nodegroup and eksctl upgrade cluster etc.

Callisto13 commented 3 years ago

excellent point!

artem-nefedov commented 3 years ago

Absolutely would love to see this. I would argue that this is the most important feature moving forward, because the main point of eksctl is that it simplifies the management of EKS clusters. However, we have already had to implement multiple workarounds and hacks just because the eksctl upgrade cluster -f config.yaml and eksctl upgrade nodegroup -f config.yaml commands don't honor changes made to the config manifest, which kinda undermines the whole simplification part, as those hacks would not be necessary if something like terraform was used instead.

michaelbeaumont commented 3 years ago

To clarify for 2.), I don't see the alteration of any existing commands as in scope for this issue. The goal would be to make them unnecessary when using apply of course.

Callisto13 commented 3 years ago

👍 totally. I didn't mean "alteration", more "termination" 😈

michaelbeaumont commented 3 years ago

I think 3) is sort of dependent on 1) being "finalized", right? That is, I don't think a proposal for the behavior of apply should need to answer implementation questions, although we should consider 4).

Callisto13 commented 3 years ago

🤔 Isn't explaining how we will do the thing one of the core parts of a design proposal? It's not like we can start work without it/a bunch of tickets

I mean, yes one does depend on the other, but they can be presented and discussed at the same time.

michaelbeaumont commented 3 years ago

I would have thought the first step/proposal is something users might engage with. It would answer the question "how does eksctl apply behave with this config?".

EDIT: it would also answer questions around incrementally introducing support/changing configs, etc., all of the user visible changes.

Callisto13 commented 3 years ago

A complete proposal should cover all stages, all questions, and all risks, but we could get away with not providing them all at once? The danger (not really danger, inconvenience) is that sometimes risks and questions discovered later can end up influencing the behaviour, so we would have to circle back

Either way I am not bothered if we take each step separately so long as all information is covered before we start work.

bitva77 commented 3 years ago

Just want to say that if this feature means I can stop gitops'ing eksctl with bash scripts, this would be such a win.

/going back into my hole now

nickjj commented 3 years ago

I'd really like to see this too.

Use case: I wanted to attach a new iam addon policy to an existing cluster. I ended up doing:

  1. Update my cluster config file with the new addon
  2. Create a new node group with a new name (arbitrarily renamed it by adding -abc to the end of the name)
  3. Delete the old node group

This worked but it was a pretty big ordeal that took over 8 minutes while waiting for AWS and also performing destructive actions. This all boils down to adding a policy to a group of nodes right? Would be great if you could run 1 command that applies the new policies to your existing nodes.

kahirokunn commented 3 years ago

Is there any progress?

aclevername commented 3 years ago

Hi @kahirokunn, not currently. We are still considering what the next big steps should be for eksctl.

steffakasid commented 2 years ago

This would really be the killer feature to be able to use eksctl e.g. in pipelines to roll-out clusters & cluster updates.

EliMor commented 2 years ago

Personal opinion, without this feature I'm not comfortable using eksctl in production. Very cool project though, I hope it keeps going.

bitva77 commented 2 years ago

Things have improved in this area. We use eksctl in Production just fine. I mean, how often are you changing EKS control plane configuration? Like never. Node groups are the only thing we change and I'm not sure I want node groups to be deleted automatically yet...especially in Production.

Skarlso commented 2 years ago

> Very cool project though, I hope it keeps going.

That's a bit passive-aggressive. :) It was created in 2018 and has been going strong for a while now, without any indication of stopping. :) And it just gets better and better.

While apply IS a killer feature, you can combine eksctl with flux easily to support this workflow by eksctl creating the necessary files and flux managing them. eksctl even sports a flux integration itself. Or you can use eksctl together with terraform to achieve a declarative description of your infrastructure.

We might eventually support apply, but there are many features that take precedence. Like supporting Karpenter out of the box, for example, allowing users to explore and use Karpenter in a friendly and easy way. :)

artem-nefedov commented 2 years ago

There are a lot of things that can be safely changed in-place, but you have to perform a separate command for each of them, which is extremely annoying and un-declarative.

These include (but are not limited to):

If eksctl apply could handle just those "safe" cases, that would already be wonderful.

Skarlso commented 2 years ago

And we graciously accept community contributions for these features and will happily help get the PR ready and merged! :)

rverma-dev commented 2 years ago

One thing I want to suggest is that eksctl apply doesn't need to be functionality within eksctl itself; even a GitHub Action that simulates reconciliation would be good enough.

All we need to do is put the right group of commands and hooks in place. For example, even if we don't have in-place node reconciliation, this can be done as follows:

  1. Mandate a unique node group name (may include version or date in the name)
  2. Create and remove orphan node groups

However, the most challenging part of the scripts would be recovering from errors and achieving idempotency. If someone already has these in bits and pieces, as a community we can convert them into a GitHub Action. Those who are new, like me, are really just looking for ideas on how to automate this. I liked @Skarlso's comment but have no idea how to implement something like that.
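
A minimal Python sketch of the planning step behind the two-step scheme above, assuming nodegroups are identified only by name. `plan_nodegroups` and `apply_plan` are hypothetical helpers, not part of eksctl, and the nodegroup names are made up for illustration. Because the plan is a pure diff of desired names against existing names, re-running it after a complete or partial run only yields the remaining work, which is one possible way to approach the idempotency concern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    create: tuple[str, ...]  # named in the config but absent from the cluster
    delete: tuple[str, ...]  # orphans: in the cluster but no longer in the config


def plan_nodegroups(desired: set[str], existing: set[str]) -> Plan:
    """Diff desired nodegroup names against the ones that currently exist."""
    return Plan(
        create=tuple(sorted(desired - existing)),
        delete=tuple(sorted(existing - desired)),
    )


def apply_plan(existing: set[str], plan: Plan) -> set[str]:
    """Simulate applying a plan: create new groups first, then drop orphans."""
    after_create = existing | set(plan.create)
    return after_create - set(plan.delete)


if __name__ == "__main__":
    desired = {"workers-v2"}   # hypothetical names taken from a config file
    existing = {"workers-v1"}  # hypothetical names reported by the cluster
    plan = plan_nodegroups(desired, existing)
    print(plan)
    # A second run against the resulting state has nothing left to do.
    print(plan_nodegroups(desired, apply_plan(existing, plan)))
```

Creating before deleting keeps the old nodegroup available until its replacement exists. A real script would still have to map each plan entry to the matching eksctl command and handle failures between the two phases.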

mhemken-vts commented 1 year ago

Hold on... there's no such thing as apply? Am I having a case of the Mandela Effect? I swear I've used eksctl apply before.

How was this not a thing from the beginning?

mhemken-vts commented 1 year ago

> We might eventually support apply, but there are many features that take precedence. Like supporting Karpenter out of the box, for example, allowing users to explore and use Karpenter in a friendly and easy way. :)

How does Karpenter have precedence over apply? Isn't the whole point of declarative infra that you can run idempotent applies repeatedly? I thought that was the whole point of this tool.

For those using this tool in production, what is the normal development cycle? Do you just create a new cluster when the old one's configuration inevitably becomes outdated?

bitva77 commented 1 year ago

I think you're confusing eksctl with terraform

matti commented 1 year ago

@mhemken-vts I'm creating a new cluster and new nodegroups when I need to change something. It's horrible, I know.

However, apply would be great only for cluster.

I honestly think it's better to create new nodegroups and then delete the old ones, because if your change is problematic, your old nodegroups are still working.