aws-quickstart / cdk-eks-blueprints

AWS Quick Start Team
Apache License 2.0
446 stars 198 forks source link

(cluster-providers): Updating node group fails #484

Open pflorek opened 2 years ago

pflorek commented 2 years ago

Describe the bug

When trying to update MngClusterProvider's NodeGroup i.e. changing the capacity type for different deployment targets it's fails:

Resource handler returned message: "NodeGroup already exists with name Cluster and cluster name Cluster

The reason is that CDK/CFN needs to execute an update replace of the NodeGroup. Doing so it tries first to create a new NodeGroup and then to delete the old one. By creating the new one CFN get's a naming conflict

Expected Behavior

Changes to the node group will just be executed and by default CDK/CFN is managing the resource ids and names.

Current Behavior

cdk-eks-blueprints is managing the nodegroup's id/name and forces the user to update/alternate the nodegroup's id/name

Reproduction Steps

Create a MngClusterProvider deploy, change capacity type, deploy -> failed stack

Possible Solution

Avoid setting the NodeGroupName and let CDK/CFN manage it by default

lib/cluster-providers/generic-cluster-provider.ts

                nodegroupName: nodeGroup.nodegroupName ?? nodeGroup.id,

Let the user decide on his own, if he wishes to override

Additional Information/Context

No response

CDK CLI Version

2.37.1

EKS Blueprints Version

1.2.0

Node.js Version

18.7.0

Environment details (OS name and version, etc.)

MacOS

Other information

No response

shapirov103 commented 2 years ago

Capacity type change and a few other changes to the nodegroup are not allowed in update mode, so the behavior you outlined is correct, albeit inconvenient and forces customers to supply an alternate nodegroup name, which is a workaround for this issue. The recommendation to not supply the name may be reasonable, although I recall the change to supply a meaningful nodegroup name was in response to a github issue as well. We will assess.

pflorek commented 2 years ago

I agree with the previous issue, the generated names aren't that nice. Also it's really nice to have constructs such as MngClusterProvider. For a quick start they should stay opionated. Maybe the GenericClusterProvider stays generic alone?

elamaran11 commented 11 months ago

@pflorek Please let us know if this issue can be closed?

mahdi-torabi commented 6 months ago

This issue is impacting us as well. Is it possible to have a consistent name generated based on node group configurations and have it changed when certain parameters change?

e.g At the moment when nodegroupName is not specified, the code defaults to eks-blueprints-mng. This causes a clash when certain changes trigger a replace operation!

An alternative would be to generate a hash from values of parameters that trigger a replacement (e.g diskSize, InstanceType etc) and append it to the automatically generated name i.e eks-blueprints-mng-23a2d43...

This way when any of those parameters change, we automatically generate a new name.