aws-samples / eks-cluster-upgrade

Automated Amazon EKS cluster upgrade
MIT No Attribution

Bug: Error occurred while upgrading Node Group in EC2 instance created through Karpenter #125

Closed namejsjeongkr closed 9 months ago

namejsjeongkr commented 1 year ago

Expected Behaviour

I tried to upgrade from 1.25 to 1.26 with the eksupgrade CLI.

Unfortunately, I got an error at the node group check stage. I created the EKS cluster via Terraform, and there were no issues with that.

The instance where the error occurred was created through Karpenter.

As far as I know, Karpenter doesn't create nodes via an ASG. Do I need an ASG name to use the eksupgrade CLI?

Please let me know the best way to use the eksupgrade CLI to upgrade an EKS cluster that uses Karpenter.

Current Behaviour

[Screenshot 2023-05-31 17 47 42]

Code snippet

eksupgrade --force <CLUSTER_NAME> 1.26 ap-northeast-2

Possible Solution

No response

Steps to Reproduce

eksupgrade --force <CLUSTER_NAME> 1.26 ap-northeast-2

Amazon EKS upgrade version

1.26

Python runtime version

3.11

Packaging format used

PyPI

Debugging logs

i-0f8bafc14f3d4bc9c cannot be upgraded because the cluster version is not compatible with the node version
Error occurred while checking node group details - Error: cannot access local variable 'autoscale_group_name' where it is not associated with a value

Post flight unsuccessful because of the following errors: ["Error occurred while checking node group details cannot access local variable 'autoscale_group_name' where it is not associated with a value"]
Post flight check for cluster <CLUSTER_NAME> failed after it upgraded

github-actions[bot] commented 1 year ago

This issue has not received any attention in 30 days. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

namejsjeongkr commented 1 year ago

I'm looking forward to your reply

bryantbiggs commented 1 year ago

Where do the Karpenter controller pods run in your cluster? Are they deployed on EKS Fargate?

namejsjeongkr commented 1 year ago

Thanks for replying to my case. No, I deployed them on EC2 instances in the EKS cluster.

bryantbiggs commented 1 year ago

Ok, and how did that instance get created?

namejsjeongkr commented 1 year ago

@bryantbiggs When I created the EKS cluster, I provisioned two managed node groups. Any nodes created after that are provisioned through Karpenter.

So when I tried to upgrade the EKS cluster, it had a mix of instances created through managed node groups and through Karpenter.
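
One way to tell the two kinds of instances apart is by their EC2 tags. The sketch below is illustrative only and is not part of eksupgrade: `classify_instance` is a hypothetical helper, and it assumes that instances launched by an ASG carry the `aws:autoscaling:groupName` tag while Karpenter-launched instances carry a `karpenter.sh/provisioner-name` or `karpenter.sh/nodepool` tag, depending on the Karpenter version.

```python
# Tag that EC2 Auto Scaling adds to instances it launches.
ASG_TAG = "aws:autoscaling:groupName"
# Assumption: tags applied by Karpenter (older and newer API versions).
KARPENTER_TAGS = ("karpenter.sh/provisioner-name", "karpenter.sh/nodepool")


def classify_instance(tags: dict[str, str]) -> str:
    """Classify an instance as 'asg', 'karpenter', or 'unknown' from its tags."""
    if ASG_TAG in tags:
        return "asg"
    if any(key in tags for key in KARPENTER_TAGS):
        return "karpenter"
    return "unknown"
```

In a mixed cluster like this one, an upgrade tool could use such a check to route each instance to the appropriate upgrade path instead of assuming every node has a parent ASG.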

bryantbiggs commented 1 year ago

ah ok - now I think I (might) know the issue. eksupgrade is looking for the parent ASG of each instance but it won't find any for Karpenter nodes, got it!
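
A minimal sketch of that failure mode and one possible guard, assuming the lookup goes through the Auto Scaling `describe_auto_scaling_instances` call. This is not eksupgrade's actual code: `find_asg_name` and `FakeAutoScalingClient` are hypothetical names used for illustration. Binding `autoscale_group_name` to `None` up front means an instance with no parent ASG yields `None` instead of the "cannot access local variable" error from the logs.

```python
from typing import Any, Optional


def find_asg_name(autoscaling_client: Any, instance_id: str) -> Optional[str]:
    """Return the parent ASG name for an instance, or None if it has none."""
    # Bind the variable first: if the loop below finds nothing (as for a
    # Karpenter node), the name is still defined and the caller gets None.
    autoscale_group_name: Optional[str] = None
    response = autoscaling_client.describe_auto_scaling_instances(
        InstanceIds=[instance_id]
    )
    for instance in response.get("AutoScalingInstances", []):
        if instance["InstanceId"] == instance_id:
            autoscale_group_name = instance["AutoScalingGroupName"]
    return autoscale_group_name


class FakeAutoScalingClient:
    """Hypothetical stand-in for boto3.client("autoscaling"), for illustration."""

    def __init__(self, membership: dict[str, str]) -> None:
        # Maps instance ID to ASG name for instances that belong to an ASG.
        self._membership = membership

    def describe_auto_scaling_instances(self, InstanceIds: list[str]) -> dict:
        # Instances outside any ASG are simply absent from the response.
        return {
            "AutoScalingInstances": [
                {"InstanceId": i, "AutoScalingGroupName": self._membership[i]}
                for i in InstanceIds
                if i in self._membership
            ]
        }
```

With a real client, `boto3.client("autoscaling", region_name="ap-northeast-2")` would take the place of the fake, and callers could skip or separately handle any instance for which the function returns `None`.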

namejsjeongkr commented 1 year ago

> ah ok - now I think I (might) know the issue. eksupgrade is looking for the parent ASG of each instance but it won't find any for Karpenter nodes, got it!

I'm curious about this. As far as I know, Karpenter is developed by AWS, so why wasn't this case considered? Karpenter doesn't use an ASG (Auto Scaling group) to provision EC2 instances.

bryantbiggs commented 1 year ago

This project was developed by previous colleagues long before Karpenter existed, so I suspect that is a large factor.

namejsjeongkr commented 1 year ago

@bryantbiggs, I appreciate your reply. I understand what you're saying, and I'm looking forward to a feature that upgrades Karpenter nodes smoothly.

github-actions[bot] commented 11 months ago

This issue has not received any attention in 30 days. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

namejsjeongkr commented 11 months ago

@bryantbiggs, is there any update on this? If you're too busy to develop this feature, I could help!

github-actions[bot] commented 10 months ago

This issue has not received any attention in 30 days. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

namejsjeongkr commented 10 months ago

I'll develop the feature and open a pull request.

github-actions[bot] commented 9 months ago

This issue has not received any attention in 30 days. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.