pulumi / pulumi-eks

A Pulumi component for easily creating and managing an Amazon EKS Cluster
https://www.pulumi.com/registry/packages/eks/
Apache License 2.0

Changing the instanceType in an existing NodeGroup results in a 400 failure to delete the LaunchConfig since it is attached to an active ASG #178

Open metral opened 5 years ago

metral commented 5 years ago

When a NodeGroup is stood up with a given instance type, e.g. t2.medium, and is then changed on a later update to, say, t3.large, the update results in the following error:

Diagnostics:
  aws:ec2:LaunchConfiguration (update-existing-nodegroup-ng-2-ondemand-large-nodeLaunchConfiguration):
    error: Plan apply failed: deleting urn:pulumi:dev1::update-existing-nodegroup::eks:index:NodeGroup$aws:ec2/launchConfiguration:LaunchConfiguration::update-existing-nodegroup-ng-2-ondemand-large-nodeLaunchConfiguration: 
    error deleting Autoscaling Launch Configuration (update-existing-nodegroup-ng-2-ondemand-large-nodeLaunchConfiguration-d0932eb):
    ResourceInUse: Cannot delete launch configuration update-existing-nodegroup-ng-2-ondemand-large-nodeLaunchConfiguration-d0932eb because it is attached to AutoScalingGroup update-existing-nodegroup-ng-2-ondemand-large-6410fe15-NodeGroup-1DIVWWS4FCMIU
    status code: 400, request id: f7bfd557-9505-11e9-b696-8ff9971bc5b3
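
For reference, a minimal TypeScript sketch of the kind of change that triggers this; the cluster and node group names here are illustrative, IAM and VPC wiring is omitted, and the exact required options depend on the @pulumi/eks version:

```typescript
import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("update-existing-nodegroup", {
    skipDefaultNodeGroup: true,
});

// Initially deployed with t2.medium. Changing this to e.g. "t3.large" on a
// later `pulumi up` forces replacement of the generated LaunchConfiguration,
// which is when the ResourceInUse error above shows up.
const nodeGroup = cluster.createNodeGroup("ng-2-ondemand-large", {
    instanceType: "t2.medium",
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 3,
});
```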

Manual cleanup of the LaunchConfig in the state snapshot and in AWS seems to be the only mitigation I've found.

cc @jen20 @lukehoban

lukehoban commented 5 years ago

This is a little surprising. Pulumi does create-before-delete by default, so a new launch configuration should have been created and the Auto Scaling group updated to use it prior to attempting to delete the previous launch configuration.

Could you share a full output of an update that attempts to make this change?
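
For context, the ordering described here is Pulumi's default create-before-delete replacement behavior, which can be overridden per resource. A hedged sketch of the relevant resource option, using a standalone launch configuration for illustration rather than the one eks.NodeGroup generates internally:

```typescript
import * as aws from "@pulumi/aws";

// By default, a change that forces replacement creates the new
// LaunchConfiguration first, updates dependents to point at it, and only
// then deletes the old one. Setting deleteBeforeReplace: true would invert
// that order for this resource; false (the default) is shown for clarity.
const launchConfig = new aws.ec2.LaunchConfiguration("node-lc", {
    imageId: "ami-0123456789abcdef0", // illustrative AMI ID
    instanceType: "t3.large",
}, { deleteBeforeReplace: false });
```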

metral commented 5 years ago

Per https://github.com/terraform-providers/terraform-provider-aws/issues/8485#issuecomment-507299533, this was fixed in https://github.com/terraform-providers/terraform-provider-aws/pull/7819 and is available in 2.1.0 of tf-aws. We're currently on 2.12.0 but still seem to be hitting this bug.

/cc @jen20 @stack72

lukehoban commented 5 years ago

@metral Does this reliably repro?

metral commented 5 years ago

I have not been able to repro this. It could have been due to a rabbit hole I was in the middle of. Closing this out for now, and I'll re-open if necessary.

ljani commented 4 years ago

I'm still seeing this.

Terraform seems to have some problems as well: https://github.com/terraform-providers/terraform-provider-aws/issues/8485

EDIT: I'm facing this issue because I changed some VPC configs, which forced Pulumi to recreate the EKS cluster.

ljani commented 4 years ago

I think I found a way to reproduce this:

zebulonj commented 3 years ago

For what it's worth, I encounter this any time internal changes in `new eks.Cluster(...)` cause Pulumi to attempt to change the launch configuration.

bsod90 commented 3 years ago

It seems like it's still an issue for me as well... Just like @zebulonj, I'm using eks.Cluster without much extra configuration. Eventually something among the objects it creates produces a diff (an AMI ID in my case), and then it starts to fail with

... ResourceInUse: Cannot delete launch configuration cubeapp-eu-central-1-2-primary-ng-nodeLaunchConfiguration-8e54547 because it is attached to AutoScalingGroup cubeapp-eu-central-1-2-primary-ng-55bb153a-NodeGroup-1I26U3PIGT5T0 ...

What would be the best manual workaround for this? Thanks!

oliveratprimer commented 3 years ago

Seeing this every time I try to make a change to my EKS cluster.

teddyknox commented 2 years ago

Hi, any update on this? I'm having the same issue.

viveklak commented 2 years ago

Reopened the issue and added to triage queue for next iteration.

tma-unwire commented 2 years ago

I'm considering rewriting the pulumi_eks stuff to plain pulumi_aws to work around this. Then I would have finer control over when the launch configuration needs to be recreated, which is just about never, as we use SpotInst for all that.
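
A rough sketch of that approach with the plain @pulumi/aws SDK in TypeScript; the AMI, instance profile, and subnet IDs are placeholders, and this is not a drop-in replacement for eks.NodeGroup (user data, security groups, and cluster-membership tags are omitted):

```typescript
import * as aws from "@pulumi/aws";

// Managing the launch configuration and ASG directly instead of through
// eks.NodeGroup gives finer control over when either gets replaced.
const launchConfig = new aws.ec2.LaunchConfiguration("node-lc", {
    imageId: "ami-0123456789abcdef0",            // placeholder EKS worker AMI
    instanceType: "t3.large",
    iamInstanceProfile: "node-instance-profile", // placeholder
});

const nodeAsg = new aws.autoscaling.Group("node-asg", {
    launchConfiguration: launchConfig.name,
    minSize: 1,
    maxSize: 3,
    desiredCapacity: 2,
    vpcZoneIdentifiers: ["subnet-aaa", "subnet-bbb"], // placeholder subnet IDs
});
```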

ekahaa-support commented 2 years ago

We have had this issue as well. Any updates please?

KrisJohnstone commented 2 years ago

Also having this issue :(

tma-unwire commented 2 years ago

It has been a long time... any news? Is this on the backlog?

tma-unwire commented 2 years ago

Are there any recommended work-arounds?

When this happens, I usually go to the parent Auto Scaling group in the AWS console and change which launch configuration it points to there, then re-run the Pulumi job with a refresh...

Not so good, but the best I have seen so far.
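
The same console steps can be scripted with the AWS CLI; a sketch with placeholder names (take the new launch configuration name from the failed `pulumi up` output):

```sh
# Point the ASG at the newly created launch configuration so the old one is
# no longer in use and Pulumi can delete it.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <nodegroup-asg-name> \
  --launch-configuration-name <new-launch-configuration-name>

# Reconcile Pulumi's view of the world, then retry the update.
pulumi refresh
pulumi up
```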

benjamin658 commented 2 years ago

Same issue here; I want to adjust the nodeSubnetIds.

benjamin658 commented 2 years ago

I have confirmed that the workaround @tma-unwire provided works, thanks also to support from the Pulumi tech team.

When you encounter the error, here are the steps:

  1. Log in to the AWS console; you should see that the new launch configuration has already been created.
  2. Edit the auto scaling group and associate it with the new launch config.
  3. Go back to Pulumi and run `pulumi refresh`.
  4. Run `pulumi up` again; you should be rid of the error.

stack72 commented 2 years ago

@roothorp please can we try and recreate this issue so that we can isolate what we will need to fix here :)

LucasJC commented 1 year ago

> I have confirmed that the workaround @tma-unwire provided works, thanks also to support from the Pulumi tech team.
>
> When you encounter the error, here are the steps:
>
> 1. Log in to the AWS console; you should see that the new launch configuration has already been created.
> 2. Edit the auto scaling group and associate it with the new launch config.
> 3. Go back to Pulumi and run `pulumi refresh`.
> 4. Run `pulumi up` again; you should be rid of the error.

I had this same issue and wanted to leave an alternative in case the above is not working: if you don't find the new launch config already created, you can create a temporary one and attach it to the ASG. This will let Pulumi delete the old launch config on a `pulumi up`.

Afterwards, don't forget to delete the temporary launch config.
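
A hedged CLI sketch of that alternative; all names, the AMI, and the instance type are placeholders:

```sh
# Create a throwaway launch configuration and attach it to the ASG so the
# old, Pulumi-managed one is no longer in use.
aws autoscaling create-launch-configuration \
  --launch-configuration-name temp-lc \
  --image-id ami-0123456789abcdef0 \
  --instance-type t3.medium

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <nodegroup-asg-name> \
  --launch-configuration-name temp-lc

# Let Pulumi delete the old launch configuration and create its replacement.
pulumi refresh && pulumi up

# Once the ASG points at the Pulumi-managed launch configuration again,
# clean up the temporary one.
aws autoscaling delete-launch-configuration --launch-configuration-name temp-lc
```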

kovaxur commented 1 year ago

We have the same issue: Pulumi creates the new LaunchConfiguration, then tries to delete the old one before replacing it with the new one, so it fails.

stepan-romankov-fi commented 1 year ago

Same issue here. Any plans to fix it?

drawnwren commented 7 months ago

This is still failing.

ceelian commented 3 months ago

This also hit me, it's still an issue.