Closed by lukehoban 1 year ago
Another example that is likely related - from https://pulumi-community.slack.com/archives/C84L4E3N1/p1563557008032100:
Just trying to get the initial cluster set up, and made some silly mistakes (set subnets to public, not private). But trying to make changes to the cluster config is crazy. It tries to replace the cluster, but then gets stuck since it can't delete the resources for the now deleted cluster.
Deleting everything now fails with dial tcp: lookup xxx.gr7.us-east-1.eks.amazonaws.com: no such host
We've decided that this is too risky a change to take at this point in Q3. We will fix this ASAP post-1.0.
What's the recommended solution to get out of this weird state? I'm having similar issues as described in the Slack message; however, I can't see the responses due to the 10,000 message limit. Error log below:
     Type                        Name                     Status                Info
     pulumi:pulumi:Stack         xxx-xxx-xxx-service-dev  **failed**            1 error
 -   ├─ aws:ec2:SecurityGroup    xxx-xxx-dev              **deleting failed**   1 error
 -   └─ aws:lb:TargetGroup       xxx-xxx-dev              **deleting failed**   1 error
Diagnostics:
pulumi:pulumi:Stack (xxx-xxx-xxx-service-dev):
error: update failed
aws:lb:TargetGroup (xxx-xxx):
error: deleting urn:pulumi:dev::xxx-xxx-xxx-service::aws:lb:ApplicationLoadBalancer$awsx:lb:ApplicationTargetGroup$aws:lb/targetGroup:TargetGroup::xxx-targetdev: 1 error occurred:
* Error deleting Target Group: ResourceInUse: Target group 'arn:aws:elasticloadbalancing:eu-west-1:675965213304:targetgroup/xxx-targetdev-74e5679/2fa26820b86b102b' is currently in use by a listener or a rule
status code: 400, request id: 72e44b5c-f97a-4f80-9f04-0bece5688359
aws:ec2:SecurityGroup (xxx-cluster-dev):
error: deleting urn:pulumi:dev::xxx-xxx-xxx-service::awsx:x:ecs:Cluster$awsx:x:ec2:SecurityGroup$aws:ec2/securityGroup:SecurityGroup::xxx-cluster-dev: 1 error occurred:
* Error deleting security group: DependencyViolation: resource sg-07d619669ce3f4793 has a dependent object
status code: 400, request id: 47918f5f-1a1a-44be-9772-32a6e73167aa
This would be a great quality of life improvement! I've run into both of the problems Luke mentioned in the description.
Another member of the internal team hit this today.
Their first update performed the create side of a replacement of a LaunchConfiguration:
++ aws:ec2:LaunchConfiguration ecsClusterInstanceLaunchConfiguration create-replacement
That update then failed for a legitimate, unrelated reason.
The next update they did failed almost immediately with:
ecsClusterInstanceLaunchConfiguration (aws:ec2:LaunchConfiguration)
completing deletion from previous update
error: deleting urn:pulumi:kimberley::pulumi-service::aws:ec2/launchConfiguration:LaunchConfiguration::ecsClusterInstanceLaunchConfiguration: 1 error occurred:
* error deleting Autoscaling Launch Configuration (ecsClusterInstanceLaunchConfiguration-13f7e0f): ResourceInUse: Cannot delete launch configuration ecsClusterInstanceLaunchConfiguration-13f7e0f because it is attached to AutoScalingGroup autoScalingGroupStack-4a63cb8-Instances-L4JB1QE2ZJ6J
status code: 400, request id: 88a8a416-5fd0-48a9-9d5d-52358c77e2df
Got exactly this error too. I merely changed some VPC settings, and then it decided it was time to delete the target group; now it can't get out of the "completing deletion from previous update..." state.
Any suggested workarounds for this issue?
Same here... I tried to manually remove the resource from the stack, to no avail. Now my stack has two identical resources...
Do you want to perform this update? yes
Updating (CLIENT/ENV)
View Live: https://app.pulumi.com/CLIENT/STACK/ENV/updates/NN
     Type                 Name           Status      Info
     pulumi:pulumi:Stack  RESOURCE_NAME  **failed**  1 error
Diagnostics:
gcp:projects:IAMMember (BINDING_NAME):
error: unable to find required configuration setting: GCP Project
Set the GCP Project by using:
pulumi config set gcp:project <project>
Resources:
Duration: 2s
OK, found a workaround. It's not pretty, but it does the job:
1. pulumi stack export -s STACK > stack.json
2. cp stack.json stack.json.origin (keep a backup in case the edit goes wrong)
3. Modify your stack.json (vim or whatever editor you choose). Just make sure to remove the source conflict (remove your conflicting resources and manually clean up your infrastructure).
4. pulumi stack select STACK
5. pulumi stack import < stack.json
6. pulumi up
@ralvarez-globant You say:
3. Modify your stack.json (vim or whatever editor you choose). Just make sure to remove the source conflict ( remove your conflicting resources and manually clean up your infrastructure)
Did you just remove the problematic resource itself? I would imagine that you also need to remove any other resources that reference it as a dependency.
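For those editing the exported state by hand, here is a minimal sketch of that cleanup in Python. It assumes the JSON layout produced by `pulumi stack export` (a `deployment.resources` array whose entries may carry `urn`, `dependencies`, `parent`, and `propertyDependencies` fields); the URNs below are invented for illustration:

```python
import json  # for the real file: checkpoint = json.load(open("stack.json"))

def remove_resource(checkpoint: dict, bad_urn: str) -> dict:
    """Drop one resource from an exported checkpoint and scrub every
    reference to it from the remaining resources' dependency fields."""
    resources = checkpoint["deployment"]["resources"]
    kept = [r for r in resources if r.get("urn") != bad_urn]
    for r in kept:
        # Remove the deleted URN from plain dependency lists.
        if "dependencies" in r:
            r["dependencies"] = [d for d in r["dependencies"] if d != bad_urn]
        # Drop a parent pointer at the deleted resource.
        if r.get("parent") == bad_urn:
            del r["parent"]
        # propertyDependencies maps property names to lists of URNs.
        for prop, deps in r.get("propertyDependencies", {}).items():
            r["propertyDependencies"][prop] = [d for d in deps if d != bad_urn]
    checkpoint["deployment"]["resources"] = kept
    return checkpoint

# Tiny illustrative checkpoint (structure only, not a real stack):
ckpt = {"deployment": {"resources": [
    {"urn": "urn:pulumi:dev::proj::aws:ec2/vpc:Vpc::old-vpc"},
    {"urn": "urn:pulumi:dev::proj::aws:ec2/instance:Instance::web",
     "dependencies": ["urn:pulumi:dev::proj::aws:ec2/vpc:Vpc::old-vpc"]},
]}}
cleaned = remove_resource(ckpt, "urn:pulumi:dev::proj::aws:ec2/vpc:Vpc::old-vpc")
```

After writing the result back out, re-import it with `pulumi stack import < stack.json` as in the workaround steps; remember to delete the orphaned cloud resources manually.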
Today, we process any pendingDeletes at the beginning of a deployment. This is not "correct". Two examples:
First, a program with a VPC and an Instance. A change causes the VPC to be replaced, and the Instance fails to create. This leads to a newly created VPC, and a pending delete VPC. On the next update, we try to flush the pending deletes, meaning trying to delete the old VPC. This fails, because the Instance is still running in the old VPC. It is only "correct" to delete the old VPC at the end of the deployment after all other repercussions of the replacement have been made.
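The ordering problem in this first example can be sketched with a toy model (illustrative names only, not engine code): deleting the old VPC fails while the Instance still lives in it, and succeeds only once the replacement has removed that dependent.

```python
def try_delete(resource: str, live_dependents: dict) -> str:
    # Cloud providers reject deletes while dependents exist
    # (e.g. EC2's DependencyViolation error).
    if live_dependents.get(resource):
        raise RuntimeError(f"DependencyViolation: {resource} still has dependents")
    return f"deleted {resource}"

# The instance is still running in the old, pending-delete VPC.
live = {"old-vpc": ["old-instance"]}

# Flushing pending deletes at the START of the deployment (today's behavior):
try:
    try_delete("old-vpc", live)
except RuntimeError as e:
    start_result = str(e)  # fails: the instance still depends on the old VPC

# Processing the replacement first, THEN the pending delete (proposed behavior):
live["old-vpc"].remove("old-instance")  # instance replaced into the new VPC
end_result = try_delete("old-vpc", live)  # now succeeds
```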
Second, a Kubernetes Provider and a Kubernetes Resource. A change causes the Kubernetes Provider to be replaced, but the Kubernetes Resource fails to create. This leads to a newly created Provider in the checkpoint, and a pending delete Provider in the checkpoint. On the next update, we successfully delete the pending delete Provider from the checkpoint. However, now all of the references in the checkpoint have provider references to a provider which does not exist. When we try to process the recreate of the Kubernetes Resource, it fails with a message like:
To be correct, I believe we will need to postpone pending deletes to the end of the deployment.
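For anyone stuck in the state the second example leaves behind, a rough check for dangling provider references can be run against exported state. This sketch assumes provider references in the checkpoint take the form `<provider URN>::<id>` and that resources carry them in a `provider` field; the URNs are invented:

```python
def dangling_provider_refs(resources: list) -> list:
    """Return URNs of resources whose `provider` field points at a
    provider URN that no longer appears in the checkpoint."""
    urns = {r["urn"] for r in resources}
    dangling = []
    for r in resources:
        ref = r.get("provider")
        if ref:
            # Strip the trailing "::<id>" suffix to recover the provider URN.
            provider_urn = ref.rsplit("::", 1)[0]
            if provider_urn not in urns:
                dangling.append(r["urn"])
    return dangling

# Illustrative checkpoint fragment: the Namespace references a provider
# ("old-k8s") that was already deleted from the checkpoint.
resources = [
    {"urn": "urn:pulumi:dev::proj::pulumi:providers:kubernetes::k8s"},
    {"urn": "urn:pulumi:dev::proj::kubernetes:core/v1:Namespace::ns",
     "provider": "urn:pulumi:dev::proj::pulumi:providers:kubernetes::old-k8s::abc123"},
]
dangling = dangling_provider_refs(resources)
```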