Closed by ialidzhikov 4 years ago
Hi @ialidzhikov ,
Thanks for this info. I think this is caused by this call: https://github.com/gardener/machine-controller-manager/blob/master/pkg/driver/driver_aws.go#L288. The call was introduced to catch a bug in the deletion of all volumes attached to machines by default. However, in this case, the disk may never have been attached properly. We will look into this.
We can continue the discussion of how best to fix this here. Also, I am not 100% sure how to reproduce this, but I think it should be reproducible on shoot deletions.
CC: @hardikdr @ggaurav10 @amshuman-kr
What happened: We hit a case in which an EC2 instance leaks and blocks the Shoot deletion (the instance blocks the deletion of both the subnet and the security group for the nodes).
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Expand the details to see all of the logs:
The critical part from the logs seems to be
The existing (orphan) instance has the ID i-080641c8fcb74ec59 and the Name tag {"Key": "Name", "Value": "shoot--it--tmdtw-w19-worker-1-z1-5b88647f77-wtmpj"}.
Anything else we need to know:
Environment: machine-controller-manager version: v0.27.0
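For anyone triaging a similar leak, the orphan check described above can be sketched roughly as follows. This is only an illustration, not code from machine-controller-manager: the `Instance` type, the `FindOrphans` helper, the `shoot--it--tmdtw-` prefix filter, and the second instance in the sample data are all hypothetical. It assumes an instance counts as orphaned when its Name tag belongs to the cluster but no machine object with that name exists any more.

```go
package main

import (
	"fmt"
	"strings"
)

// Instance is a hypothetical, simplified view of a cloud VM.
// It is not a type from machine-controller-manager or the AWS SDK.
type Instance struct {
	ID   string
	Name string // value of the "Name" tag
}

// FindOrphans returns the instances whose Name tag starts with clusterPrefix
// but is not present in knownMachines (assumed to hold the names of the
// machine objects that still exist in the cluster).
func FindOrphans(instances []Instance, clusterPrefix string, knownMachines map[string]bool) []Instance {
	var orphans []Instance
	for _, inst := range instances {
		if !strings.HasPrefix(inst.Name, clusterPrefix) {
			continue // instance belongs to a different cluster
		}
		if !knownMachines[inst.Name] {
			orphans = append(orphans, inst)
		}
	}
	return orphans
}

func main() {
	instances := []Instance{
		// The leaked instance reported in this issue.
		{ID: "i-080641c8fcb74ec59", Name: "shoot--it--tmdtw-w19-worker-1-z1-5b88647f77-wtmpj"},
		// Made-up instance that still has a machine object.
		{ID: "i-hypothetical", Name: "shoot--it--tmdtw-hypothetical-machine"},
	}
	known := map[string]bool{"shoot--it--tmdtw-hypothetical-machine": true}

	for _, o := range FindOrphans(instances, "shoot--it--tmdtw-", known) {
		fmt.Println(o.ID, o.Name)
	}
}
```

In a real setup the instance list would come from the cloud provider and the known names from the cluster's machine objects; both are hard-coded here to keep the sketch self-contained.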