Open Mieszko96 opened 9 months ago
There is a cleanup process for leaked ENIs in the VPC CNI. How long is your cluster staying up in total after you scale down?
A few minutes, depending on how I write my Terraform :).
But I tested with a 20-minute wait, and it sometimes worked and sometimes didn't, so I'm a little confused.
@tzneal
17.36 - terraform destroy -target 'module.karpenter[0].helm_release.karpenter_provisioner'
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 10s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 20s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 30s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 40s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 50s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 1m0s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 1m10s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 1m20s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 1m30s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Still destroying... [id=karpenter-provisioner, 1m40s elapsed]
module.karpenter[0].helm_release.karpenter_provisioner: Destruction complete after 1m42s
17.38 - cluster was scaled down during this terraform destroy
17.40 - network interfaces stay:
aws-K8S-i-0645b6195e3cdf9a7 | – | Available
aws-K8S-i-0b8a85cb97a8a5209 | – | Available
17.44 - still 2 network interfaces
Also, in CloudTrail I see that on February 01, 2024, 17:37:25 there was an attempt to delete those network interfaces.
I assume there was an attempt to delete those network interfaces while they were still attached. Does Karpenter retry the deletion, or are my assumptions wrong?
17.50 - still 2 network interfaces, and I don't see anything in CloudTrail.
If my assumption is correct:
I assume there was an attempt to delete those network interfaces while they were still attached
Could a retry mechanism be introduced for deleting those network interfaces? Or could the delete run only once the network interface is in the Available status, since an interface that is still attached can't be deleted?
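To make the retry idea concrete, here is a minimal sketch of a sweep that only deletes interfaces already in the Available status and repeats the sweep a few times. This is not how Karpenter or the VPC CNI clean up ENIs. It assumes an object that behaves like a boto3 EC2 client, and it assumes the leaked ENIs carry a description starting with `aws-K8S-`, based on the names listed above. The function name, the number of passes, and the delay are all hypothetical, and pagination is ignored.

```python
import time


def sweep_available_enis(ec2, description_prefix="aws-K8S-", passes=5,
                         delay_seconds=30.0, sleep=time.sleep):
    """Delete detached ENIs over several passes and return the deleted IDs.

    `ec2` is assumed to behave like boto3.client("ec2"). Only ENIs in the
    'available' status are touched, so attached interfaces are skipped until
    a later pass finds them detached.
    """
    deleted = []
    for attempt in range(passes):
        if attempt > 0:
            # Give instance termination time to detach the interfaces.
            sleep(delay_seconds)
        response = ec2.describe_network_interfaces(
            Filters=[
                {"Name": "status", "Values": ["available"]},
                # Assumption: leaked ENIs are described as aws-K8S-<instance id>.
                {"Name": "description", "Values": [description_prefix + "*"]},
            ]
        )
        for eni in response["NetworkInterfaces"]:
            eni_id = eni["NetworkInterfaceId"]
            try:
                ec2.delete_network_interface(NetworkInterfaceId=eni_id)
                deleted.append(eni_id)
            except Exception as exc:
                # Leave the ENI for the next pass instead of aborting the sweep.
                print(f"pass {attempt + 1}: could not delete {eni_id}: {exc}")
    return deleted
```

With boto3 installed and credentials configured, this could be called as `sweep_available_enis(boto3.client("ec2"))` between scaling the cluster down and destroying the security groups. Every pass runs even when an earlier one found nothing, because the interfaces may still be attached at that point.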
We're working on improving this, but this is a known issue with very short-lived nodes/clusters. I'll leave this issue open to track fixing it.
@tzneal @engedaam Facing the same issue. Terraform is not able to delete the SG because it is still attached to an ENI, and this ENI was attached to the terminated Karpenter node.
@tzneal Any update on the improvement? We're facing the same issue, with ENIs left behind when a node is terminated.
Description
Observed Behavior:
network interface should be deleted after scale down
Versions:
Kubernetes Version (kubectl version): 1.27