kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

NLBs not deleted when the corresponding k8s services are deleted #3841

Open menghanl opened 1 month ago

menghanl commented 1 month ago

Describe the bug

Steps to reproduce The immediate action that resulted in this unexpected behavior was simply deleting the k8s service. (We use helm, in case that matters. We just deleted the service.yaml template and re-deployed with helm.)

One thing that may or may not be important: the NLB is pretty old (created "October 27, 2023, 12:07 (UTC-07:00)"), so we've probably upgraded the aws-load-balancer-controller several times since.

Expected outcome The AWS resources (NLB, target group, etc.) get deleted when the k8s service is deleted.

Environment

Additional Context:

As mentioned above, the leaked NLB is pretty old and was probably created by a different version of aws-load-balancer-controller. Not sure if that's related or not.

menghanl commented 1 month ago

This is hard to reproduce.

We have another cluster with a very similar setup (dev vs staging). I just tried the same thing in the other cluster, but this time the NLBs got successfully deleted.

The only difference is that, before I deployed to the second cluster, I updated the aws-load-balancer-controller deployment to turn on debug logging. This triggered a restart of the aws-load-balancer-controller pods.

zac-nixon commented 1 month ago

/kind bug

zac-nixon commented 1 month ago

Hello! Thanks for reaching out.

It is odd that your load balancer was leaked. Our current thinking is that the needed load balancer tags were removed (somehow), which caused this leak. As you mentioned, this seems to be hard to reproduce, but we can give it a try. Do you know which version of the load balancer controller was used to create the NLB initially? Having this data would make the reproduction easier.
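
The tag theory above could be checked on the leaked NLB with something like the following minimal Python sketch. It assumes the controller tracks Service-owned load balancers through the tag keys `elbv2.k8s.aws/cluster`, `service.k8s.aws/stack` (value `<namespace>/<service name>`) and `service.k8s.aws/resource` (value `LoadBalancer`); these keys and values are an assumption here and should be confirmed against the documentation for the controller version in use. The helper names `tags_to_dict` and `missing_tracking_tags` are hypothetical, and the input is assumed to be the list of `Key`/`Value` pairs returned for the NLB by `aws elbv2 describe-tags`.

```python
# Assumed tracking tag keys; verify against the controller docs for your version.
CLUSTER_TAG = "elbv2.k8s.aws/cluster"
STACK_TAG = "service.k8s.aws/stack"
RESOURCE_TAG = "service.k8s.aws/resource"


def tags_to_dict(tag_list):
    """Convert [{"Key": ..., "Value": ...}, ...] into a plain dict."""
    return {t["Key"]: t["Value"] for t in tag_list}


def missing_tracking_tags(tags, cluster_name, namespace, service_name):
    """Return the expected tracking tags that are absent or hold another value.

    An empty result means the NLB still carries every tag assumed here;
    a non-empty result lists what the controller might fail to match on.
    """
    expected = {
        CLUSTER_TAG: cluster_name,
        STACK_TAG: f"{namespace}/{service_name}",
        RESOURCE_TAG: "LoadBalancer",  # assumed value for the load balancer itself
    }
    return {key: value for key, value in expected.items() if tags.get(key) != value}


if __name__ == "__main__":
    # Illustrative input only; cluster and service names are made up.
    raw = [{"Key": CLUSTER_TAG, "Value": "dev"}]
    print(missing_tracking_tags(tags_to_dict(raw), "dev", "default", "my-svc"))
```

If the result is non-empty for the leaked NLB, that would be consistent with the missing-tags explanation, though it would not show how the tags were lost.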

menghanl commented 1 month ago

Sorry, I don't have the old load balancer controller version we used... But the leaked LB was created "October 27, 2023, 12:07 (UTC-07:00)", so I would assume we used the "latest" aws-load-balancer-controller as of that date.