kubernetes-sigs / cloud-provider-azure

Cloud provider for Azure
https://cloud-provider-azure.sigs.k8s.io/
Apache License 2.0

K8S delete of service loadbalancer fails to clean up and exits prematurely #6505

Closed chapmanc closed 2 weeks ago

chapmanc commented 4 months ago

What happened:

K8S service deletion proceeds without cleaning up the load balancer.

What you expected to happen:

The service deletion should clean up all dependent resources. Instead, a warning appears in the events saying the load balancer failed to delete, after which the controller gives up and deletes the service anyway. The etag in the request doesn't match the resource's current etag, but the controller never refreshes it and retries. As a result the load balancer is orphaned and left behind.

error syncing load balancer: failed to delete load balancer: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 412, RawError: {
    "error": {
        "code": "PreconditionFailed",
        "message": "Precondition failed.",
        "details": [
            {
                "code": "PreconditionFailedEtagMismatch",
                "message": "Etag provided in if-match header W/\"9ec66ae2-c704-4c45-bf93-6fb73345dfb1\" does not match etag W/\"c9dd49fc-1247-46dd-8cff-14af6bd1b343\" of resource /subscriptions/<redacted>/resourceGroups/MC_<redacted>/providers/Microsoft.Network/loadBalancers/kubernetes-internal in NRP data store."
            }
        ]
    }
}

How to reproduce it (as minimally and precisely as possible):

  1. Created k8s service loadbalancer
  2. Added some backend pools manually
  3. Removed backend pools manually
  4. Triggered a delete of the service in k8s
  5. Service is deleted but the load balancer still exists

Anything else we need to know?:

We have a controller that adds some configuration to the load balancer and removes it again before the service is deleted. We have verified that all of its modifications are removed, and that the controller is no longer running, by the time the delete happens.

Environment:

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  - After 90d of inactivity, lifecycle/stale is applied
  - After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  - After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  - Mark this issue as fresh with /remove-lifecycle stale
  - Close this issue with /close
  - Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 weeks ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  - After 90d of inactivity, lifecycle/stale is applied
  - After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  - After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  - Mark this issue as fresh with /remove-lifecycle rotten
  - Close this issue with /close
  - Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

nilo19 commented 2 weeks ago

@chapmanc This is expected, because you manually changed the load balancer. Managed resources, including the load balancer, should not be modified outside of the cloud provider.