kubernetes-sigs / cluster-api-provider-aws

Kubernetes Cluster API Provider AWS provides consistent deployment and day 2 operations of "self-managed" and EKS Kubernetes clusters on AWS.
http://cluster-api-aws.sigs.k8s.io/
Apache License 2.0

Reconciler tries to delete security groups in use during cluster deletion #4985

Open jfcavalcante opened 3 months ago

jfcavalcante commented 3 months ago

/kind bug

What steps did you take and what happened:

After deleting a newly provisioned cluster, I could see that the deletion process does not run smoothly. While the cluster is in the deleting state, CAPA tries to delete security groups that are still in use.

It looks like the reconciler does not filter out in-use security groups before trying to delete them, which results in the error below. This can be confusing for a new user of Cluster API.

E0518 20:50:20.738949       1 controller.go:329] "Reconciler error" err=<
    [error deleting security groups: [failed to delete security group "sg-092fbe4cd72832838" with name "capi-quickstart-apiserver-lb": DependencyViolation: resource sg-092fbe4cd72832838 has a dependent object
        status code: 400, request id: cd701b81-82c1-443b-8dd9-68202c055253, failed to delete security group "sg-0cc1b0c811e30c540" with name "capi-quickstart-controlplane": DependencyViolation: resource sg-0cc1b0c811e30c540 has a dependent object
        status code: 400, request id: ee4da697-4b02-47f8-9498-5f75cc952d66], error deleting network: failed to delete vpc "vpc-0bf1981f720b6c560": DependencyViolation: The vpc 'vpc-0bf1981f720b6c560' has dependencies and cannot be deleted.
        status code: 400, request id: b89ff4bd-718d-4d4e-928e-dd299e5b4b01]
 > controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="default/capi-quickstart" namespace="default" name="capi-quickstart" reconcileID="07010877-5547-4634-9c61-99386120deed"
I0518 20:50:20.739420       1 awscluster_controller.go:208] "Reconciling AWSCluster delete" controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="default/capi-quickstart" namespace="default" name="capi-quickstart" reconcileID="851dc5ae-6cc8-46ae-b436-0cce6c06bf57" cluster="default/capi-quickstart"

What did you expect to happen:

The controller should check whether the resources belonging to a cluster can be deleted before attempting to delete them.
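To illustrate the kind of pre-delete check meant here, a minimal Go sketch follows. It is not CAPA's actual code: the `SecurityGroup` and `NetworkInterface` types, the `partitionByUsage` function, and the IDs in `main` are all hypothetical, and the data is held in memory instead of being fetched from EC2. It also only considers attachments to network interfaces; a `DependencyViolation` can have other causes as well, such as rules in another security group that reference the group.

```go
package main

import "fmt"

// SecurityGroup is a hypothetical, simplified view of an EC2 security group.
type SecurityGroup struct {
	ID   string
	Name string
}

// NetworkInterface is a hypothetical, simplified view of a network interface
// and the IDs of the security groups attached to it.
type NetworkInterface struct {
	ID       string
	GroupIDs []string
}

// partitionByUsage splits groups into those that no network interface
// references (safe to try deleting now) and those that are still attached
// (better skipped and retried on a later reconcile).
func partitionByUsage(groups []SecurityGroup, enis []NetworkInterface) (deletable, inUse []SecurityGroup) {
	// Collect every security group ID that is attached to some interface.
	used := make(map[string]bool)
	for _, eni := range enis {
		for _, id := range eni.GroupIDs {
			used[id] = true
		}
	}
	// Keep the input order within each result slice.
	for _, g := range groups {
		if used[g.ID] {
			inUse = append(inUse, g)
		} else {
			deletable = append(deletable, g)
		}
	}
	return deletable, inUse
}

func main() {
	// Made-up IDs for illustration only.
	groups := []SecurityGroup{
		{ID: "sg-apiserver-lb", Name: "capi-quickstart-apiserver-lb"},
		{ID: "sg-controlplane", Name: "capi-quickstart-controlplane"},
		{ID: "sg-node", Name: "capi-quickstart-node"},
	}
	enis := []NetworkInterface{
		{ID: "eni-1", GroupIDs: []string{"sg-apiserver-lb"}},
		{ID: "eni-2", GroupIDs: []string{"sg-controlplane"}},
	}

	deletable, inUse := partitionByUsage(groups, enis)
	for _, g := range deletable {
		fmt.Printf("delete now: %s (%s)\n", g.ID, g.Name)
	}
	for _, g := range inUse {
		fmt.Printf("still in use, requeue: %s (%s)\n", g.ID, g.Name)
	}
}
```

With a check like this, the in-use groups could be logged at info level and requeued instead of surfacing as a "Reconciler error" on each attempt.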

Environment:

k8s-ci-robot commented 3 months ago

This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale