kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

Unable to delete a cluster when infrastructureRef is defined incorrectly #1566

Closed: thebsdbox closed this issue 4 years ago

thebsdbox commented 5 years ago

What steps did you take and what happened: Creating a cluster definition with the incorrect infrastructureSpec results in a cluster resource that can't be deleted; the namespace it lives in exhibits the same behaviour.

Example Cluster:

apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: blank
  namespace: blank
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
    apiServerPort: 6433
  infrastructureRef:
    apiVersion: blank.cluster.k8s.io/v1alpha1
    kind: blankCluster
    name: blankTest
    namespace: blank

Applying this will create a new cluster resource in the namespace blank as expected:

k create namespace blank; k create -f ./blank.yaml

What did you expect to happen:

That deleting this erroneous cluster resource or its namespace would clean it up from the cluster. However, at this point it hangs indefinitely (even with force):

k get cluster -n blank
NAME    PHASE
blank   provisioning
k delete cluster blank -n blank
cluster.cluster.x-k8s.io "blank" deleted
<hang>

Anything else you would like to add:

As pointed out by @detiber, editing the resource and removing the infrastructureRef allows the cluster to be removed as expected:

k edit cluster blank -n blank
cluster.cluster.x-k8s.io/blank edited
k delete cluster blank -n blank
Error from server (NotFound): clusters.cluster.x-k8s.io "blank" not found

Environment:

/kind bug

detiber commented 5 years ago

/priority important-soon /milestone v0.3.0

prankul88 commented 5 years ago

@thebsdbox I am facing the same issue. Will work on it.

wfernandes commented 5 years ago

@prankul88 Are you still working on this issue? Feel free to /assign and /lifecycle active. Thanks!

prankul88 commented 5 years ago

@wfernandes Yes I am working on it.

/assign /lifecycle active

wfernandes commented 4 years ago

@prankul88 Thanks for reaching out. This is what I managed to find regarding this issue.

Result

I could reproduce your behavior in an environment with only CAPI components installed, that is, with no infrastructure provider components. The issue was not reproducible in an environment where the appropriate infrastructure provider components were installed as well. See the 2nd experiment below.

The cluster controller performs a "foreground deletion", that is, it tries to delete the cluster's dependents before deleting the cluster object itself. In this case, the dependents are objects with owner references and the infrastructureRef. However, we get the following error (focus on the no matches for kind "AWSCluster")

    E1119 21:50:06.916110       1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="failed to get infrastructure.cluster.x-k8s.io/v1alpha3/AWSCluster \"test-aws-not-here\" for Cluster foobar/test: no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\""  "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}

from line 260 in the snippet below https://github.com/kubernetes-sigs/cluster-api/blob/31f119ce49479505c6bf9201316e4f9ef6a0361d/controllers/cluster_controller.go#L255-L262 because the AWSCluster CRD does not exist in this environment. Because of this error, the cluster object deletion doesn't complete and the kubectl command hangs. However, in an environment where the AWS provider components are installed, it falls into the apierrors.IsNotFound(err) case, reconcileDelete returns nil, and the cluster object is successfully deleted.
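
To make that branching concrete, here is a minimal Go sketch of the lookup behavior described above (not the actual cluster_controller.go code); getInfraObject is a hypothetical helper built from standard apimachinery and controller-runtime calls:

package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// getInfraObject mirrors the external-object lookup done during reconcileDelete.
func getInfraObject(ctx context.Context, c client.Client, ref *corev1.ObjectReference) (*unstructured.Unstructured, error) {
	obj := &unstructured.Unstructured{}
	obj.SetGroupVersionKind(schema.FromAPIVersionAndKind(ref.APIVersion, ref.Kind))

	err := c.Get(ctx, client.ObjectKey{Namespace: ref.Namespace, Name: ref.Name}, obj)
	switch {
	case err == nil:
		return obj, nil // infra object still exists; deletion has to wait for it
	case apierrors.IsNotFound(err):
		return nil, nil // CRD installed but instance gone: deletion can proceed
	case meta.IsNoMatchError(err):
		// CRD not installed at all ("no matches for kind ..."): the error is
		// returned, so the Cluster finalizer is never removed and kubectl hangs.
		return nil, err
	default:
		return nil, err
	}
}

With only CAPI installed you hit the meta.IsNoMatchError branch, which is why the finalizer never gets removed; with the provider installed you hit the apierrors.IsNotFound branch instead.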

I believe the controller is behaving as expected. This issue is only reproducible when the environment is incorrectly set up.

@detiber or @ncdc can verify my findings. I'm happy to explain further if this doesn't make sense.

Experiments

1. Environment with ONLY CAPI components installed.

I could reproduce your behavior in an environment with only CAPI components installed, that is, with no infrastructure provider components.

Environment: cluster-api GIT REF: 31f119ce4

Steps to test:

  1. kind create cluster --name=onlycapi
  2. export KUBECONFIG="$(kind get kubeconfig-path --name="onlycapi")"
  3. make release-manifests
  4. kubectl apply -f out/cluster-api-components.yaml
  5. kubectl create ns foobar
  6. echo "apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    metadata:
     name: test
      namespace: foobar
    spec:
      clusterNetwork:
      pods:
          cidrBlocks:
          - 192.168.0.0/16
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSCluster
        name: test-aws-not-here
        namespace: foobar" | kubectl apply -f -

    Saw the following in the CAPI controller logs.

    E1119 21:46:49.476864       1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\""  "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}
  7. Run kubectl delete cluster test -n foobar. The above command hangs. Below is the log line output by the CAPI controller.
    E1119 21:50:06.916110       1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="failed to get infrastructure.cluster.x-k8s.io/v1alpha3/AWSCluster \"test-aws-not-here\" for Cluster foobar/test: no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\""  "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}

2. Environment with CAPI and CAPA components installed.

Environment: cluster-api-provider-aws git ref: 40a6a24f

Steps to Test

  1. Create a kind cluster
  2. kubectl create ns foobar
  3. Deploy CAPA provider components along with CAPI components
  4. Apply the cluster above and verify the object has been created. Notice there are errors in the CAPI and CAPA controllers because the infra ref is incorrect or does not exist.
  5. Run kubectl delete cluster test -n foobar. Here the command does NOT hang and the cluster is deleted as expected.
prankul88 commented 4 years ago

@wfernandes Hi, thank you for the inputs. I will try to give a clear description of what I have been doing to reproduce the situation.

I had set up an environment with both CAPI and CAPO components installed.

  1. kind create cluster --name=clusterapi-test
  2. kubectl create -f https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.2.7/cluster-api-components.yaml
  3. kubectl create -f https://github.com/kubernetes-sigs/cluster-api-provider-openstack/releases/download/v0.2.0/infrastructure-components.yaml

Case 1: Created cluster object with incorrect InfrastructureRef (errors in CAPI Controller due to that)

kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: blank

Now running k delete cluster capi-quickstart gets stuck in the "deleting" phase.

Case 2: Created cluster object (yaml below) - k apply -f cluster.yaml where OpenStackCluster.Spec is incorrect (errors in CAPO controller)

kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"] # CIDR block used by Calico.
    serviceDomain: "cluster.local"
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: OpenStackCluster
    name: capi-quickstart
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: OpenStackCluster
metadata:
  name: capi-quickstart
spec:
  cloudName: ${OPENSTACK_CLOUD}
  cloudsSecret:
    name: cloud-config
  nodeCidr: ${NODE_CIDR}
  externalNetworkId: ${OPENSTACK_EXTERNAL_NETWORK_ID}
  disablePortSecurity: true
  disableServerTags: true

  1. k get clusters
NAME              PHASE
capi-quickstart   provisioning 

Now when I try deleting the cluster, I expect it to first delete all the objects referenced by its ownerReferences and infrastructureRef (i.e. the OpenStackCluster object must be deleted) and then delete itself. But the command k delete cluster capi-quickstart gets stuck in the "deleting" phase.

wfernandes commented 4 years ago

@prankul88 Thanks for clarifying your reproduction steps. I believe I was able to reproduce the behavior with CAPI and CAPA provider components and have a better understanding of what's going on.

Experiment

  1. After setting up an environment with CAPI/CAPA components, I applied the yaml below to create your Case 2.
    apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    metadata:
      name: wff-test
      namespace: default
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 192.168.0.0/16
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSCluster
        name: wff-test
        namespace: default
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    metadata:
      name: wff-test
      namespace: default
    spec:
      region: us-east-2
      sshKeyName: SUPER-DUPER-BAD-SSH-KEY
  2. Now when we issue a DELETE to a Cluster object (kubectl delete cluster wff-test), the capi-controller-manager eventually issues a DELETE request to delete the Infrastructure reference. https://github.com/kubernetes-sigs/cluster-api/blob/08b29593e201a02a9805b8f839e5e22b789b0ff3/controllers/cluster_controller.go#L266-L270
  3. The Infrastructure provider controller such as capa-controller-manager gets that request and begins to reconcileDelete.
    • If the deletion is successful, the finalizer on the AWSCluster object is removed, which allows that object to be deleted from etcd. Once that happens, the capi-controller-manager will find that the InfraRef is gone and promptly remove the finalizer on the Cluster object. Then k8s removes the Cluster object from etcd, and at that point the kubectl command returns.
    • If the deletion is unsuccessful, the finalizer stays on the AWSCluster object, which keeps it in etcd. Because the AWSCluster object is still around, the Cluster object doesn't get cleaned up. The finalizer-removal behavior is the same for the OpenStackCluster controller (a minimal sketch of this pattern follows the list). https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/f993fb3706a8a01956442cd512cce9a979e1dc69/controllers/openstackcluster_controller.go#L273-L276
  4. kubectl sets the PropagationPolicy to Background by default. After it issues a DELETE request on the Cluster object, it does a GET on the Cluster object to verify whether it has been removed from etcd. Since the Cluster object remains in etcd, kubectl hangs in this case. kubectl delete cluster wff-test -v9 provides more info on this point.
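
For reference, a minimal sketch of the finalizer flow described in steps 2-4 above, using generic controller-runtime helpers rather than the actual CAPA/CAPO code; reconcileDelete, cleanup, and the finalizer name here are hypothetical:

package sketch

import (
	"context"
	"fmt"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// infraFinalizer is a made-up finalizer name; real providers define their own.
const infraFinalizer = "example.infrastructure.cluster.x-k8s.io"

// reconcileDelete sketches the provider-side deletion flow: external cleanup
// first, finalizer removal only on success.
func reconcileDelete(ctx context.Context, c client.Client, infraCluster client.Object, cleanup func(context.Context) error) (ctrl.Result, error) {
	// Tear down the external resources (networks, load balancers, ...).
	if err := cleanup(ctx); err != nil {
		// Cleanup failed: keep the finalizer so the infra object stays in etcd,
		// which in turn keeps the owning Cluster stuck in "deleting".
		return ctrl.Result{}, fmt.Errorf("failed to delete infrastructure: %w", err)
	}

	// Cleanup succeeded: drop the finalizer so the apiserver can remove the
	// object, unblocking the Cluster controller and eventually kubectl.
	controllerutil.RemoveFinalizer(infraCluster, infraFinalizer)
	return ctrl.Result{}, c.Update(ctx, infraCluster)
}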

Result

So the easiest way to get around this particular issue is to either:

  1. kubectl delete cluster wff-test --wait=false which does not wait for finalizers and returns immediately. However, the objects will still remain in etcd.
  2. Delete the finalizer from the AWSCluster object (kubectl edit awscluster wff-test) and manually verify the environment does not have any resources left around.
  3. Or you can edit the infrastructure ref on the Cluster object like @detiber suggested initially.

Ideally, there would be a way to propagate the error from the capa-controller-manager to the capi-controller-manager and on to kubectl. If there is an error in deleting the AWSCluster object, we could add FailureReason and FailureMessage fields on the AWSClusterStatus. Then in the capi-controller-manager we could check the AWSClusterStatus before issuing a DELETE to see if there is an error. This error could be tracked as an event on the Cluster object or as a metric if necessary.
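
A rough sketch of that idea, with hypothetical names throughout (failureReason/failureMessage on the infra object's status and surfaceInfraDeleteFailure are assumptions, not existing fields or functions):

package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

// surfaceInfraDeleteFailure reads hypothetical failureReason/failureMessage
// fields from the infra object's status and, if set, emits a warning Event on
// the owning Cluster before the next DELETE attempt.
func surfaceInfraDeleteFailure(recorder record.EventRecorder, cluster runtime.Object, infra *unstructured.Unstructured) error {
	reason, _, _ := unstructured.NestedString(infra.Object, "status", "failureReason")
	message, _, _ := unstructured.NestedString(infra.Object, "status", "failureMessage")
	if reason == "" && message == "" {
		return nil // nothing reported by the provider; carry on as today
	}

	// The Event shows up in `kubectl describe cluster <name>`, which is about
	// as close as we can get to "propagating the error to kubectl".
	recorder.Eventf(cluster, corev1.EventTypeWarning, "FailedDeleteInfrastructure",
		"infrastructure deletion failed: %s: %s", reason, message)
	return fmt.Errorf("infrastructure deletion failed: %s", reason)
}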

However, I'm not sure how to propagate the error to kubectl 🤷‍♂

wfernandes commented 4 years ago

@prankul88 As always, let me know if anything doesn't make sense and I'll be happy to explain some more. I suggest trying to get more input from the community to better understand what the expected behavior should be.

prankul88 commented 4 years ago

@wfernandes Really appreciate the details you have mentioned. Will wait for more inputs before moving forward.

thebsdbox commented 4 years ago

Hello,

I raised the issue mainly because it is certainly confusing behaviour for end-users who don't know where to look, or even that they will need to start manually editing various object specs. It's more of a UX issue if the end-user can't be notified that the delete operation is failing due to a misaligned reference, I suppose.

prankul88 commented 4 years ago

@thebsdbox I completely agree with you! The end-user must be informed of the possible failure.

ncdc commented 4 years ago

I think there are a few different scenarios here:

  1. Cluster infraRef points to a Kind (CRD) that does not exist
  2. Cluster infraRef points to a resource that does not exist (CRD is there, but specific instance is not)
  3. Cluster reconciler encounters some sort of other error (permissions, downtime, whatever) when trying to retrieve the infrastructure resource
  4. Cluster reconciler can't proceed with deletion because the infrastructure resource is "stuck" deleting

Does that sound accurate? Am I missing anything?

joonas commented 4 years ago

@prankul88 I just wanted to check-in to see if you were still planning on working on this and if there was anything you needed to move this forward?

ncdc commented 4 years ago

@prankul88 checking in again - any updates here?

prankul88 commented 4 years ago

@joonas @ncdc I am planning to take this forward. I need a little guidance on how to propagate the error to kubectl.

  1. Cluster infraRef points to a Kind (CRD) that does not exist

This does not create the cluster object. So no issues there.

ncdc commented 4 years ago

This does not create the cluster object. So no issues there.

I'm not sure what you mean. The original report states "Creating a cluster definition with the incorrect infrastructureSpec results in a cluster resource that can't be deleted...". Could you please clarify?

Need a little knowledge on how to propagate the error to kubectl

Let's take things one step at a time. Which scenario are you trying to address, and where is the error getting "lost"?

prankul88 commented 4 years ago

@ncdc I will try to explain the scenarios which you had mentioned above in your comment.

  1. Cluster infraRef points to a Kind (CRD) that does not exist

This is valid (sorry for the last comment, I confused it with something else). It is stuck in the "deleting" phase as expected: the delete would try to remove the infrastructure reference first and then the Cluster itself eventually, but since the infraRef points to a Kind that does not exist, it stays stuck in the "deleting" phase.

  1. Cluster infraRef points to a resource that does not exist (CRD is there, but specific instance is not)

This does not delete the InfraRef object with the command k delete cluster <name>, but I don't think this issue is targeting this scenario.

  1. Cluster reconciler encounters some sort of other error (permissions, downtime, whatever) when trying to retrieve the infrastructure resource
  2. Cluster reconciler can't proceed with deletion because the infrastructure resource is "stuck" deleting

These are more likely to be covered.

Please let me know if something is missing. Thanks

ncdc commented 4 years ago

Thanks. Could you pick one scenario and describe in detail what you'd like the user experience to be?

  1. kubectl delete ...
  2. ???
  3. etc
prankul88 commented 4 years ago

@ncdc Sure. Let's take case1 where Cluster infraRef points to a Kind (CRD) that does not exist

I apply file :

kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: blank
    name: capi-quickstar
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: DockerCluster
metadata:
  name: capi-quickstart

Currently the output of command k delete cluster capi-quickstart is something like

cluster.cluster.x-k8s.io "capi-quickstart" deleted

and does not actually complete. Rather, when I do k get clusters, the cluster is still in the "Deleting" phase.

NAME              PHASE
capi-quickstart   deleting

My expected behaviour after the objects are created is something like:

  1. k delete cluster capi-quickstart
  2. kubectl should surface why the cluster can't be deleted, e.g. Cannot delete object: no matches for kind 'blank' in infrastructureRef
  3. The command must exit in any case and not hang indefinitely.
ncdc commented 4 years ago

@prankul88 I do not believe we'll be able to implement what you've described. When you issue kubectl delete, it has a --wait flag that defaults to true. If a resource has finalizers, the apiserver sets the deletion timestamp on the resource, and the --wait=true flag causes kubectl to wait for the resource's finalizers to be removed and for the resource ultimately to be removed from etcd.

If there is still a finalizer on the resource, which is what happens in case 1, there is nothing kubectl can do in its current form to give you any additional information as to what is going on. If you ctrl-c the kubectl delete call, the resource still has its deletion timestamp set, and the apiserver is still waiting for all the finalizers to be removed. This is the standard behavior for all Kubernetes resources, both built-in types and custom resources, and there is no way to alter the behavior of either the apiserver or kubectl without making changes to Kubernetes.

I think it may be sufficient to modify ClusterReconciler.reconcileDelete() to have it skip over 404 not found errors here:

https://github.com/kubernetes-sigs/cluster-api/blob/065eb539766dede097e206a7b549b5902d15f14a/controllers/cluster_controller.go#L256
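
A sketch of the kind of tolerance being suggested (not an actual patch; ignoreMissingInfraRef is a hypothetical helper):

package sketch

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/meta"
)

// ignoreMissingInfraRef decides whether a failed infrastructureRef lookup
// should block Cluster deletion. Also treating meta.IsNoMatchError (missing
// CRD) as "already gone" is an assumption that goes beyond the 404 case
// mentioned above.
func ignoreMissingInfraRef(err error) error {
	if apierrors.IsNotFound(err) || meta.IsNoMatchError(err) {
		return nil // nothing left to delete; safe to remove the Cluster finalizer
	}
	return err
}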

vincepri commented 4 years ago

Bumping this from the milestone for now. I think we should open smaller issues targeting what @ncdc listed in https://github.com/kubernetes-sigs/cluster-api/issues/1566#issuecomment-568034735 and address them separately in v0.3.x

/milestone Next

vincepri commented 4 years ago

/priority awaiting-more-evidence

Are there any action items we can take from here, or is this good to close?

vincepri commented 4 years ago

Closing this due to inactivity, please feel free to reopen

/close

k8s-ci-robot commented 4 years ago

@vincepri: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/1566#issuecomment-616746088):

> Closing this due to inactivity, please feel free to reopen
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.