EnMasseProject / enmasse

EnMasse - Self-service messaging on Kubernetes and OpenShift
https://enmasseproject.github.io
Apache License 2.0

Resources not deleted uninstalling enmasse in ocp4 using OLM #2863

Open famarting opened 5 years ago

famarting commented 5 years ago

In an OpenShift cluster with EnMasse installed using OLM, the uninstall process does not remove all resources created by EnMasse.

In the testing environment, the console, authentication services, address spaces and related secrets are not deleted when uninstalling EnMasse.

In an OLM installation the installation is represented by the clusterserviceversion resource so I uninstall enmasse using this command: oc delete csv enmasse.0.28.0

If needed, contact me for access to an OpenShift 4 cluster to check the issue.

k-wall commented 5 years ago

Testing myself on OCP4 with EnMasse 0.28.2:

With EnMasse 0.28.2 deployed using OLM, and with an address space, an address (queue), and an IoTConfig applied, I see the following resources:

Before:

[keith@rhfed-localdomain enmasse_downstream]$ oc get pods
NAME                                       READY     STATUS    RESTARTS   AGE
address-space-controller-898878fd8-lzxff   1/1       Running   0          7m33s
admin.mweea7iu3u-5b66ddd86b-94g87          2/2       Running   0          3m18s
api-server-7cb544dcb6-rmwjz                1/1       Running   0          7m30s
broker-mweea7iu3u-1zk6-0                   1/1       Running   0          2m58s
console-645888d655-2kv7x                   2/2       Running   0          3m11s
enmasse-operator-7bd88b46c-j79ps           1/1       Running   0          7m33s
iot-auth-service-59649d6748-f9qdc          1/1       Running   0          60s
iot-device-registry-dff9978d-44fv7         1/1       Running   0          60s
iot-gc-658ff6455d-s5klc                    1/1       Running   0          60s
iot-http-adapter-67967db454-jk6mm          3/3       Running   0          60s
iot-mqtt-adapter-7784854694-64ctx          3/3       Running   0          60s
iot-tenant-service-59cd9c4989-4874k        1/1       Running   0          60s
none-authservice-7bfcd4655d-kdx5c          1/1       Running   0          5m19s
qdrouterd-mweea7iu3u-0                     1/1       Running   0          3m18s
standard-authservice-576b747759-4g7ls      1/1       Running   0          5m17s
user-api-server-7df5457c95-skhtm           1/1       Running   0          7m30s

After following the UI instructions to delete an operator, the following items are left behind:

[keith@rhfed-localdomain enmasse_downstream]$ oc get pods
NAME                                    READY     STATUS    RESTARTS   AGE
admin.mweea7iu3u-5b66ddd86b-94g87       2/2       Running   0          11m
broker-mweea7iu3u-1zk6-0                1/1       Running   0          11m
console-645888d655-2kv7x                2/2       Running   0          11m
iot-auth-service-59649d6748-f9qdc       1/1       Running   0          9m10s
iot-device-registry-dff9978d-44fv7      1/1       Running   0          9m10s
iot-gc-658ff6455d-s5klc                 1/1       Running   0          9m10s
iot-http-adapter-67967db454-jk6mm       3/3       Running   0          9m10s
iot-mqtt-adapter-7784854694-64ctx       3/3       Running   0          9m10s
iot-tenant-service-59cd9c4989-4874k     1/1       Running   0          9m10s
none-authservice-7bfcd4655d-kdx5c       1/1       Running   0          13m
qdrouterd-mweea7iu3u-0                  1/1       Running   0          11m
standard-authservice-576b747759-4g7ls   1/1       Running   0          13m

The service accounts (SAs) are removed, as are the deployments specified in the CSV, the clusterroles, and the clusterrolebindings. Interestingly, contrary to the documentation, the CRDs are not deleted:

[keith@rhfed-localdomain enmasse_downstream]$ oc get crd -l app=enmasse
NAME                                      CREATED AT
addressplans.admin.enmasse.io             2019-07-13T17:03:28Z
addressspaceplans.admin.enmasse.io        2019-07-13T17:03:28Z
authenticationservices.admin.enmasse.io   2019-07-13T17:03:28Z
brokeredinfraconfigs.admin.enmasse.io     2019-07-13T17:03:28Z
consoleservices.admin.enmasse.io          2019-07-13T17:03:28Z
iotconfigs.iot.enmasse.io                 2019-07-13T17:03:28Z
standardinfraconfigs.admin.enmasse.io     2019-07-13T17:03:28Z

Nor are the apiservices:

oc get apiservices | grep enmass
v1alpha1.admin.enmasse.io   Local   True   29m
v1alpha1.iot.enmasse.io     Local   True   29m
v1beta1.admin.enmasse.io    Local   True   29m
v1beta2.admin.enmasse.io    Local

k-wall commented 5 years ago

To work around this issue, the user would need permission to run these commands in the openshift-operators namespace:

oc delete all --selector=app=enmasse
oc delete crd -l app=enmasse
oc delete apiservices -l app=enmasse
oc get cm  -l app=enmasse
oc get secret  -l app=enmasse

k-wall commented 5 years ago

I think that to solve this issue completely, we would need to introduce a single custom resource (CR) representing EnMasse itself. This CR would be declared in the CSV, so OLM would know to remove it when the operator is deleted. We would then change the system to add metadata.ownerReferences to the resources it creates, so that deleting this CR would cascade to all other EnMasse entities. There would be some design choices to make, so I think this would be a reasonably significant piece of work. @rgodfrey any thoughts?

I notice we'd raised #2378
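
To make the ownerReferences idea more concrete, here is a minimal shell sketch of what attaching an owner to an existing resource could look like. It is an illustration only, not how EnMasse does or would implement it: the `EnMasse` kind, its API version, the instance name `example` and the helper `owner_ref_patch` are all hypothetical. Only the `ownerReferences` field names come from the Kubernetes object metadata.

```shell
#!/bin/sh
# Build a JSON merge patch that sets metadata.ownerReferences on a resource.
# Arguments: $1 apiVersion, $2 kind, $3 name, $4 uid of the owning resource.
# Note: a merge patch replaces the whole ownerReferences list.
owner_ref_patch() {
  printf '{"metadata":{"ownerReferences":[{"apiVersion":"%s","kind":"%s","name":"%s","uid":"%s","controller":true,"blockOwnerDeletion":true}]}}' \
    "$1" "$2" "$3" "$4"
}

# Hypothetical usage against a cluster (kind and names are assumptions):
#   uid=$(oc get enmasse example -o jsonpath='{.metadata.uid}')
#   oc patch deployment console --type=merge \
#     -p "$(owner_ref_patch admin.enmasse.io/v1beta1 EnMasse example "$uid")"

# Print an example patch with a placeholder uid so the shape is visible.
owner_ref_patch admin.enmasse.io/v1beta1 EnMasse example 00000000-0000-0000-0000-000000000000
echo
```

One of the design choices this raises: Kubernetes does not allow a namespaced owner to own cluster-scoped resources or resources in another namespace, so the scope of the owning CR would need to be chosen with the CRDs and apiservices in mind.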

famarting commented 5 years ago

@k-wall what is the intention of these two commands?

oc get cm  -l app=enmasse
oc get secret  -l app=enmasse

Shouldn't they be oc delete?

k-wall commented 5 years ago

@famartinrh they should be oc delete.
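
With that correction applied, the workaround from this thread could be collected into a small script like the sketch below. The commands and the `app=enmasse` label are taken from the comments above; the explicit `-n` flag, the `NAMESPACE` and `DRY_RUN` variables, and the function names are assumptions added for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the corrected workaround: remove resources labelled app=enmasse
# that are left behind after the CSV is deleted.
# NAMESPACE and DRY_RUN are illustrative variables, not part of EnMasse.
NAMESPACE="${NAMESPACE:-openshift-operators}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

cleanup_enmasse() {
  # Namespaced leftovers (pods, deployments, services, ...).
  run oc delete all -n "$NAMESPACE" --selector=app=enmasse
  # Cluster-scoped leftovers; deleting a CRD also deletes its custom resources.
  run oc delete crd -l app=enmasse
  run oc delete apiservices -l app=enmasse
  # The two commands posted as "oc get", corrected to "oc delete".
  run oc delete cm -n "$NAMESPACE" -l app=enmasse
  run oc delete secret -n "$NAMESPACE" -l app=enmasse
}

cleanup_enmasse
```

Running it with `DRY_RUN=0` executes the deletions and requires the permissions mentioned earlier in the thread.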