kudobuilder / kudo

Kubernetes Universal Declarative Operator (KUDO)
https://kudo.dev
Apache License 2.0

Toggle Task cannot be used for uninstalled custom resources #1547

Closed ANeumann82 closed 4 years ago

ANeumann82 commented 4 years ago

What happened: Tried to change a resource deployment to use a Toggle Task. The resource is a custom resource for a CRD that may not exist on the cluster. KUDO failed with:

      06 +        - message: 'A transient error when executing task deploy.nodes.pre-node.monitor-deploy.
16:17:06 +            Will retry. failed to determine if object &{map[apiVersion:monitoring.coreos.com/v1
16:17:06 +            kind:ServiceMonitor metadata:map[annotations:map[kudo.dev/last-plan-execution-uid:6521e1bf-e0c4-45ab-a295-356198b3557b
16:17:06 +            kudo.dev/phase:nodes kudo.dev/plan:deploy kudo.dev/step:pre-node] labels:map[app:prometheus-operator
16:17:06 +            heritage:kudo kudo.dev/instance:cassandra kudo.dev/operator:cassandra
16:17:06 +            release:prometheus-kubeaddons] name:cassandra-monitor namespace:cassandra-install-test]
16:17:06 +            spec:map[endpoints:[map[interval:30s port:prometheus-exporter-port]] namespaceSelector:map[matchNames:[cassandra-install-test]]
16:17:06 +            selector:map[matchLabels:map[kudo.dev/instance:cassandra kudo.dev/servicemonitor:true]]]]}
16:17:06 +            is namespaced: a resource with GVK monitoring.coreos.com/v1, Kind=ServiceMonitor
16:17:06 +            seems to be missing in API resource list'

What you expected to happen: The toggle task should be able to "delete" or not deploy a custom resource for which the CRD is not known to the cluster

How to reproduce it (as minimally and precisely as possible): Use a toggle task with a custom resource which is not known to the cluster.

Anything else we need to know?: task_delete.go uses the enhancer, which tries to determine whether the resource to deploy is namespaced. That lookup fails for an unknown custom resource.
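
For illustration, the failure mode can be sketched roughly as follows. This is a toy model and not KUDO's actual enhancer code: `GVK`, `ErrKindMissing`, and `isNamespaced` are hypothetical names, and the map is only a stand-in for the API resource list that the cluster reports through discovery.

```go
package main

import (
	"errors"
	"fmt"
)

// GVK is a simplified, hypothetical stand-in for a Kubernetes group/version/kind.
type GVK struct {
	GroupVersion string
	Kind         string
}

// ErrKindMissing is a hypothetical sentinel for a kind the cluster does not serve.
var ErrKindMissing = errors.New("seems to be missing in API resource list")

// isNamespaced looks gvk up in a discovery-like map (kind -> namespaced?).
// A kind whose CRD is not installed has no entry, so the lookup must fail:
// there is no way to tell whether such a resource would be namespaced.
func isNamespaced(apiResources map[GVK]bool, gvk GVK) (bool, error) {
	namespaced, ok := apiResources[gvk]
	if !ok {
		return false, fmt.Errorf("a resource with GVK %s, Kind=%s %w",
			gvk.GroupVersion, gvk.Kind, ErrKindMissing)
	}
	return namespaced, nil
}

func main() {
	// Assumed cluster state: core kinds only, no prometheus-operator CRDs.
	apiResources := map[GVK]bool{
		{GroupVersion: "v1", Kind: "Service"}:   true,
		{GroupVersion: "v1", Kind: "Namespace"}: false,
	}

	sm := GVK{GroupVersion: "monitoring.coreos.com/v1", Kind: "ServiceMonitor"}
	if _, err := isNamespaced(apiResources, sm); err != nil {
		fmt.Println("failed to determine if object is namespaced:", err)
	}
}
```

In this sketch the error is raised before any apply or delete is attempted, which matches the observation that the task fails even when it only wants to remove the resource.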

zmalik commented 4 years ago

> The toggle task should be able to "delete" or not deploy a custom resource for which the CRD is not known to the cluster

That is a failed requirement, and I think it should fail. If an operator installation requires a CRD that is missing, that should be fixed regardless of whether it is a Toggle task or not.

ANeumann82 commented 4 years ago

Well, if the parameter for the Toggle Task is "false", then the CRD is not required to install the operator, correct? In that case the toggle task should simply do nothing, rather than fail the execution because it doesn't know about the CR that it wants to delete.
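
A minimal sketch of the behaviour I have in mind, under stated assumptions: `cluster`, `errKindMissing`, and `runToggle` are hypothetical names for illustration only and do not correspond to KUDO's real task implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errKindMissing is a hypothetical sentinel: the cluster does not serve this kind.
var errKindMissing = errors.New("kind missing in API resource list")

// cluster is a toy stand-in for the API server: the set of kinds it knows.
type cluster struct {
	knownKinds map[string]bool
}

// apply fails when the kind is unknown, because the CRD really is required.
func (c cluster) apply(kind string) error {
	if !c.knownKinds[kind] {
		return fmt.Errorf("apply %s: %w", kind, errKindMissing)
	}
	return nil
}

// remove fails the same way, mirroring the behaviour reported in this issue.
func (c cluster) remove(kind string) error {
	if !c.knownKinds[kind] {
		return fmt.Errorf("delete %s: %w", kind, errKindMissing)
	}
	return nil
}

// runToggle applies the resource when enabled is true and deletes it otherwise.
// When the toggle is off and the kind is unknown, no object of that kind can
// exist on the cluster, so the delete is treated as a successful no-op.
func runToggle(c cluster, kind string, enabled bool) error {
	if enabled {
		return c.apply(kind)
	}
	if err := c.remove(kind); err != nil && !errors.Is(err, errKindMissing) {
		return err
	}
	return nil
}

func main() {
	c := cluster{knownKinds: map[string]bool{"Service": true}}

	// Toggle off, CRD absent: nothing to delete, so no error is expected.
	fmt.Println("toggle=false:", runToggle(c, "ServiceMonitor", false))

	// Toggle on, CRD absent: the CRD is genuinely required, so this still fails.
	fmt.Println("toggle=true: ", runToggle(c, "ServiceMonitor", true))
}
```

The point of the sketch is that only the "off" branch swallows the missing-kind error; enabling the toggle without the CRD installed would still fail.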

zmalik commented 4 years ago

Got it! That would be a slightly different delete task. Would the same expectation apply to the Delete task, or just to the Toggle task?

ANeumann82 commented 4 years ago

Good question. My gut feeling says that it should only be like that for the Toggle task, as it depends on a parameter and is often used to enable/disable deployment of resources - I'm not fully sure what the DeleteTask would be used for, but I'd expect it to fail if one tried to delete an unknown custom resource.

But I wouldn't mind it the other way either.