crossplane-contrib / provider-kafka

Crossplane provider for Kafka
Apache License 2.0
27 stars 30 forks source link

ACLs cannot be deleted #39

Closed rtoma closed 1 year ago

rtoma commented 2 years ago

What happened?

Last weeks I've been working on engineering a Kafka GitOps feature for our MSK clusters. For this I use Crossplane and this provider. To make the provider work with MSK I've contributed TLS + SCRAM authentication support.

Now managing Topics works great. Creation of ACLs works too, but deletion is not possible. To make this work I changed:

https://github.com/crossplane-contrib/provider-kafka/blob/d085e96353054c2807f37c86435130b8a5569f5c/internal/clients/kafka/acl/acl.go#L59-L68

... into:

        resp, err := cl.DescribeACLs(ctx, ab)
        if err != nil {
                return nil, errors.Wrap(err, "describe ACLs response is empty")
        }
        if exists := resp[0].Described; len(exists) == 0 {
                return nil, nil // no matching ACLs found
        }

The original code throws an error if no ACLs exist for specific criteria. My code allows this.

Now my code works flawlessly (for MSK), but since it is a significant change to the logic and it implies 'delete ACL' never worked, I wonder if I am missing something. So, I'd like a discussion before I submit a patch.

Cheers.

How can we reproduce it?

In short:

In detail:

I create the ACL. Here is the resource and its good health:

$ kubectl get accesscontrollist.acl.kafka.crossplane.io
NAME                                               READY   SYNCED   EXTERNAL-NAME                                                                                                                                                                                                                                      AGE
acl-managed-by-crossplane-kafka-provider-acltest   True    True     {"ResourceName":"topic-managed-by-crossplane-kafka-provider-acltest","ResourceType":"Topic","ResourcePrincipal":"User:Foo","ResourceHost":"*","ResourceOperation":"Read","ResourcePermissionType":"Allow","ResourcePatternTypeFilter":"Literal"}   92s

With kcl I check the Kafka side of things:

$ kcl admin acl describe --type any --pattern match --op any --perm any --name topic-managed-by-crossplane-kafka-provider-acltest
TYPE   NAME                                                PATTERN  PRINCIPAL                                                           HOST  OPERATION  PERMISSION  ERROR  ERROR MESSAGE
TOPIC  topic-managed-by-crossplane-kafka-provider-acltest  LITERAL  User:Foo                                                            *     READ       ALLOW

So, indeed: the ACL exists.

Now let's delete the ACL:

$ kubectl delete accesscontrollist.acl.kafka.crossplane.io/acl-managed-by-crossplane-kafka-provider-acltest
accesscontrollist.acl.kafka.crossplane.io "acl-managed-by-crossplane-kafka-provider-acltest" deleted
<hangs>

The delete command hangs. Meanwhile in Kafka the ACL has been removed.

Checking the to-be-deleted ACL resource from another terminal shows it's now unREADY and unSYNCED. Both as expected:

$ k get accesscontrollist.acl.kafka.crossplane.io
NAME                                               READY   SYNCED   EXTERNAL-NAME                                                                                                                                                                                                                                      AGE
acl-managed-by-crossplane-kafka-provider-acltest   False   False    {"ResourceName":"topic-managed-by-crossplane-kafka-provider-acltest","ResourceType":"Topic","ResourcePrincipal":"User:Foo","ResourceHost":"*","ResourceOperation":"Read","ResourcePermissionType":"Allow","ResourcePatternTypeFilter":"Literal"}   16m

Checking the kafka provider (running in debug mode) logs I see this:

2022-07-13T11:43:26.954+0200    DEBUG   provider-kafka  Cannot observe external resource    {"controller": "managed/accesscontrollist.acl.kafka.crossplane.io", "request": "/acl-managed-by-crossplane-kafka-provider-acltest", "uid": "90943fdb-d5d6-44ab-83b6-77f9dc66a15a", "version": "179916", "external-name": "{\"ResourceName\":\"topic-managed-by-crossplane-kafka-provider-acltest\",\"ResourceType\":\"Topic\",\"ResourcePrincipal\":\"User:Foo\",\"ResourceHost\":\"*\",\"ResourceOperation\":\"Read\",\"ResourcePermissionType\":\"Allow\",\"ResourcePatternTypeFilter\":\"Literal\"}", "error": "cannot List ACLs: no create response for acl", "errorVerbose": "no create response for acl\ngithub.com/crossplane-contrib/provider-kafka/internal/clients/kafka/acl.List\n\t/Users/rtoma/redacted/projects/provider-kafka-fork/internal/clients/kafka/acl/acl.go:66\ngithub.com/crossplane-contrib/provider-kafka/internal/controller/acl.(*external).Observe\n\t/Users/rtoma/redacted/projects/provider-kafka-fork/internal/controller/acl/acl.go:158\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/Users/rtoma/go/pkg/mod/github.com/crossplane/crossplane-runtime@v0.15.0/pkg/reconciler/managed/reconciler.go:620\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/opt/homebrew/Cellar/go/1.18.3/libexec/src/runtime/asm_arm64.s:1263\ncannot List ACLs\ngithub.com/crossplane-contrib/provider-kafka/internal/controller/acl.(*external).Observe\n\t/Users/rtoma/redacted/projects/provider-kafka-fork/internal/controller/acl/acl.go:161\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/Users/rtoma/go/pkg/mod/github.com/crossplane/crossplane-runtime@v0.15.0/pkg/reconciler/managed/reconciler.go:620\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/Users/rtoma/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/opt/homebrew/Cellar/go/1.18.3/libexec/src/runtime/asm_arm64.s:1263"}
2022-07-13T11:43:26.955+0200    DEBUG   controller-runtime.manager.events   Warning {"object": {"kind":"AccessControlList","name":"acl-managed-by-crossplane-kafka-provider-acltest","uid":"90943fdb-d5d6-44ab-83b6-77f9dc66a15a","apiVersion":"acl.kafka.crossplane.io/v1alpha1","resourceVersion":"179916"}, "reason": "CannotObserveExternalResource", "message": "cannot List ACLs: no create response for acl"}

From above debug blob I'd like to highlight:

"errorVerbose": "no create response for acl
  github.com/crossplane-contrib/provider-kafka/internal/clients/kafka/acl.List
  /Users/rtoma/redacted/projects/provider-kafka-fork/internal/clients/kafka/acl/acl.go:66

This is why I believe 'delete ACL' is flawed. The acl.List method throws an error when no ACLs exist. Now to me finding no matching ACLs seems like the expected result of a delete ACL action. But maybe I'm missing something?

What environment did it happen in?

Crossplane version: 1.8.1 Kafka provider: 0.1.0 with TLS/SCRAM support Kubernetes: 1.22.8 (OpenShift on AWS)

marshmallory commented 2 years ago

Thanks for putting this together. Just from reading your documentation, I think you're right, that it's flawed and is a bug, and does need to be fixed.

We did test it and the delete was working, so now I'm wondering if maybe the code didn't get merged. I'll take a look either way. I'll implement your change and test it locally.

Just as a FYI: At this time, we're not using this in any actual environments, so I am absolutely sure there will be edge cases we missed. We're spending most of our time at the moment operationalizing Crossplane for our organization. I'm hoping to get this into production in our space later this year. :)

marshmallory commented 1 year ago

Fixed by #58