strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

Watch all namespaces by TopicOperator / Namespaced topics #1206

Open seglo opened 5 years ago

seglo commented 5 years ago

Hi Strimzi team,

This is a proposal to support namespaced Kafka topics as part of a Kubernetes application's lifecycle. I tried to think of a way that was both simple and would not break backwards compatibility with how KafkaTopic resources are created and managed. It's possible that aspects of this proposal could be used for similar application-level concerns for KafkaUser resources as well, but I only discuss topics for the sake of simplicity.

Problem

In a multi-tenant Kubernetes cluster, it is a common practice to define authorization rules at a namespace level. It’s also a common practice for user applications to define all the Kubernetes resources they rely upon during installation. These resources could include Kafka topics. Topics are currently represented with the KafkaTopic CustomResourceDefinition (CRD) with a namespace scope.

The current version of Strimzi only allows KafkaTopic resources to be created in the same namespace that the Topic Operator is deployed to. The Topic Operator is installed into the same namespace in which the Kafka resource is created (when topicOperator configuration is provided). When the Kafka resource is created in a target namespace that is watched by the Cluster Operator, the Cluster Operator deploys the Kafka and ZooKeeper StatefulSets, the Entity Operator deployment (which contains the Topic Operator as an optional container), and other resources into that target namespace.

The KafkaTopic.metadata.name represents the topic name within Kafka. The Topic Operator watches for KafkaTopic resource CRUD operations and transforms each operation into its corresponding operation against the underlying topic in Kafka. KafkaTopics created in other namespaces are ignored by the Topic Operator. If the Topic Operator watched for KafkaTopics in other namespaces, two KafkaTopic resources with the same name could be created in two different namespaces. This is possible because KafkaTopic is a namespace-scoped resource and not a cluster-scoped resource. A conflict could therefore arise, with the Topic Operator watching changes to two separate resources that map to the same Kafka topic (e.g. each KafkaTopic has different topic-level configuration, or one KafkaTopic is deleted).

Proposal

Create a new CustomResourceDefinition to handle topics that are created in namespaces, called KafkaNamespaceTopic. This CRD will allow the user to create a topic that will be automatically namespaced within Kafka by the Topic Operator. The operator will encode the actual topic name using the Kubernetes namespace the resource resides in, and either the metadata.name or spec.topicName (if defined) of the KafkaNamespaceTopic instance.

{Kubernetes namespace name}.{metadata.name|spec.topicName}

NOTE: Kubernetes namespaces and metadata.names cannot use the character .
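A minimal sketch of the naming scheme above, for illustration only. The function names are hypothetical and are not part of Strimzi. The decoder relies only on the namespace part being free of `.` (Kubernetes namespace names are DNS labels), so it splits on the first dot and tolerates dots in the topic part.

```python
from typing import Optional, Tuple


def encode_topic_name(namespace: str, name: str, topic_name: Optional[str] = None) -> str:
    """Build the Kafka topic name as {namespace}.{spec.topicName or metadata.name}."""
    if not namespace or "." in namespace:
        # A dot in the namespace would make the encoded name ambiguous.
        raise ValueError(f"invalid namespace for encoding: {namespace!r}")
    local = topic_name or name
    if not local:
        raise ValueError("either metadata.name or spec.topicName is required")
    return f"{namespace}.{local}"


def decode_topic_name(kafka_topic: str) -> Tuple[str, str]:
    """Split an encoded Kafka topic name back into (namespace, local name)."""
    namespace, sep, local = kafka_topic.partition(".")
    if not sep or not namespace or not local:
        raise ValueError(f"not a namespaced topic name: {kafka_topic!r}")
    return namespace, local
```

For example, `encode_topic_name("myproject", "mytopic")` yields `myproject.mytopic`.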

The KafkaNamespaceTopic will include all configuration that is available in a KafkaTopic.

The Kafka CRD will be extended with an additional property in the TopicOperatorSpec, called watchNamespaceTopics, whose value is a YAML list of all the namespaces that are watched for KafkaNamespaceTopic resources.

TopicOperatorSpec.watchNamespaceTopics: [{Namespace 1},...,{Namespace N}]

When a KafkaNamespaceTopic is created in a watched namespace then the Topic Operator will perform the following actions.

When a KafkaNamespaceTopic is updated in a watched namespace then the Topic Operator will reconcile the changes to the topic in Kafka.

When a KafkaNamespaceTopic is deleted in a watched namespace then the Topic Operator will delete the topic in Kafka.

Unlike KafkaTopics, if changes to underlying Kafka topics are made out of band of the KafkaNamespaceTopic CRD instance then the operator will not synchronize those changes to the KafkaNamespaceTopic.

This proposal preserves the existing behaviour of KafkaTopic for backwards compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.

RBAC Configuration Management

Cluster Operator Install/Update Time

All RBAC configuration required to watch application namespaces will be generated when the Strimzi Kafka Operator is installed or updated. If using Helm as the means of installation then the Helm Chart would use a new configuration value that contains the superset of all application namespaces that can be watched for KafkaNamespaceTopics, for all Kafka clusters, called watchedTopicNamespaces. This is required so that we do not need to provide the Cluster Operator with privileges to manage RBAC configuration when CRUD operations are performed on Kafka resources that affect the TopicOperatorSpec.watchNamespaceTopics property.

When the Strimzi Helm Chart is installed then all the necessary RBAC configuration (Roles and RoleBindings) will be created for the superset of namespaces to watch for KafkaNamespaceTopics.

It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.

Ex. Configure Helm Chart values.yaml to create RBAC resources to allow the Topic Operator to watch several different namespaces.

watchedTopicNamespaces: [application_A,application_B]

Kafka Resource Install/Update Time

When installing or updating the Kafka resource, the TopicOperatorSpec.watchNamespaceTopics property could be used to further restrict which namespaces this Cluster’s Topic Operator will watch for KafkaNamespaceTopics.

Ex. Configure Kafka resource’s Topic Operator to only watch Namespace for App A.

TopicOperatorSpec.watchNamespaceTopics: [application_A]

If the TopicOperatorSpec.watchNamespaceTopics property is not defined then the superset of all application namespaces will be watched.
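The interaction between the install-time superset and the per-cluster list could be sketched roughly as follows. This is an illustration under assumptions, not an implementation: the function name is hypothetical, and rejecting a per-cluster namespace that is missing from the superset is an assumed behaviour (the proposal only says the per-cluster list further restricts the superset).

```python
from typing import Iterable, Optional, Set


def effective_watched_namespaces(
    watched_topic_namespaces: Iterable[str],
    watch_namespace_topics: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Return the namespaces one cluster's Topic Operator would watch."""
    superset = set(watched_topic_namespaces)
    if watch_namespace_topics is None:
        # Property not defined on the Kafka resource: watch the whole superset.
        return superset
    requested = set(watch_namespace_topics)
    unknown = requested - superset
    if unknown:
        # Assumption: no RBAC exists for these namespaces, so fail loudly.
        raise ValueError(f"no RBAC configured for namespaces: {sorted(unknown)}")
    return requested
```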

seglo commented 5 years ago

Linking to related issue #879. Highlighting possibly interested parties: @scholzj, @Tombar.

Tombar commented 5 years ago

I like the overall idea and the simplicity of the proposal about RBAC.

Regarding users resources, I believe the same approach could be very useful.

tombentley commented 5 years ago

@seglo thanks, this is an interesting proposal! I have a few questions/comments....

The current version of Strimzi only allows KafkaTopic resources to be created in the same namespace that the Topic Operator is deployed to.

Nitpick: That's not quite true, it can watch a different namespace via Kafka.spec.entityOperator.topicOperator.watchedNamespace. But the point is taken that it's still only a single namespace.

Your Problem section describes the current limitations of the TO, but it would be interesting to understand more about why that's problematic for your intended use case. I think the answer to that is simply: "I want a multi-tenant Kafka cluster, and the TO only being able to watch a single namespace prevents that because I want each tenant to be confined to its own namespace", but it would be good if you could confirm that, and any extra details.

Update the KafkaNamespaceTopic with the actual topic name (Ex. myproject.mytopic). A label could be added to the resource with the fully encoded name of the topic, strimzi.io/namespace-topic-name.

Using a status property seems like a better fit, imho.

This proposal preserves the existing behaviour of KafkaTopic for backwards compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.

So you have both a KafkaNamespacedTopic, which is unidirectionally synced, and also a KafkaTopic, which is bidirectionally synced. But for topics which are created in Kafka programmatically, such as internal Kafka Streams topics, there will be no KafkaNamespacedTopic. Likewise, as I understand the proposal, although programmatic changes to topic configuration would be reflected to the KafkaTopic, they wouldn't be reflected in the KafkaNamespacedTopic. I can see people finding that situation confusing, though I'm not sure that much can be done to avoid it.

It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.

That's not really trivial. You also only really described the situation for a Helm-installed setup. Presumably the problem is similarly left to the admin if the operator is being installed manually, or via some other means.

One final question: Did you reject any other ideas which might have solved your problem?

seglo commented 5 years ago

Thanks for the review @tombentley !

Your Problem section describes the current limitations of the TO, but it would be interesting to understand more about why that's problematic for your intended use case. I think the answer to that is simply: "I want a multi-tenant Kafka cluster, and the TO only being able to watch a single namespace prevents that because I want each tenant to be confined to its own namespace", but it would be good if you could confirm that, and any extra details.

That's correct. I would like to have better topic support in a multi-tenant Kubernetes cluster where users (application developers) are sandboxed to their own namespace. I would like to provide these users with the ability to administer namespaced Kafka topics relevant to their application requirements so that a) they don't require privileges to modify shared infrastructure, like Strimzi/Kafka, and b) their topics don't conflict with other namespaces' topics, whether they're separate applications or deployments (dev, test, etc.).

Update the KafkaNamespaceTopic with the actual topic name (Ex. myproject.mytopic). A label could be added to the resource with the fully encoded name of the topic, strimzi.io/namespace-topic-name.

Using a status property seems like a better fit, imho.

Agreed.

This proposal preserves the existing behaviour of KafkaTopic for backward compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.

So you have both a KafkaNamespacedTopic, which is unidirectionally synced, and also a KafkaTopic, which is bidirectionally synced. But for topics which are created in Kafka programmatically, such as internal Kafka Streams topics, there will be no KafkaNamespacedTopic. Likewise, as I understand the proposal, although programmatic changes to topic configuration would be reflected to the KafkaTopic, they wouldn't be reflected in the KafkaNamespacedTopic. I can see people finding that situation confusing, though I'm not sure that much can be done to avoid it.

You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic.

How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?

It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.

That's not really trivial. You also only really described the situation for a Helm-installed setup. Presumably the problem is similarly left to the admin if the operator is being installed manually, or via some other means.

I focused on Helm because that's my primary deployment model. I did consider non-Helm deploys, but I think I worded it poorly. For non-Helm deploys, a set of templated Role and RoleBinding resources could be provided, and instructions in the docs could tell the user how to apply them when they add new application namespaces. I'm not sure if there is an easier way here; Helm really makes this process easier. I suppose you would have the same problem regarding RBAC config for Cluster Operator watched namespaces when you want to add new namespaces.

Another option is to create cluster scoped RBAC config for the Topic Operator, but I think that would require giving the Cluster Operator admin privileges in order to create that config when Topic Operators are deployed. I assume that's a non-starter.

One final question: Did you reject any other ideas which might have solved your problem?

Yes, I considered several different approaches.

  1. Change the KafkaTopic CRD to cluster scope. This would enforce global uniqueness of KafkaTopic resource names. However, if I understand correctly, it would still require per-namespace RBAC configuration, or higher-level ClusterRole and ClusterRoleBinding config, which would require the Cluster Operator to have admin privileges to create (I feel like there may be a clever solution to this I haven't thought of). It would also make it impossible for multiple Strimzi Kafka clusters to share the same topic names when they're only named by their metadata.name. If another field was used, like KafkaTopic.spec.topicName, you could work around this problem, but uniqueness would have to be enforced some other way. I'm not sure how the Topic Operator currently enforces this.

  2. Add an admission webhook to the Topic Operator which would reject KafkaTopics that represent a topic that already exists in Kafka. I'm not super familiar with webhooks, but I'm not sure how you would handle multiple Kafka clusters in this scenario. Also, the same RBAC concerns exist and would need to be handled in a similar way to my original proposal, unless you have another suggestion.

tombentley commented 5 years ago

You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic.

How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?

That might work. I guess the topics which are not synced to KafkaNamespacedTopic would still be synced to KafkaTopic, so there would still be a route to configuring them via Kubernetes. But that would necessarily mean that each individual team couldn't use kube to configure those topics, since all those KafkaTopics would be in some common namespace.

Just hypothetically we could envisage a bidirectional sync of the KafkaNamespacedTopic when the Kafka topic has a name whose prefix is one of the watchedTopicNamespaces, and using some other fallback namespace if there was no matching namespace in watchedTopicNamespaces. That would be unambiguous. It would need to cope when the TO is reconfigured to watch a new namespace (to avoid leaving a dangling KafkaNamespacedTopic in the other namespace). It wouldn't cope so well if the TO were reconfigured to remove a namespace, because while the new instance would create KafkaNamespacedTopic in the fallback namespace, it would not remove the KafkaNamespacedTopic in the old namespace. I think this would work. If we had full bidi sync of KafkaNamespacedTopic we wouldn't need the shadow KafkaTopics.
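To make the prefix-matching idea concrete, here is a rough sketch of how a topic that already exists in Kafka might be mapped back to a namespace. The function name and the `kafka-admin` fallback namespace used in the usage note are hypothetical. The sketch also assumes that "prefix" means the text before the first `.`, following the `{namespace}.{name}` scheme from the proposal.

```python
from typing import Iterable, Tuple


def resolve_namespace(kafka_topic: str, watched: Iterable[str], fallback: str) -> Tuple[str, str]:
    """Pick the namespace and resource-local name for a topic found in Kafka."""
    watched_set = set(watched)
    prefix, sep, local = kafka_topic.partition(".")
    if sep and local and prefix in watched_set:
        # The topic name starts with a watched namespace: sync it there.
        return prefix, local
    # No matching watched namespace: keep the full name in the fallback namespace.
    return fallback, kafka_topic
```

With `watched = {"application-a"}` and a fallback of `kafka-admin`, the topic `application-a.orders` resolves to `("application-a", "orders")`, while `__consumer_offsets` resolves to `("kafka-admin", "__consumer_offsets")`.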

seglo commented 5 years ago

You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic. How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?

That might work. I guess the topics which are not synced to KafkaNamespacedTopic would still be synced to KafkaTopic, so there would still be a route to configuring them via Kubernetes. But that would necessarily mean that each individual team couldn't use kube to configure those topics, since all those KafkaTopics would be in some common namespace.

Just hypothetically we could envisage a bidirectional sync of the KafkaNamespacedTopic when the Kafka topic has a name whose prefix is one of the watchedTopicNamespaces, and using some other fallback namespace if there was no matching namespace in watchedTopicNamespaces.

I'm not sure I like the idea of a fallback namespace. If the goal is just to give users a way of configuring already namespaced Kafka topics through K8s, then I think giving access to KafkaTopic is good enough. If we want to give users fine-grained control over changing namespaced Kafka topics (using KafkaNamespacedTopic), then we could document a requirement that a namespace already exists and is in watchedTopicNamespaces.

That would be unambiguous. It would need to cope when the TO is reconfigured to watch a new namespace (to avoid leaving a dangling KafkaNamespacedTopic in the other namespace).

In the situation where namespaced topics exist in Kafka before the K8s namespace exists and/or is added to watchedTopicNamespaces, then the Topic Operator could create missing KafkaNamespacedTopics.

It wouldn't cope so well if the TO were reconfigured to remove a namespace because while the new instance would create KafkaNamespacedTopic in the fallback namespace it would not remove the KafkaNamespacedTopic in the old namespace.

Right, removing a namespace from watchedTopicNamespaces would leave orphaned KafkaNamespacedTopics. I'm not sure of a clean way to handle this. Since watchedTopicNamespaces is updated by the K8s admin, maybe it's good enough to document this use case. I would expect that the usual result of removing a watched namespace is to delete the namespace anyway, so maybe it's good enough to just document it as a caveat. We could explore Helm hooks that could handle delete use cases, or templates for OpenShift.

I think this would work. If we had full bidi sync of KafkaNamespacedTopic we wouldn't need the shadow KafkaTopics.
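As an illustration of the orphaning caveat discussed in this comment, an admin-side check could list the KafkaNamespacedTopic resources that sit in namespaces no longer being watched. This is only a sketch: the function name is hypothetical, and the resources are represented as plain (namespace, name) pairs rather than being read from the Kubernetes API.

```python
from typing import Iterable, List, Tuple


def orphaned_resources(
    existing: Iterable[Tuple[str, str]], watched: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (namespace, name) pairs whose namespace is no longer watched."""
    watched_set = set(watched)
    # Anything outside the watched set would no longer be reconciled.
    return sorted((ns, name) for ns, name in existing if ns not in watched_set)
```

The result could be reviewed or cleaned up by the admin before or after a namespace is removed from watchedTopicNamespaces.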

KamalAman commented 4 years ago

Wonderful proposal! This is precisely the workflow I'm looking for in a multi-tenant architecture. Now, 16 months later, I really hope this gets implemented at some point.

jacobcrawford commented 3 years ago

Longshot, but is this still a feature that is needed in the community, or have other solutions solved the problem?

scholzj commented 2 years ago

Triaged on 26.5.2022: This would be a great feature. A proposal will be needed to clarify all the caveats and possible error situations.

luisdavim commented 2 years ago

Hi, we're not fully using Strimzi yet, but we were already following the pattern of using <namespace_name>.<topic_name> for our topics. The reason we did this, even before using Strimzi, is that we had a single MSK cluster for all non-prod environments and started using that naming convention to avoid collisions. What's being proposed in this issue would fit perfectly with our use case.

Another thing is that we want to start grouping our apps into namespaces based on business domains, and this would allow having the topics for a domain in the same namespace as the apps that produce them.

smoke commented 1 year ago

I believe an approach similar to https://github.com/strimzi/strimzi-kafka-operator/issues/5895#issuecomment-1422080151 may be applied as a workaround for this case.

chary1112004 commented 1 year ago

We are also interested in this feature, as topics should be created inside the tenant's manifest.

siegenthalerroger commented 1 year ago

Having read the proposal, I think an overlooked possibility is that a topic may well be "owned" by a certain namespace, but the owner may want to explicitly allow users of other namespaces to read and/or write. That would add some level of complexity, but it is something that I do think is applicable in general.

Example case: