Open seglo opened 5 years ago
Linking to related issue #879. Highlighting possibly interested parties: @scholzj, @Tombar.
I like the overall idea and the simplicity of the proposal about RBAC.
Regarding users resources, I believe the same approach could be very useful.
@seglo thanks, this is an interesting proposal! I have a few questions/comments....
The current version of Strimzi only allows KafkaTopic resources to be created in the same namespace that the Topic Operator is deployed to.
Nitpick: That's not quite true, it can watch a different namespace via Kafka.spec.entityOperator.topicOperator.watchedNamespace. But the point is taken that it's still only a single namespace.
Your Problem section describes the current limitations of the TO, but it would be interesting to understand more about why that's problematic for your intended use case. I think the answer to that is simply: "I want a multi-tenant Kafka cluster, and the TO only being able to watch a single namespace prevents that because I want each tenant to be confined to its own namespace", but it would be good if you could confirm that, and any extra details.
Update the KafkaNamespaceTopic with the actual topic name (Ex. myproject.mytopic). A label could be added to the resource with the fully encoded name of the topic, strimzi.io/namespace-topic-name.
Using a status property seems like a better fit, imho.
This proposal preserves the existing behaviour of KafkaTopic for backwards compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.
So you have both a KafkaNamespacedTopic, which is unidirectionally synced, and also a KafkaTopic, which is bidirectionally synced. But for topics which are created in Kafka programmatically, such as internal Kafka Streams topics, there will be no KafkaNamespacedTopic. Likewise, as I understand the proposal, although programmatic changes to topic configuration would be reflected to the KafkaTopic, they wouldn't be reflected in the KafkaNamespacedTopic. I can see people finding that situation confusing, though I'm not sure that much can be done to avoid it.
It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.
That's not really trivial. You also only really described the situation for a Helm-installed setup. Presumably the problem is similarly left to the admin if the operator is being installed manually, or via some other means.
One final question: Did you reject any other ideas which might have solved your problem?
Thanks for the review @tombentley !
Your Problem section describes the current limitations of the TO, but it would be interesting to understand more about why that's problematic for your intended use case. I think the answer to that is simply: "I want a multi-tenant Kafka cluster, and the TO only being able to watch a single namespace prevents that because I want each tenant to be confined to its own namespace", but it would be good if you could confirm that, and any extra details.
That's correct. I would like to have better topic support in a multi-tenant Kubernetes cluster where users (application developers) are sandboxed to their own namespace. I would like to provide these users with the ability to administer namespaced Kafka topics relevant to their application requirements so that a) they don't require privileges to modify shared infrastructure, like Strimzi/Kafka, and b) their topics don't conflict with other namespaces' topics, whether they're separate applications or deployments (dev, test, etc.).
Update the KafkaNamespaceTopic with the actual topic name (Ex. myproject.mytopic). A label could be added to the resource with the fully encoded name of the topic, strimzi.io/namespace-topic-name.
Using a status property seems like a better fit, imho.
Agreed.
This proposal preserves the existing behaviour of KafkaTopic for backward compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.
So you have both a KafkaNamespacedTopic, which is unidirectionally synced, and also a KafkaTopic, which is bidirectionally synced. But for topics which are created in Kafka programmatically, such as internal Kafka Streams topics, there will be no KafkaNamespacedTopic. Likewise, as I understand the proposal, although programmatic changes to topic configuration would be reflected to the KafkaTopic, they wouldn't be reflected in the KafkaNamespacedTopic. I can see people finding that situation confusing, though I'm not sure that much can be done to avoid it.
You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic.
How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?
It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.
That's not really trivial. You also only really described the situation for a Helm-installed setup. Presumably the problem is similarly left to the admin if the operator is being installed manually, or via some other means.
I focused on Helm because that's my primary deployment model. I did consider non-Helm deploys, but I think I worded it poorly. For non-Helm deploys a set of templated Role and RoleBinding resources could be provided, and instructions in the docs could tell the user how to apply them when they add new application namespaces. I'm not sure if there is an easier way here; Helm really makes this process easier. I suppose you would have the same problem regarding RBAC config for Cluster Operator watched namespaces when you want to add new namespaces.
Another option is to create cluster scoped RBAC config for the Topic Operator, but I think that would require giving the Cluster Operator admin privileges in order to create that config when Topic Operators are deployed. I assume that's a non-starter.
One final question: Did you reject any other ideas which might have solved your problem?
Yes, I considered several different approaches.
Change the KafkaTopic CRD to cluster scope. This would enforce global uniqueness of KafkaTopic resource names. However, if I understand correctly it would still require per namespace RBAC configuration, or higher level ClusterRole and ClusterRoleBinding config, which would require the cluster operator to have admin privileges to create (I feel like there may be a clever solution to this I haven't thought of). It would also make it impossible for multiple Strimzi Kafka clusters to share the same topic names when they're only named by their metadata.name. If another field was used, like KafkaTopic.spec.topicName, you could work around this problem, but uniqueness would have to be enforced some other way. I'm not sure how the Topic Operator currently enforces this.
Add an admission webhook to the Topic Operator which would reject KafkaTopics that represent a topic that already exists in Kafka. I'm not super familiar with webhooks, but I'm not sure how you would handle multiple Kafka clusters in this scenario. Also, the same RBAC concerns exist and would need to be handled in a similar way to my original proposal, unless you have another suggestion.
You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic.
How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?
That might work. I guess the topics which are not synced to KafkaNamespacedTopic would still be synced to KafkaTopic, so there would still be a route to configuring them via Kubernetes. But that would necessarily mean that each individual team couldn't use kube to configure those topics, since all those KafkaTopic resources would be in some common namespace.
Just hypothetically we could envisage a bidirectional sync of the KafkaNamespacedTopic when the Kafka topic has a name whose prefix is one of the watchedTopicNamespaces, and using some other fallback namespace if there was no matching namespace in watchedTopicNamespaces. That would be unambiguous. It would need to cope when the TO is reconfigured to watch a new namespace (to avoid leaving a dangling KafkaNamespacedTopic in the other namespace). It wouldn't cope so well if the TO were reconfigured to remove a namespace, because while the new instance would create a KafkaNamespacedTopic in the fallback namespace, it would not remove the KafkaNamespacedTopic in the old namespace. I think this would work. If we had full bidi sync of KafkaNamespacedTopic we wouldn't need the shadow KafkaTopics.
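A minimal sketch of the prefix lookup described here, assuming topic names follow the namespace.topic encoding from the proposal. The function name resolve_namespace and the fallback parameter are hypothetical and not part of the Topic Operator:

```python
from typing import Iterable


def resolve_namespace(topic: str, watched: Iterable[str], fallback: str) -> str:
    """Pick the namespace that would hold the KafkaNamespacedTopic for a topic.

    Hypothetical helper: 'watched' stands in for watchedTopicNamespaces and
    'fallback' for the fallback namespace floated in the comment above.
    """
    watched_set = set(watched)
    # Kubernetes namespace names are DNS labels and cannot contain '.', so the
    # text before the first '.' is the only possible namespace prefix.
    prefix, sep, rest = topic.partition(".")
    if sep and rest and prefix in watched_set:
        return prefix
    # No separator, empty topic part, or unwatched prefix: use the fallback.
    return fallback
```

Because the prefix is taken up to the first dot only, a topic can match at most one watched namespace, which is what makes the mapping unambiguous.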
You're right, it could be confusing. After further thought I suppose we could keep KafkaNamespacedTopic bidirectionally synced as well. My concern was that internal topics, or other user topics that have the same namespaced name by coincidence wouldn't have CRUD operations affect KafkaNamespacedTopics, but I guess you have to deal with similar issues with KafkaTopic.
How about keeping the KafkaNamespacedTopic synced only if it already exists in the namespace. If it doesn't exist then it wouldn't be created or updated. There are some other edge cases to consider, but perhaps that's good enough?
That might work. I guess the topics which are not synced to KafkaNamespacedTopic would still be synced to KafkaTopic, so there would still be a route to configuring them via kubernetes. But that would necessarily mean that each individual team couldn't use kube to configure those topics, since all those KafkaTopic would be in some common namespace.
Just hypothetically we could envisage a bidirectional sync of the KafkaNamespacedTopic when the Kafka topic has a name whose prefix is one of the watchedTopicNamespaces, and using some other fallback namespace if there was no matching namespace in watchedTopicNamespaces.
I'm not sure I like the idea of a fallback namespace. If the goal is just to give users a way of configuring already namespaced Kafka topics through K8s then I think giving access to KafkaTopic is good enough. If we want to give users fine-grained control over changing namespaced Kafka topics (using KafkaNamespacedTopic) then we could document a requirement that a namespace already exists and is in watchedTopicNamespaces.
That would be unambiguous. It would need to cope when the TO is reconfigured to watch a new namespace (to avoid leaving a dangling KafkaNamespacedTopic in the other namespace).
In the situation where namespaced topics exist in Kafka before the K8s namespace exists and/or is added to watchedTopicNamespaces, then the Topic Operator could create missing KafkaNamespacedTopics.
It wouldn't cope so well if the TO were reconfigured to remove a namespace because while the new instance would create KafkaNamespacedTopic in the fallback namespace it would not remove the KafkaNamespacedTopic in the old namespace.
Right, removing a namespace from watchedTopicNamespaces would leave orphaned KafkaNamespacedTopics. I'm not sure of a clean way to handle this. Since watchedTopicNamespaces is updated by the K8s admin, maybe it's good enough to document this use case. I would expect that the usual result of removing a watched namespace is to delete the namespace anyway, so maybe it's good enough to just document it as a caveat. We could explore Helm hooks that could handle delete use cases, or templates for OpenShift.
I think this would work. If we had full bidi sync of KafkaNamespacedTopic we wouldn't need the shadow KafkaTopics.
Wonderful proposal! This is precisely the workflow I'm looking for in a multi-tenant architecture. Now, 16 months later, I really hope this gets implemented at some point.
Longshot, but is this still a feature that is needed in the community, or have other solutions solved the problem?
Triaged on 26.5.2022: This would be a great feature. A proposal will be needed to clarify all the caveats and possible error situations.
Hi, we're not fully using Strimzi yet, but we were already following the pattern of using <namespace_name>.<topic_name> for our topics. The reason we did this, even before using Strimzi, is that we had a single MSK cluster for all non-prod environments and started using that naming convention to avoid collisions. What's being proposed in this issue would fit perfectly with our use case.
Another thing is that we want to start grouping our apps into namespaces based on business domains, and this would allow having the topics for a domain in the same namespace as the apps that produce them.
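The <namespace_name>.<topic_name> convention matches the encoding the proposal describes (namespace plus metadata.name, or spec.topicName if defined). A rough sketch follows; the helper name encode_topic_name is hypothetical, and the checks reflect Kafka's topic naming limits (letters, digits, '.', '_', '-', at most 249 characters):

```python
import re
from typing import Optional

# Kafka topic names may contain only these characters, up to 249 of them.
_LEGAL_TOPIC = re.compile(r"[a-zA-Z0-9._-]+")
MAX_TOPIC_LENGTH = 249


def encode_topic_name(
    namespace: str, name: str, topic_name: Optional[str] = None
) -> str:
    """Build '<namespace>.<topic>' from a resource's namespace and name.

    'topic_name' stands in for spec.topicName and, when set, takes
    precedence over 'name' (metadata.name).
    """
    encoded = f"{namespace}.{topic_name or name}"
    if len(encoded) > MAX_TOPIC_LENGTH:
        raise ValueError(f"topic name too long ({len(encoded)} characters)")
    if not _LEGAL_TOPIC.fullmatch(encoded):
        raise ValueError(f"illegal characters in topic name: {encoded!r}")
    return encoded
```

For example, encode_topic_name("myproject", "mytopic") gives "myproject.mytopic", so two namespaces can each declare a topic called mytopic without colliding in Kafka.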
I believe an approach similar to https://github.com/strimzi/strimzi-kafka-operator/issues/5895#issuecomment-1422080151 could be applied as a workaround for this case.
We are also interested in this feature, as topics should be created inside the tenant's manifests.
Having read the proposal, I think an overlooked possibility is that a topic may well be "owned" by a certain namespace, but its owner may want to explicitly allow users of other namespaces to read and/or write. That would add some level of complexity, but it is something that I do think is applicable in general.
Example case:
Hi Strimzi team,
This is a proposal to support namespaced Kafka topics as part of a Kubernetes application's lifecycle. I tried to think of a way that was both simple and would not break backwards compatibility with how KafkaTopic resources are created and managed. It's possible that aspects of this proposal could be used for similar application-level concerns for KafkaUser resources as well, but I only discuss topics for the sake of simplicity.
Problem
In a multi-tenant Kubernetes cluster, it is a common practice to define authorization rules at a namespace level. It’s also a common practice for user applications to define all the Kubernetes resources they rely upon during installation. These resources could include Kafka topics. Topics are currently represented with the KafkaTopic CustomResourceDefinition (CRD) with a namespace scope.
The current version of Strimzi only allows KafkaTopic resources to be created in the same namespace that the Topic Operator is deployed to. The Topic Operator is installed into the same namespace that the Kafka resource is created (when topicOperator configuration is provided). When the Kafka resource is created in a target namespace that is watched by the Cluster Operator, then the Cluster Operator will deploy the Kafka and ZooKeeper StatefulSets, the Entity Operator deployment (which contains the Topic Operator as an optional container), and other resources into that target namespace.
The KafkaTopic.metadata.name represents the topic name within Kafka. The Topic Operator will watch for KafkaTopic resource CRUD operations and transform the operation into its corresponding operation against the underlying topic in Kafka. KafkaTopics created in other namespaces are ignored by the Topic Operator. If the Topic Operator watched for KafkaTopics in other namespaces then there is the potential to have two KafkaTopic resources with the same name created in two namespaces. This is possible because the KafkaTopic is a namespace scoped resource and not a cluster resource. Therefore a conflict could arise with the topic operator watching changes in two separate resources (Ex. Each KafkaTopic has different topic-level configuration, or one KafkaTopic is deleted).
Proposal
Create a new CustomResourceDefinition to handle topics that are created in namespaces, called KafkaNamespaceTopic. This CRD will allow the user to create a topic that will be automatically namespaced within Kafka by the Topic Operator. It will encode the actual topic name using the Kubernetes namespace the resource resides, and either the metadata.name or spec.topicName (if defined) of the instance of a KafkaNamespaceTopic.
The KafkaNamespaceTopic will include all configuration that is available in a KafkaTopic.
The Kafka CRD will be extended to include an additional property in the TopicOperatorSpec specification that includes a YAML list value of all the namespaces that are watched for KafkaNamespaceTopic CRD’s, called watchNamespaceTopics.
When a KafkaNamespaceTopic is created in a watched namespace then the Topic Operator will perform the following actions.
Update the KafkaNamespaceTopic with the actual topic name (Ex. myproject.mytopic). A label could be added to the resource with the fully encoded name of the topic, strimzi.io/namespace-topic-name.
ZkTopicWatcher will detect a new topic was created and create the associated KafkaTopic.
When a KafkaNamespaceTopic is updated in a watched namespace then the Topic Operator will reconcile the changes to the topic in Kafka.
When a KafkaNamespaceTopic is deleted in a watched namespace then the Topic Operator will delete the topic in Kafka.
Unlike KafkaTopics, if changes to underlying Kafka topics are made out of band of the KafkaNamespaceTopic CRD instance then the operator will not synchronize those changes to the KafkaNamespaceTopic.
This proposal preserves the existing behaviour of KafkaTopic for backwards compatibility. Documentation updates can be made to recommend that the user use KafkaNamespaceTopic for any topics they want to manage within a Kubernetes application’s lifecycle and use KafkaTopic only for high level administration of topics within Kafka.
RBAC Configuration Management
Cluster Operator Install/Update Time
All RBAC configuration required to watch application namespaces will be generated when the Strimzi Kafka Operator is installed or updated. If using Helm as the means of installation then the Helm Chart would use a new configuration value that contains the superset of all application namespaces that can be watched for KafkaNamespaceTopics, for all Kafka clusters, called watchedTopicNamespaces. This is required so that we do not need to provide the Cluster Operator with privileges to manage RBAC configuration when CRUD operations are performed on Kafka resources that affect the TopicOperatorSpec.watchNamespaceTopics property.
When the Strimzi Helm Chart is installed then all the necessary RBAC configuration (Roles and RoleBindings) will be created for the superset of namespaces to watch for KafkaNamespaceTopics.
It is the administrator’s responsibility to update the Helm Chart’s watchedTopicNamespaces list as application namespaces are added and removed, or to add the appropriate RBAC configuration per application namespace using provided templates.
Ex. Configure Helm Chart values.yaml to create RBAC resources to allow the Topic Operator to watch several different namespaces.
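As a rough illustration of the per-namespace RBAC this implies (not the chart's actual templates), the sketch below builds one Role and one RoleBinding per entry in watchedTopicNamespaces as plain dictionaries. The Role name, the kafkanamespacetopics resource plural, the verbs, and the service account arguments are all assumptions made for the example:

```python
from typing import Any, Dict, Iterable, List

# Hypothetical names; the proposal does not define them.
ROLE_NAME = "strimzi-namespace-topic-watcher"
RESOURCE_PLURAL = "kafkanamespacetopics"


def rbac_for_namespace(
    namespace: str, service_account: str, operator_namespace: str
) -> List[Dict[str, Any]]:
    """Role and RoleBinding letting the Topic Operator manage one namespace."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": ROLE_NAME, "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["kafka.strimzi.io"],
                "resources": [RESOURCE_PLURAL],
                "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": ROLE_NAME, "namespace": namespace},
        # The subject lives in the operator's namespace, not the app namespace.
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": operator_namespace,
            }
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": ROLE_NAME,
        },
    }
    return [role, binding]


def rbac_for_namespaces(
    watched_topic_namespaces: Iterable[str],
    service_account: str,
    operator_namespace: str,
) -> List[Dict[str, Any]]:
    """Flatten the manifests for every namespace in the superset list."""
    manifests: List[Dict[str, Any]] = []
    for ns in watched_topic_namespaces:
        manifests.extend(rbac_for_namespace(ns, service_account, operator_namespace))
    return manifests
```

Generating the manifests up front from a static list is what lets the Cluster Operator avoid needing privileges to create RBAC objects at runtime.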
Kafka Resource Install/Update Time
When installing or updating the Kafka resource, the TopicOperatorSpec.watchNamespaceTopics property could be used to further restrict which namespaces this Cluster’s Topic Operator will watch for KafkaNamespaceTopics.
Ex. Configure Kafka resource’s Topic Operator to only watch Namespace for App A.
If the TopicOperatorSpec.watchNamespaceTopics property is not defined then the superset of all application namespaces will be watched.
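The defaulting rule above can be sketched as follows. The function name is hypothetical, and silently dropping namespaces that are not in the Helm superset is an assumption on my part (no RBAC would exist for them); the proposal does not say how that case should be handled:

```python
from typing import Iterable, List, Optional


def effective_watched_namespaces(
    superset: Iterable[str],
    watch_namespace_topics: Optional[Iterable[str]] = None,
) -> List[str]:
    """Namespaces one cluster's Topic Operator would watch.

    'superset' stands in for the chart-level watchedTopicNamespaces and
    'watch_namespace_topics' for TopicOperatorSpec.watchNamespaceTopics.
    """
    allowed = set(superset)
    if watch_namespace_topics is None:
        # Property not defined: watch every namespace in the superset.
        return sorted(allowed)
    # Property defined: it can only narrow the superset, never widen it.
    return sorted(allowed.intersection(watch_namespace_topics))
```

With a superset of app-a and app-b, passing ["app-a"] restricts the operator to App A's namespace, while omitting the property watches both.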