strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

I'm looking for a documented example of the sequence of YAMLs that should be installed when bootstrapping Strimzi. #7051

Closed · davesargrad closed this issue 2 years ago

davesargrad commented 2 years ago

I've spent the morning trying to determine the best means to move from deploying strimzi in development to deploying strimzi in production.

Our goal is to deploy using installation artifacts.

There is a lot of great documentation on the referenced page, and a ton of discussion about "options" that must be selected, but so many options create confusion when there is no cohesive description that says: "let's say you choose these options; then the installation looks exactly like this." In other words, there are common deployment scenarios that could be documented to help people figure out the impact of a given architectural approach.

In development I used the "latest" Strimzi YAML (which has everything in one file) and, voilà, in minutes I had a Strimzi operator up and running in a single namespace. That operator watched for Kafka resources in the same namespace. Minutes later I had a fully operational Kafka cluster. Yay!
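For reference, the one-file flow I used looked roughly like this (the kafka namespace and the persistent-single example are just illustrations, adjust to taste):

```sh
# One file installs everything: CRDs, RBAC, ServiceAccount, and the operator Deployment.
# The ?namespace= query rewrites the namespaced pieces for you.
kubectl create namespace kafka
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka

# A Kafka custom resource in the same namespace then brings up the cluster.
kubectl apply -f https://strimzi.io/examples/latest/kafka/kafka-persistent-single.yaml -n kafka
```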

In my production environment I want a Strimzi operator in the kafka-operator namespace, and I want two Kafka clusters: one in the devel namespace and the other in the oper namespace.

So I'm trying to figure out exactly the sequence of YAMLs I must work with: to get the CRDs installed, the operator installed, the role and service account pieces installed, and finally each Kafka cluster installed.

Unfortunately, though many of the (YAML) pieces of the Strimzi orchestration configuration are well documented, I don't see many, if any, "whole-scenario" orchestration examples.

So I'm looking for such "named scenarios" that are documented from the top down with all the bits and pieces that are required.

For example, I might guess at a sequence like the sketch below:
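(Purely a hypothetical ordering with placeholder paths; the file groupings are my guess, which is exactly what I'm asking about.)

```sh
# Hypothetical sequence -- placeholder paths, not real file names
kubectl apply -f <crd-yamls>                       # 1. the CRDs
kubectl apply -f <serviceaccount-and-rbac-yamls>   # 2. ServiceAccount, ClusterRoles, bindings
kubectl apply -f <operator-deployment-yaml>        # 3. the Cluster Operator Deployment
kubectl apply -f <kafka-cr-yamls>                  # 4. a Kafka custom resource per namespace
```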

When I look at the documented decision process of watching one namespace, multiple namespaces, or all namespaces, I see pieces of the overall process that break out the operator Deployment (since it must now watch a list of namespaces):

[screenshot of the docs showing the operator Deployment edit]

and the RoleBindings that must get instantiated within each namespace:

[screenshot of the docs showing the per-namespace RoleBindings]

But then I don't see how to install the other pieces (e.g. the CRDs).

In the following area I see many YAMLs that are clearly examples meant for particular deployment configurations, but how do I figure out which of them are needed for my scenario? https://github.com/strimzi/strimzi-kafka-operator/tree/main/install/cluster-operator

I'm sure I'm simply missing some key page or pages that provide this kind of "whole scenario installation example" description.

Without this, it seems very difficult to figure out which YAMLs are needed for a given deployment.

Any guidance here would be hugely appreciated. My colleague and I have read, experimented, and discussed this challenge for the better part of a day; we are looking for the missing documentation glue we feel we need.

scholzj commented 2 years ago

This might be more of a topic for a discussion about how to improve things rather than an issue.

I think the path the docs currently expect (see https://strimzi.io/docs/operators/latest/full/deploying.html#cluster-operator-str) is basically: edit the namespaces in the installation files, apply everything in install/cluster-operator, create the RoleBindings in each watched namespace, and then create your Kafka custom resources.

(See for example https://strimzi.io/docs/operators/latest/full/deploying.html#deploying-cluster-operator-to-watch-multiple-namespaces-str which is similar to your use-case)
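Roughly, for your layout (operator in kafka-operator, clusters in devel and oper), that documented flow looks like the following sketch. The file names are from the 0.29.0 release artifacts, so double-check them against your version:

```sh
# 1. Point the RoleBinding subjects at the namespace where the operator will run
sed -i 's/namespace: .*/namespace: kafka-operator/' install/cluster-operator/*RoleBinding*.yaml

# 2. In install/cluster-operator/060-Deployment-strimzi-cluster-operator.yaml, set the
#    STRIMZI_NAMESPACE environment variable to the watched namespaces: devel,oper

# 3. Install everything (CRDs, RBAC, ServiceAccount, Deployment) into the operator namespace
kubectl create -f install/cluster-operator -n kafka-operator

# 4. Grant the operator access in each watched namespace
for ns in devel oper; do
  kubectl create -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n "$ns"
  kubectl create -f install/cluster-operator/023-RoleBinding-strimzi-cluster-operator.yaml -n "$ns"
  kubectl create -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n "$ns"
done

# 5. Create a Kafka custom resource in each of devel and oper
```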

Maybe there is a better way to do it - or something more convenient for most users. We are definitely open to discussion or suggestions.

Not every user does it this way. I know some, for example, use GitOps and commit the YAMLs into their own Git repositories, etc. I think that, in general, a lot of users adapt it to whatever tooling and install process they usually use.

> But then I don't see how to install the other pieces (e.g. the CRDs).

The CRDs are part of the installation ZIP. They are also part of the strimzi-cluster-operator-0.29.0.yaml file. The strimzi-crds-0.29.0.yaml file is there mostly for convenience, when you want only the CRDs (e.g. for some testing) or when you want to install them first (some users do that). But in the docs, the kubectl create -f install/cluster-operator ... command installs the CRDs as well.
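A quick way to see this, assuming the unpacked release layout (file names vary slightly between releases):

```sh
# The CRDs sit alongside the RBAC and Deployment files in the install directory
ls install/cluster-operator/04*-Crd-*.yaml

# After `kubectl create -f install/cluster-operator`, verify they are registered
kubectl get crd | grep strimzi.io
```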

davesargrad commented 2 years ago

@scholzj

I see... so you are saying I would install everything in install/cluster-operator? If so, I'm simply good with that. I didn't actually see that stated in the documentation.

Also, what order would I install those YAMLs in? Is that documented, top to bottom? If order doesn't matter... yay!

I did see the example you linked, but it only focuses on two or three files inside install/cluster-operator. So I was left with the question: "What about all the other files, and what order do I install them in?"

I saw/see this as nothing more than a documentation issue.

scholzj commented 2 years ago

> Also, what order would I install those YAMLs in? Is that documented, top to bottom?

You should basically just be able to follow the order as listed here: https://strimzi.io/docs/operators/latest/full/deploying.html#deploying-cluster-operator-to-watch-multiple-namespaces-str

Kubernetes is quite tolerant. For example, you can install a ClusterRole that grants access to the Kafka custom resource before installing the CRD for that resource, or a RoleBinding for a ClusterRole that does not exist yet. So basically it is all eventually consistent. The worst thing that can normally happen is that, for example, the operator pod crashloops until you install the RoleBindings in the watched namespaces, but it should recover once you install them.
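For example (a sketch, not an exact transcript):

```sh
# Accepted even though the ClusterRole/CRD it references may not exist yet
kubectl create -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n devel

# The operator pod may crashloop while the watched-namespace RoleBindings are missing...
kubectl get pods -n kafka-operator -w
# ...but it settles into Running once they are created, without a manual restart.
```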

> I did see the example you linked, but it only focuses on two or three files inside install/cluster-operator. So I was left with the question: "What about all the other files, and what order do I install them in?"

OK, so that sounds like something we can improve by adding some explanation of what it actually installs. @PaulRMellor, is this something you can have a look at?

davesargrad commented 2 years ago

OK... @scholzj, you da man. You've steered me straight; my colleague and I will give that a shot.

ralphflat commented 2 years ago

@scholzj I am working with @davesargrad, and we are implementing the process you have now taught us (install everything in the folder install/cluster-operator and rely on eventual consistency). We are still not clear whether we need to deal with the five injected namespaces we saw when looking at "latest". The myproject namespace is still present in the install/cluster-operator files (and the need to sed-replace myproject is not mentioned in the documentation):

[screenshot of install/cluster-operator files still containing namespace: myproject]

scholzj commented 2 years ago

@ralphflat I think that is tied to the other issue, #7048? As I said, the docs basically operate on the unpacked ZIP files from the GitHub release page. There, the namespaces should only appear inside the RoleBindings, not in any metadata.
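In other words, with the unpacked release the only namespace you should need to touch is the one inside the *RoleBinding* files, along the lines of this sketch (kafka-operator stands in for wherever your operator actually runs):

```sh
# Rewrite the RoleBinding subjects from the packaged default (myproject)
# to the namespace where the Cluster Operator runs
sed -i 's/namespace: myproject/namespace: kafka-operator/' install/cluster-operator/*RoleBinding*.yaml
```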

PaulRMellor commented 2 years ago

@davesargrad Thanks for this very useful feedback. I'll look into how we might change the docs to make the install process clearer in this regard.

maxisam commented 2 years ago

I have the same question as well. But I think having multiple operators makes more sense to me, so I can update the operator for dev first and make sure everything is working. I tried to set STRIMZI_RBAC_SCOPE via extraEnvs in the Helm chart, but it doesn't seem to work.

UPDATE:

It seems that multiple operators are not supported: https://github.com/strimzi/strimzi-kafka-operator/discussions/4389

scholzj commented 2 years ago

@maxisam You have to keep in mind that the CRDs exist only once in your Kubernetes cluster. So deploying multiple operators is non-trivial because they will never be completely independent. In general, you can do that by installing the ClusterRoles and CRDs from the latest version you want to use. But you would need to be very careful since for example deleting one installation might result in deleting all your Kafka clusters etc. This is a Kubernetes limitation which is hard to work around - because of the risks etc., I do not think we will document this in any way.