kubernetes-sigs / kubebuilder

Kubebuilder - SDK for building Kubernetes APIs using CRDs
http://book.kubebuilder.io
Apache License 2.0
7.77k stars 1.43k forks source link

On the use of the Index Field #1879

Closed champak closed 3 years ago

champak commented 3 years ago

Hi Folks, I wanted to validate my understanding of the Indexing that we recommend in SetupWithManager(). This is a useful optimization but we could choose to do without it if needed, right ? I want to take that approach for a scenario where I have a pair of custom resources in the same manager both of which monitor similar resources and hence I could get an index conflict if I were to use an index (that would be off the same keytype since monitored resources are same) with both. I do not care about the performance of one of them and expect to be able to do without the indexer. Thanks for any input.

camilamacedo86 commented 3 years ago

Hi @champak,

I understand that you are speaking over the example in https://github.com/kubernetes-sigs/kubebuilder/blob/master/docs/book/src/cronjob-tutorial/testdata/project/controllers/cronjob_controller.go#L547-L563. Am I right?

Unfortunately, your question is not clear enough for me. WDYT about to provide the following info?

#### What did you do?

<!-- A clear and concise description of the steps you took (or insert a code snippet). -->

#### What did you expect to see?

<!-- A clear and concise description of what you expected to happen (or insert a code snippet). -->

#### What did you see instead? Under which circumstances?

<!-- A clear and concise description of what you expected to happen (or insert a code snippet). -->

It might help us understand better what you re trying to do and achieve.

varshaprasad96 commented 3 years ago

/assign

champak commented 3 years ago

@camilamacedo86 Thanks for taking a look. I have tried to add details below:

What did you do?

Like you mentioned this is the code I was looking at and have used similar constructs https://github.com/kubernetes-sigs/kubebuilder/blob/master/docs/book/src/cronjob-tutorial/testdata/project/controllers/cronjob_controller.go#L547-L563

I was interested in a scenario where I have more than one controller in the same project that monitors common objects (like batch in the example). In that case I had found that attempting to use the same indexer in both controllers led to a conflict (because the manager object would call GetFieldIndexer on the same object from two places) and decided to not use the indexer in one of the controllers.

What did you expect to see?

Things seemed to work OK without the indexer and from looking at the code it seems that the indexing is a (nice) optimization. I wanted to confirm that and make sure that having the indexer is not a requirement for correct behavior. If there is a code or discussion pointer on the indexer behavior please add to the thread.

Thanks for maintaining this awesome project !

What did you see instead? Under which circumstances?

camilamacedo86 commented 3 years ago

Hi @champak,

I was interested in a scenario where I have more than one controller in the same project that monitors common objects (like batch in the example)

It is recommended develop idempotent solutions whcih means that each controller in the project would be responsible for synchronizing the desired state as represented by their Specs and the actual state of the system. In this way, it works like a loop, and it does not stop until all conditionals match its implementation.

In general, it's recommended to have one controller responsible for manage each API(CRD) or an external type created on the project in order to not go against the design goals set by controller-runtime.

However, you could have more that one controller responsible for synchronizing an API. But then, ideally, each controller would need to have predicates to know what resources it would be responsible for. E.g : controller A will be responsible for to reconcile cronJobs that has a label X when controller B will only reconcile cronJobs with label Y. I hope the doc https://sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering/ help you with it.

For you getting a better idea, let’s think about the classic scenario where the goal is to have an application and its database running on the platform with Kubernetes. Then, one object could represent the App, and another one could represent the DB. By having one CRD to describe the App and another one for the DB, we will not be hurting concepts such as encapsulation, the single responsibility principle, and cohesion. Damaging these concepts could cause unexpected side effects, such as difficulty in extending, reuse, or maintenance, just to mention a few.

In conclusion, the App CRD will have its controller which would be responsible for things like creating Deployments that contain the App and creating Services to access it. Similarly, we would create a CRD to represent the DB, and deploy a controller that would manage DB instances.

Consequently, by using the tool, we can create our APIs and objects that will represent our solutions on these platforms. Dee that the projects can have as many Kinds as needed (1…N). Basically, the CRDs are a definition of our customized Objects, and the CRs are an instance of it.

Things seemed to work OK without the indexer and from looking at the code it seems that the indexing is a (nice) optimization. I wanted to confirm that and make sure that having the indexer is not a requirement for correct behavior.

The indexing in the example is not required for all and any scenario. It is only an example to show an option to optimize when it is required to looking for a big list of resources on the cluster side.

So, I am closing this one since shows that we could cover your questions in the above description. However, please feel free to raise new issues as you need to.

champak commented 3 years ago

@camilamacedo86 Thanks much for your detailed response and the event filtering pointer ! The event filtering capability looks quite generic and could be possibly used for filtering on things like namespaces. I shall have to try it out.