redhat-developer / service-binding-operator

[Deprecated] The Service Binding Operator: Connecting Applications with Services, in Kubernetes
https://redhat-developer.github.io/service-binding-operator
Apache License 2.0
109 stars 91 forks source link

Watch only for changes on resources participating in a service binding #846

Closed pedjak closed 1 year ago

pedjak commented 3 years ago

As the investigation in #605 shown, the operator monitors a huge number of resources in all namespaces:

Such strategy puts a very high pressure on the operator's event queue on clusters with non-trivial workload, increasing the operator memory consumption, and eventually crashing it.

In order to reduce resource usage and behave reliably, we should start monitoring resources only if they are referred through a service binding:

This strategy is going to reduce significantly the number of events landing in operator's queue and open up possibility to scale better in large clusters containing a lot of namespaces.

Avni-Sharma commented 3 years ago

I think we should also check the memory leak metrics on a namespace based installation and a cluster wide operator installation. To assess how broad/large the impact is.

wtrocki commented 3 years ago

So feature wise SBOperator should:

  1. Watch only for SBO objects (operate on existing state)
  2. when it is created validate if all provided info is right and resources exist. Provide error if problems as part of the status
  3. Mount resources to the app container according to the spec
  4. new flag should be introduced for SBO to watch secrets and underlying CRs for changes so config can be adjusted for those.. but only if user wants that. I see SBO as tool that does the job when asked (SB object created) and optionally can also watch and mount again changes in service CR or secret - but that is optional - I would not want some operator to randomly redeploy my app every time my CR change

CSV watching should be removed IMHO - it is core of the problem.

wtrocki commented 3 years ago

Also this approach will allow sbo to exist on clusters without putting any load on them. Openshift dedicated on start has 962 objects what operator will watch. When this is applied operator will only watch for SB objects and become invisible to admins when not used

wtrocki commented 3 years ago

With little bit of research I think that moving to that model will not cause any API changes - only behaviour under some corner cases. It is good to adapt this model before 1.0 as later it would be hard to do so without braking people's expectations.

This will:

example simplifications

Currently operator looks for all CSV on cluster to get spec (even when user do not want to use it). We can make change to fetch spec (metadata) on reconcile loop only when binding was created.

We can stop reconciling binding that has errors continuously by flagging them

We could listen to kubernetes events on the secrets rather than continuous reconciliations

CC @gorkem

sbose78 commented 3 years ago

To improve the performance:

What can be read: The above list doesn't necessarily impact what can be 'read'. The controller may still be allowed to read a wide variety of types - that's OK!

In addition, we must introduce a way for an admin to configure what can be watched.

pedjak commented 3 years ago

What is the usecase we would like to cover by watching anything besides Service Bindings? As @wtrocki stated above, it is very surprising for app developer that binding between application and service changes on its own. Such systems are hard to reason about.

alexeykazakov commented 3 years ago

So, you are going to stop watching CSVs, right? Our Sandbox clusters have thousands of them.

pmacik commented 1 year ago

We are going to implement watches for resources bound via label selectors. The work is tracked here https://issues.redhat.com/browse/APPSVC-1112