operator-framework / java-operator-sdk

Java SDK for building Kubernetes Operators
https://javaoperatorsdk.io/
Apache License 2.0

TimerEventSource and InformerEventSource use separate thread pools respectively to improve reconcile speed #2293

Closed: ysymi closed this issue 7 months ago

ysymi commented 7 months ago

This is not a bug report; it is a discussion about performance optimization.

What did you do?

I am a user of the Flink Kubernetes Operator, and I found a potential performance issue when there is a huge number of resources. In detail: the Flink Kubernetes Operator uses UpdateControl to update each custom resource's status at a 1-minute interval. When the number of custom resources reaches 10000, the worker threads and the task queue of the reconcile thread pool fill up with these periodic UpdateControl tasks (from the TimerEventSource).
At that point, when we create or delete a CR, the operator cannot handle the event in a timely manner; it may take several tens of seconds before the event is processed.
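For context, the pattern described above looks roughly like this in JOSDK 4.x. This is a minimal sketch, not the Flink Operator's actual code; `FlinkDeployment` stands in for the custom resource, and the status-observation logic is elided:

```java
import java.time.Duration;

import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.ControllerConfiguration;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

@ControllerConfiguration
public class JobReconciler implements Reconciler<FlinkDeployment> {

  @Override
  public UpdateControl<FlinkDeployment> reconcile(
      FlinkDeployment resource, Context<FlinkDeployment> context) {
    // Observe the running job and mirror its state into the CR status (elided).
    // Rescheduling re-enqueues the CR through the TimerEventSource; with ~10000
    // CRs these periodic tasks alone can fill the shared reconcile executor,
    // delaying genuine create/update/delete events from the InformerEventSource.
    return UpdateControl.patchStatus(resource)
        .rescheduleAfter(Duration.ofMinutes(1));
  }
}
```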

What did you expect to see?

The TimerEventSource and the InformerEventSource should be independent of each other, so that the operator can handle a CR's create, update, and delete events promptly.

What did you see instead? Under which circumstances?

A newly created CR takes a long time to be reconciled when there are many timer tasks scheduled via UpdateControl.

Environment

Kubernetes cluster type:

$ Mention java-operator-sdk version from pom.xml file

4.4.4

$ java -version

openjdk11

Possible Solution

The TimerEventSource and the InformerEventSource should use separate thread pools, and each pool's size should be configurable independently.
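To make the proposal concrete, the configuration surface might look like the sketch below. Note that `withTimerEventSourceThreads` is hypothetical and does not exist in JOSDK; only `withConcurrentReconciliationThreads` is a real knob in 4.x:

```java
import io.javaoperatorsdk.operator.api.config.ConfigurationServiceProvider;

// Sketch of the proposed configuration, not a working API.
ConfigurationServiceProvider.overrideCurrent(o -> o
    .withConcurrentReconciliationThreads(20)  // existing knob: reconciliation executor
    .withTimerEventSourceThreads(5));         // HYPOTHETICAL: dedicated TimerEventSource pool
```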

csviri commented 7 months ago

Hi @ysymi, do I understand correctly that the problem is that, when there is a huge number of custom resources with timed reconciliation and a new custom resource is created or updated, there will be a significant delay until it is processed?

Note that ideally there should not be a timed reschedule at all; reconciliation should always be triggered by event sources. I also see other optimization possibilities, like gradually increasing this re-schedule interval (if the use case allows that).
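A minimal sketch of the gradual re-schedule idea, assuming a hypothetical status field (`observeIntervalSeconds`) that carries the current interval between reconciliations; all names here are illustrative:

```java
import java.time.Duration;

import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

public class BackoffJobReconciler implements Reconciler<FlinkDeployment> {

  private static final Duration MIN_INTERVAL = Duration.ofMinutes(1);
  private static final Duration MAX_INTERVAL = Duration.ofMinutes(15);

  @Override
  public UpdateControl<FlinkDeployment> reconcile(
      FlinkDeployment resource, Context<FlinkDeployment> context) {
    boolean stateChanged = observeAndUpdateStatus(resource);
    Duration previous =
        Duration.ofSeconds(resource.getStatus().getObserveIntervalSeconds());
    // Reset the interval when the job state changed; otherwise double it up to
    // a cap, so stable jobs generate far fewer timer events.
    Duration next = (stateChanged || previous.isZero())
        ? MIN_INTERVAL
        : min(previous.multipliedBy(2), MAX_INTERVAL);
    resource.getStatus().setObserveIntervalSeconds(next.toSeconds());
    return UpdateControl.patchStatus(resource).rescheduleAfter(next);
  }

  private static Duration min(Duration a, Duration b) {
    return a.compareTo(b) <= 0 ? a : b;
  }

  private boolean observeAndUpdateStatus(FlinkDeployment resource) {
    // Fetch the job state (e.g. over HTTP), mirror it into the status, and
    // report whether anything changed. Elided here.
    return false;
  }
}
```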

Introducing "priority events" or a similar approach would potentially add huge complexity.

cc @gyfora

ysymi commented 7 months ago

Yes, @csviri, your understanding is correct, but I want to emphasize that I don't care about the delay of timed reconciliation; I care about the delay in handling events from Kubernetes, since the latter affects the submission speed of jobs. So I want to use some priority control or separate thread pools to avoid mutual interference between the two types of event handling. I just want to bring up this issue to see if there are any reasonable solutions. If "priority events" is too complex, how about separate thread pools?

cc @gyfora

csviri commented 7 months ago

I see. So there are a couple of options without resorting to separate thread pools, priority events, or other such solutions.

Again, timed reconciliation is usually an issue in itself. It is in the framework because in some simple cases it might be easy to start with, but in general it should be avoided. Could you describe what purpose it serves in the Flink Operator? Could it be replaced by an event source? Consider also the polling event sources, which do local state checks and trigger reconciliation only when it is needed. In general this part can usually be optimized very nicely, so the actual problem with the delay would disappear.
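For illustration, a sketch of the polling approach using JOSDK's PerResourcePollingEventSource; constructor signatures vary somewhat across 4.x releases, so treat the exact arguments as an assumption. `FlinkDeployment` and `JobState` are illustrative stand-ins, and `JobState` is assumed to be a POJO with a value-based equals():

```java
import java.util.Map;
import java.util.Set;

import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.EventSourceContext;
import io.javaoperatorsdk.operator.api.reconciler.EventSourceInitializer;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;
import io.javaoperatorsdk.operator.processing.event.source.EventSource;
import io.javaoperatorsdk.operator.processing.event.source.polling.PerResourcePollingEventSource;

public class PollingJobReconciler
    implements Reconciler<FlinkDeployment>, EventSourceInitializer<FlinkDeployment> {

  private static final long POLL_PERIOD_MS = 60_000;

  @Override
  public Map<String, EventSource> prepareEventSources(
      EventSourceContext<FlinkDeployment> context) {
    // The event source polls the external job state on its own schedule and
    // caches the result; reconciliation is triggered only when the freshly
    // fetched state differs from the cached one, instead of unconditionally
    // rescheduling every CR through the TimerEventSource.
    var jobStateSource = new PerResourcePollingEventSource<>(
        primary -> Set.of(fetchJobState(primary)), // ResourceFetcher, HTTP call elided
        context.getPrimaryCache(),
        POLL_PERIOD_MS,
        JobState.class);
    return EventSourceInitializer.nameEventSources(jobStateSource);
  }

  @Override
  public UpdateControl<FlinkDeployment> reconcile(
      FlinkDeployment resource, Context<FlinkDeployment> context) {
    // React to the changed job state here; no unconditional rescheduleAfter.
    return UpdateControl.noUpdate();
  }

  private JobState fetchJobState(FlinkDeployment resource) {
    return new JobState(); // placeholder for the real lookup
  }
}
```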

ysymi commented 7 months ago

When bringing up this issue, I thought of two solutions: one is to separate the thread pools, and the other is to introduce a priority queue. But as you said, a priority queue is more complex, so I think separating the thread pools may be more reasonable. Of course, if there were not so many UpdateControl calls, this problem would not exist in the first place.

As for the Flink Operator: first, a Flink stream-processing job is a long-running workload. After a Flink job is submitted to a Kubernetes cluster (the operator handles this submission, of course), the operator needs to check the Flink job's state so that, if the job crashes, the operator can re-submit it to recover. However, most of the time the operator is executing the logic of periodically checking the state and triggering the corresponding maintenance operations.

This is my personal understanding. If there are any inaccuracies or missing details, @gyfora, please add to it.

csviri commented 7 months ago

@ysymi I discussed this briefly with @gyfora as well. To my understanding, some parts could/might be optimized in a future Flink Operator (using the polling event sources for some state caching, for example).

But regarding the performance: reconciliation in Flink can take a bit longer because of the HTTP requests involved, as opposed to the ideal case where everything is in memory. Wouldn't simply increasing the thread pool size of the executor solve this problem?
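For reference, a sketch of that suggestion against the 4.4-era API, where the executor size is overridden via ConfigurationServiceProvider (newer releases pass the overrider to the Operator constructor instead); the pool size of 50 is just an illustrative value:

```java
import io.javaoperatorsdk.operator.Operator;
import io.javaoperatorsdk.operator.api.config.ConfigurationServiceProvider;

public class OperatorMain {
  public static void main(String[] args) {
    // Enlarge the shared reconciliation pool so slow, HTTP-bound reconciliations
    // leave headroom for newly arriving informer events.
    ConfigurationServiceProvider.overrideCurrent(
        o -> o.withConcurrentReconciliationThreads(50));

    Operator operator = new Operator();
    // operator.register(new JobReconciler()); // register reconcilers here
    operator.start();
  }
}
```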

> When bringing up this issue, I thought of two solutions: one is to separate the thread pools, and the other is to introduce a priority queue. But as you said, a priority queue is more complex, so I think separating the thread pools may be more reasonable. Of course, if there were not so many UpdateControl calls, this problem would not exist in the first place.

Both of these are just ideas, meant as a kind of brainstorming. Both of them would complicate the code. Note that there is no such concept currently even in the Go counterpart (controller-runtime).

ysymi commented 7 months ago

OK, thanks for discussing this problem, @csviri. I am going to do some Flink Operator optimization first.

csviri commented 7 months ago

thx!

I will close this issue now; feel free to reopen it or create a new one when it comes to that point.