operator-framework / java-operator-sdk

Java SDK for building Kubernetes Operators
https://javaoperatorsdk.io/
Apache License 2.0

Generic Controller to Detect if Pod was Evicted by Node Upgrade #2534

Open csviri opened 1 month ago

csviri commented 1 month ago

During a node upgrade, pods get drained from the node. For long-running applications where frequent restarts are undesirable, it would be useful to know why a pod was evicted, in particular whether the eviction was caused by a node upgrade.

This could be solved with a generic controller that watches pods and nodes and, when a pod is evicted, checks whether its node is being drained and notifies a listener interface about the pod.
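A minimal sketch of the decision logic such a controller could run, in plain Java with no Kubernetes client. All type and method names here (`PodSnapshot`, `NodeSnapshot`, `PodEvictionListener`, `EvictionDetector`) are hypothetical and not part of the SDK. It also assumes that a cordoned node (unschedulable) counts as "being drained", which is only a heuristic.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified views of cluster state. A real controller would
// derive these from Pod and Node resources held in informer caches.
record PodSnapshot(String name, String nodeName, boolean evicted) {}

record NodeSnapshot(String name, boolean unschedulable) {}

enum EvictionCause { NODE_DRAIN, OTHER }

// Hypothetical extension point: implementations decide how to notify
// (Kubernetes Event, Kafka message, log line, ...).
interface PodEvictionListener {
    void onPodEvicted(PodSnapshot pod, EvictionCause cause);
}

final class EvictionDetector {
    private final PodEvictionListener listener;

    EvictionDetector(PodEvictionListener listener) {
        this.listener = listener;
    }

    // Called for each observed pod; nodesByName is the current node cache.
    // Assumption: unschedulable == true is treated as "being drained".
    // A node can be cordoned without a drain, so this may over-report.
    Optional<EvictionCause> handle(PodSnapshot pod, Map<String, NodeSnapshot> nodesByName) {
        if (!pod.evicted()) {
            return Optional.empty(); // nothing to report
        }
        NodeSnapshot node = nodesByName.get(pod.nodeName());
        EvictionCause cause = (node != null && node.unschedulable())
                ? EvictionCause.NODE_DRAIN
                : EvictionCause.OTHER;
        listener.onPodEvicted(pod, cause);
        return Optional.of(cause);
    }
}
```

The listener is passed in through the constructor, so the detection logic stays independent of how the notification is delivered.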

metacosm commented 1 month ago

Perhaps this should be a separate project, though?

csviri commented 1 month ago

Maybe just a separate module within this project. Since it's called an SDK, at least in my mind, tools/libs for common subproblems fit. What do you think?

metacosm commented 1 month ago

I get the point but adding "random" utilities to the SDK project dilutes the SDK itself, in my opinion, though we could make it an example operator that would also be actually useful…

csviri commented 1 month ago

The thing is that the notification system might vary based on how the platform handles such events in a specific company: some might use Kubernetes events, others Kafka messages, to deliver these notifications.

metacosm commented 1 month ago

Then it makes even less sense to be part of the SDK if we cannot have a solution that works generically. Or am I missing something?

csviri commented 1 month ago

Usually it works like this: companies have internal forks and internal builds of such open source projects (at least in my experience from multiple companies), where these extension points are used to fulfill internal requirements. See for example the resource listener in the Flink Operator: https://github.com/apache/flink-kubernetes-operator/blob/d946f3f9f3a7f12098cd82db2545de7c89e220ff/flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/listener/FlinkResourceListener.java#L36

The open source project actually does not provide any implementation (only for tests), but anyone in their internal fork can provide one.
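To illustrate the idea of an extension point that ships without an implementation, here is a rough sketch using the JDK's `ServiceLoader`. This is one common Java approach, not necessarily how the Flink Operator loads its listeners. `EvictionNotificationListener` and `ListenerRegistry` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical extension point that the open source module would ship.
interface EvictionNotificationListener {
    void onNodeDrainEviction(String podName, String nodeName);
}

final class ListenerRegistry {
    private final List<EvictionNotificationListener> listeners = new ArrayList<>();

    // Discovers implementations declared under META-INF/services on the classpath.
    // With no provider present (the upstream build), the registry stays empty.
    static ListenerRegistry fromClasspath() {
        ListenerRegistry registry = new ListenerRegistry();
        ServiceLoader.load(EvictionNotificationListener.class).forEach(registry::register);
        return registry;
    }

    void register(EvictionNotificationListener listener) {
        listeners.add(listener);
    }

    int size() {
        return listeners.size();
    }

    // Fans the notification out to every registered listener; a no-op when empty.
    void publish(String podName, String nodeName) {
        for (EvictionNotificationListener listener : listeners) {
            listener.onNodeDrainEviction(podName, nodeName);
        }
    }
}
```

An internal fork would add its own implementation (for example one that emits a Kubernetes event or a Kafka message) plus the matching `META-INF/services` entry, without touching the upstream code.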