We are running a single-instance Kafka cluster in a dev environment on spot instances, and observed the following behavior:
Drain Cleaner keeps annotating the Kafka pod for a manual restart, the Strimzi operator restarts the pod, and this cycle repeats indefinitely. The node that Kafka is running on eventually did not have any cordon annotations, and the Kafka pod keeps being rescheduled onto the same node.
drain cleaner log:
`
2024-04-21 13:50:26,425 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:46,458 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:46,459 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data should be annotated for restart
2024-04-21 13:50:46,591 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data was patched
2024-04-21 13:50:46,591 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:56,608 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:56,608 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:50:56,609 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:06,621 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:06,621 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:51:06,621 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:16,633 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:16,633 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:51:16,633 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:26,650 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:26,650 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:51:26,650 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:36,664 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:36,664 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:51:36,664 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:51:46,676 INFO [io.str.ValidatingWebhook] (executor-thread-5131) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
`
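To make the pattern in the log above easier to see, here is a minimal sketch that tallies the Drain Cleaner messages and measures the gap between eviction requests. The function names (`classify`, `summarize`) and the event labels are made up for illustration; the regular expression only assumes the line format shown in the log above. The sample is a short excerpt of that same log.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the Drain Cleaner line format seen in the log above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO\s+"
    r"\[io\.str\.ValidatingWebhook\] \([^)]*\) (?P<msg>.*)$"
)

# Excerpt copied from the Drain Cleaner log above.
SAMPLE = """\
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data is already annotated for restart
2024-04-21 13:50:36,442 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:46,458 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Received eviction webhook for Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
2024-04-21 13:50:46,459 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data should be annotated for restart
2024-04-21 13:50:46,591 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data was patched
2024-04-21 13:50:46,591 INFO [io.str.ValidatingWebhook] (executor-thread-5130) Denying request for eviction of Pod kafka-data-kafka-0 in namespace nightly-backend-kafka-data
"""


def classify(msg):
    """Map a log message to a short, illustrative event label."""
    if msg.startswith("Received eviction webhook"):
        return "received"
    if msg.startswith("Denying request for eviction"):
        return "denied"
    if "is already annotated for restart" in msg:
        return "already_annotated"
    if "should be annotated for restart" in msg:
        return "needs_annotation"
    if msg.endswith("was patched"):
        return "patched"
    return "other"


def summarize(log_text):
    """Return (event counts, seconds between consecutive eviction requests)."""
    counts = Counter()
    received = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a Drain Cleaner log line
        kind = classify(match.group("msg"))
        counts[kind] += 1
        if kind == "received":
            received.append(
                datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            )
    gaps = [int((b - a).total_seconds()) for a, b in zip(received, received[1:])]
    return counts, gaps


if __name__ == "__main__":
    counts, gaps = summarize(SAMPLE)
    print(dict(counts))
    print("seconds between eviction requests:", gaps)
```

On the excerpt, every eviction request is denied and the requests arrive about 10 seconds apart. The "should be annotated for restart" entry shows the pod being annotated again partway through, which would be consistent with the pod having been rolled and having lost the earlier annotation, although the log alone does not prove that.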
strimzi operator:
2024-04-21 13:56:28 INFO AbstractOperator:517 - Reconciliation #52(timer) KafkaConnect(nightly-backend-kafka-data/connect): reconciled
2024-04-21 13:56:30 INFO KafkaRoller:496 - Reconciliation #49(timer) Kafka(nightly-backend-kafka-data/kafka-data): Pod {kafka-data-kafka-0} needs to be restarted. Reason: [manual rolling update annotation on a pod]
2024-04-21 13:56:31 INFO AbstractOperator:517 - Reconciliation #51(timer) Kafka(nightly-backend-kafka/nightly-backend-kafka): reconciled
2024-04-21 13:56:31 INFO AbstractOperator:517 - Reconciliation #50(timer) Kafka(stress-backend-kafka/backend-kafka): reconciled
2024-04-21 13:56:32 INFO PodOperator:54 - Reconciliation #49(timer) Kafka(nightly-backend-kafka-data/kafka-data): Rolling pod kafka-data-kafka-0
karpenter log:
2024-04-21T13:59:16.602Z ERROR controller.eviction evicting pod, admission webhook "strimzi-drain-cleaner.strimzi.io" denied the request: The pod will be rolled by the Strimzi Cluster Operator {"commit": "34d50bf-dirty", "pod": "nightly-backend-kafka-data/kafka-data-kafka-0"}
Versions: EKS 1.27, Drain Cleaner 1.1.0, Karpenter 0.29.2, Strimzi operator 0.29, Kafka 3.2.0