dragonflydb / dragonfly-operator

A Kubernetes operator to install and manage Dragonfly instances.
https://www.dragonflydb.io/docs/managing-dragonfly/operator/installation
Apache License 2.0

"Getting started" example leads to underreplicated cluster #235

Closed michael-ylb closed 1 month ago

michael-ylb commented 1 month ago

Describe the bug

When the CPU requests are set above the CPU limits through the operator, the operator terminates the replicas and ends up in an unstable state.

To Reproduce

Follow these steps (I left out "Pass custom Dragonfly arguments"): https://www.dragonflydb.io/docs/getting-started/kubernetes-operator

The last step will be:

    kubectl patch dragonfly dragonfly-sample --type merge -p '{"spec":{"resources":{"requests":{"cpu":"2"}}}}'

The CPU requests for the "dragonfly" resource will then be above the CPU limits (600m). The operator applies these settings to the StatefulSet. The StatefulSet terminates the replicas but is not able to create new pods:

    create Pod dragonfly-sample-1 in StatefulSet dragonfly-sample failed error: Pod "dragonfly-sample-1" is invalid: spec.containers[0].resources.requests: Invalid value: "2": must be less than or equal to cpu limit
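To make the failing comparison concrete, here is a minimal sketch of the "request must be less than or equal to limit" check, using the values from this report. The helper names `cpu_to_millicores` and `cpu_request_fits_limit` are made up for illustration and are not part of the operator or of Kubernetes. The parser only understands plain numbers and the `m` (millicore) suffix, which is an assumption that covers the quantities used here but not every Kubernetes quantity format.

```python
from fractions import Fraction


def cpu_to_millicores(quantity: str) -> Fraction:
    """Convert a CPU quantity such as "2", "0.5" or "600m" to millicores.

    Illustrative only: handles plain numbers and the "m" suffix, nothing else.
    """
    q = quantity.strip()
    if q.endswith("m"):
        return Fraction(q[:-1])
    return Fraction(q) * 1000


def cpu_request_fits_limit(request: str, limit: str) -> bool:
    """Return True when the CPU request does not exceed the CPU limit."""
    return cpu_to_millicores(request) <= cpu_to_millicores(limit)


# Values from this report: request "2" (2000m) against limit "600m".
print(cpu_request_fits_limit("2", "600m"))     # False
# A request below the limit passes the same check.
print(cpu_request_fits_limit("500m", "600m"))  # True
```

A request of "2" is 2000 millicores, which is why the pod spec is rejected against a 600m limit.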

The operator stays in this state and does not update the StatefulSet, even when the CPU requests are set to a valid value again.

Expected behavior

The operator should never apply any operation that leads to this state. It should be able to recover.
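One way to read this expectation is as a guard in the reconcile step: reject a desired spec that fails validation, keep the last applied one, and apply the next valid spec as usual. The sketch below is only an illustration of that idea. `ReconcileResult`, `reconcile`, `requests_within_limits` and the `cpu_request_m` / `cpu_limit_m` fields are hypothetical names and do not reflect the operator's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReconcileResult:
    applied_spec: dict            # spec the workload should run with after this pass
    error: Optional[str] = None   # set when the desired spec was rejected


def reconcile(last_applied: dict, desired: dict,
              is_valid: Callable[[dict], bool]) -> ReconcileResult:
    """Apply the desired spec only if it is valid; otherwise keep the last one."""
    if not is_valid(desired):
        return ReconcileResult(last_applied,
                               "desired spec rejected; keeping last applied spec")
    return ReconcileResult(desired)


def requests_within_limits(spec: dict) -> bool:
    """Hypothetical validity rule, with CPU values already in millicores."""
    return spec["cpu_request_m"] <= spec["cpu_limit_m"]


good = {"cpu_request_m": 500, "cpu_limit_m": 600}
bad = {"cpu_request_m": 2000, "cpu_limit_m": 600}
fixed = {"cpu_request_m": 400, "cpu_limit_m": 600}

# The invalid update is rejected and the running spec is left untouched.
rejected = reconcile(good, bad, requests_within_limits)
print(rejected.applied_spec == good, rejected.error is not None)

# Once the spec is valid again, the next pass applies it.
recovered = reconcile(rejected.applied_spec, fixed, requests_within_limits)
print(recovered.applied_spec == fixed, recovered.error)
```

Passing the validity rule in as a callable keeps the guard independent of which fields are checked.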

Abhra303 commented 1 month ago

Hi @michael-ylb, the pod termination is the expected behaviour, as the CPU request exceeds the CPU limit. However, the operator ideally should recreate the pods once the CRD is updated with a valid configuration. There is already a ticket, #165, for this, and the fix will be included in the next release. I am closing this in favour of that ticket.

Abhra303 commented 1 month ago

I will update the docs in the meantime. Thanks!