Open applike-ss opened 8 months ago
Due to config adjustments, the operator tries to patch the sts in a way that would be incompatible.
Could you share the config that led to this behaviour? Did updating the Dragonfly CRD cause the issue? I am interested to know the root cause.
It would be great if dragonfly-operator could delete the sts with cascade=orphan and then re-create it with the current config to ensure the desired state.
Recreating the statefulset wouldn't solve the underlying issue (i.e. why is the operator trying to update statefulset like that).
I was thinking that when I now remove this sts manually, the operator would re-create it to ensure the desired state. This was also not the case, and I would like the operator to ensure the desired state by re-creating the sts with the desired configuration as well.
Yep, it's indeed nice to have.
I have the same problem. I deploy the CRD with ArgoCD, but the operator does not update anything and nothing triggers a rollout. After that, I can see this in the logs:
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235
2024-03-20T16:29:41Z ERROR Reconciler error {"controller": "dragonfly", "controllerGroup": "dragonflydb.io", "controllerKind": "Dragonfly", "Dragonfly": {"name":"dragonfly-test","namespace":"test"}, "namespace": "test", "name": "dragonfly-test", "reconcileID": "4261fe51-f541-437b-9a2e-cf6b64b253db", "error": "StatefulSet.apps \"dragonfly-test\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'ordinals', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235
Could you share the config that led to this behaviour? Did updating the Dragonfly CRD cause the issue? I am interested to know the root cause.
It was indeed updating the CR. I was updating the spec.snapshot.persistentVolumeClaimSpec. This leads to an update in the sts' spec.volumeClaimTemplates path, which is not allowed.
So my suggestion is to allow this change by removing the sts with the cascade=orphan option and re-creating it.
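A rough Go sketch of that suggestion, under loud assumptions: `StatefulSet`, `stsClient`, `fakeCluster`, and `ensure` are hypothetical stand-ins I made up so the example runs without a cluster, and none of them come from the operator. With client-go, the orphaning delete would presumably set `PropagationPolicy` to `metav1.DeletePropagationOrphan`, and the rejection could be detected with `apierrors.IsInvalid`.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalid stands in for the API server's "Forbidden: updates to
// statefulset spec ..." rejection (hypothetical sentinel).
var errInvalid = errors.New("statefulset spec update is invalid")

// StatefulSet is a heavily simplified stand-in for appsv1.StatefulSet.
type StatefulSet struct {
	Name    string
	Storage string // immutable after creation, like volumeClaimTemplates
	Memory  string // mutable, like fields under the pod template
}

// stsClient is a hypothetical interface covering the three calls we need.
type stsClient interface {
	Update(desired StatefulSet) error
	DeleteOrphan(name string) error // delete the object, keep its pods
	Create(desired StatefulSet) error
}

// ensure tries an in-place update first and falls back to
// orphan-delete plus re-create only when the update is rejected as invalid.
func ensure(c stsClient, desired StatefulSet) (bool, error) {
	err := c.Update(desired)
	if err == nil {
		return false, nil
	}
	if !errors.Is(err, errInvalid) {
		return false, err
	}
	if err := c.DeleteOrphan(desired.Name); err != nil {
		return false, fmt.Errorf("orphan-delete %s: %w", desired.Name, err)
	}
	if err := c.Create(desired); err != nil {
		return false, fmt.Errorf("re-create %s: %w", desired.Name, err)
	}
	return true, nil
}

// fakeCluster is an in-memory fake used only to exercise ensure.
type fakeCluster struct {
	sets map[string]StatefulSet
	pods map[string]int // pod count per StatefulSet; untouched by DeleteOrphan
}

func (f *fakeCluster) Update(d StatefulSet) error {
	cur, ok := f.sets[d.Name]
	if !ok {
		return fmt.Errorf("statefulset %q not found", d.Name)
	}
	if cur.Storage != d.Storage {
		return errInvalid // immutable field changed
	}
	f.sets[d.Name] = d
	return nil
}

func (f *fakeCluster) DeleteOrphan(name string) error {
	delete(f.sets, name) // pods are intentionally left in place
	return nil
}

func (f *fakeCluster) Create(d StatefulSet) error {
	if _, ok := f.sets[d.Name]; ok {
		return fmt.Errorf("statefulset %q already exists", d.Name)
	}
	f.sets[d.Name] = d
	return nil
}

func main() {
	c := &fakeCluster{
		sets: map[string]StatefulSet{
			"dragonfly-app": {Name: "dragonfly-app", Storage: "1Gi", Memory: "320Mi"},
		},
		pods: map[string]int{"dragonfly-app": 3},
	}
	recreated, err := ensure(c, StatefulSet{Name: "dragonfly-app", Storage: "2Gi", Memory: "320Mi"})
	fmt.Println(recreated, err, c.sets["dragonfly-app"].Storage, c.pods["dragonfly-app"])
	// prints: true <nil> 2Gi 3
}
```

One caveat worth keeping in mind: re-creating the StatefulSet only changes the claim template used for new PVCs. Existing PVCs keep their current size unless they are resized separately.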
Recreating the statefulset wouldn't solve the underlying issue (i.e. why is the operator trying to update statefulset like that).
That is true; the underlying issue is updating parts of the CR that are not supposed to be updated from the sts side.
Here's a demo CR:
apiVersion: dragonflydb.io/v1alpha1
kind: Dragonfly
metadata:
  name: dragonfly-app
spec:
  image: ghcr.io/dragonflydb/dragonfly-weekly:e8650ed2b4ebd550c966751dd33ebb1ac4f82b1f-ubuntu
  args:
    - '--cache_mode'
    - '--primary_port_http_enabled=true'
    - '--cluster_mode=emulated'
  snapshot:
    cron: '*/5 * * * *'
    persistentVolumeClaimSpec:
      resources:
        requests:
          storage: 1Gi
      accessModes:
        - ReadWriteOnce
  resources:
    limits:
      cpu: 100m
      memory: 320Mi
    requests:
      cpu: 100m
      memory: 320Mi
  replicas: 3
Updating this to the following will show the issue:
apiVersion: dragonflydb.io/v1alpha1
kind: Dragonfly
metadata:
  name: dragonfly-app
spec:
  image: ghcr.io/dragonflydb/dragonfly-weekly:e8650ed2b4ebd550c966751dd33ebb1ac4f82b1f-ubuntu
  args:
    - '--cache_mode'
    - '--primary_port_http_enabled=true'
    - '--cluster_mode=emulated'
  snapshot:
    cron: '*/5 * * * *'
    persistentVolumeClaimSpec:
      resources:
        requests:
          storage: 2Gi
      accessModes:
        - ReadWriteOnce
  resources:
    limits:
      cpu: 100m
      memory: 320Mi
    requests:
      cpu: 100m
      memory: 320Mi
  replicas: 3
Would be awesome if you can make this one work @Abhra303 (https://github.com/dragonflydb/dragonfly-operator/pull/222)
Yep, am busy with other stuff currently, will fix the PR soon!
Same for resources. For example, if you change the memory resources in the CR YAML from 2 GB to 4 GB and apply the changes, the StatefulSet will not be updated with the new memory settings. You need to edit or patch the sts manually.
clusters:
  - name: redis
    resources:
      requests:
        memory: 4Gi
        cpu: 200m
      limits:
        memory: 4Gi