OT-CONTAINER-KIT / redis-operator

A Golang-based Redis operator that creates and oversees Redis standalone/cluster/replication/sentinel mode setups on top of Kubernetes.
https://ot-redis-operator.netlify.app/
Apache License 2.0

Regression: Operator 0.15 can't create a RedisCluster since it creates a StatefulSet without storage definitions for node-conf #560

Open michaelarnauts opened 1 year ago

michaelarnauts commented 1 year ago

What version of redis operator are you using?

redis-operator version: 0.15

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (kubectl version)?

Client Version: 4.11.0-202212070335.p0.g1928ac4.assembly.stream-1928ac4
Kustomize Version: v4.5.4
Kubernetes Version: v1.25.10+8c21020

What did you do?

Upgraded to Operator 0.15, created a new RedisCluster as before.

What did you expect to see?

Everything works

What did you see instead?

The cluster doesn't start. A redis-leader StatefulSet is created, but its YAML contains a node-conf volume claim template whose resources block is empty, so no storage request is defined:

...
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: node-conf
        creationTimestamp: null
        labels:
          app: redis-leader
          redis_setup_type: cluster
          role: leader
        annotations:
          redis.opstreelabs.in: 'true'
          redis.opstreelabs.instance: redis
      spec:
        accessModes:
          - ReadWriteOnce
        resources: {}
        volumeMode: Filesystem
      status:
        phase: Pending
...

The StatefulSet controller reports these errors:

create Claim node-conf-redis-leader-0 for Pod redis-leader-0 in StatefulSet redis-leader failed error: PersistentVolumeClaim "node-conf-redis-leader-0" is invalid: spec.resources[storage]: Required value

create Pod redis-leader-0 in StatefulSet redis-leader failed error: failed to create PVC node-conf-redis-leader-0: PersistentVolumeClaim "node-conf-redis-leader-0" is invalid: spec.resources[storage]: Required value

Patching the StatefulSet YAML by hand doesn't work either. I get this error:

Error "Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden" for field "spec".
michaelarnauts commented 1 year ago

I had to go through all the recent merge requests to find out that a nodeConfVolumeClaimTemplate field had been added.

The YAML template shown when adding a RedisCluster hasn't been updated. Also, an existing manifest that works fine in 0.14 no longer works in 0.15. I think more time should be taken to make sure clusters don't break when the operator is updated.

I had to modify the spec.kubernetesConfig.image to quay.io/opstree/redis:v7.0.11 and also add the nodeConfVolumeClaimTemplate node to spec.storage:

    nodeConfVolumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
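
For context, the fragment above could sit in a complete manifest roughly as follows. This is a minimal sketch, not the operator's official example: the apiVersion, clusterSize, and the data volumeClaimTemplate are assumptions based on the operator's published samples and may differ between releases. The name redis is taken from the redis.opstreelabs.instance annotation shown earlier, and the image and node-conf values are the ones mentioned in this comment.

```yaml
# Sketch of a RedisCluster manifest for operator 0.15.
# apiVersion, clusterSize and volumeClaimTemplate are assumptions, not confirmed in this thread.
apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis
spec:
  clusterSize: 3
  kubernetesConfig:
    # Image reported to work in this comment
    image: quay.io/opstree/redis:v7.0.11
  storage:
    # Data volume (assumed to match what a 0.14 manifest already declares)
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    # New in 0.15: without a storage request here, the node-conf PVC is rejected
    nodeConfVolumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

Because volumeClaimTemplates cannot be changed on an existing StatefulSet (see the "Forbidden" error above), the node-conf block has to be present when the StatefulSet is first created.
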
shubham-cmyk commented 1 year ago

@michaelarnauts Is it fixed now?

There were a few breaking changes in v0.15.0.

michaelarnauts commented 1 year ago

> @michaelarnauts Is it fixed now?
>
> There were a few breaking changes in v0.15.0.

I got it to work with the changes I mentioned above. The template that is shown when adding a RedisCluster should still be modified though, since the current outdated one doesn't work and is confusing.

shubham-cmyk commented 1 year ago

Can you tell me which manifest hasn't been updated? I did update all the examples, so let me know if any were missed.

michaelarnauts commented 1 year ago

Not sure where it's stored. It's the template that is shown when adding a RedisCluster through the web interface of OpenShift. Maybe it's packaged together with the operator when it's pushed to the "Operator store"?

darkrift commented 12 months ago

I think the problem might be related to the schema, which doesn't specify a default value. When this setting was made configurable, its defaults should have been set to the values that were previously hardcoded in the operator, so that manifests compatible with 0.14.0 keep working.
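
As an illustration of that idea, a CustomResourceDefinition schema (apiextensions.k8s.io/v1) can declare defaults that the API server fills in when a field is omitted. The excerpt below is a hypothetical sketch, not the operator's actual CRD: the property layout is assumed, and the 1Gi value is a placeholder, since the size the operator hardcoded before 0.15 is not stated in this thread.

```yaml
# Hypothetical excerpt of the RedisCluster CRD schema under spec.storage.
# Nested defaults only apply when the parent object exists, hence "default: {}"
# on each parent level so the whole block is filled in when it is omitted.
nodeConfVolumeClaimTemplate:
  type: object
  default: {}
  properties:
    spec:
      type: object
      default: {}
      properties:
        accessModes:
          type: array
          items:
            type: string
          default:
            - ReadWriteOnce
        resources:
          type: object
          default: {}
          properties:
            requests:
              type: object
              additionalProperties:
                anyOf:
                  - type: integer
                  - type: string
                x-kubernetes-int-or-string: true
              # Placeholder size; the real pre-0.15 value is an assumption
              default:
                storage: 1Gi
```

With a schema along these lines, a 0.14-style manifest that omits nodeConfVolumeClaimTemplate would be stored with a non-empty storage request. The same effect could also be achieved by defaulting in the operator's Go code instead of the schema.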

tayyiposmanoglu commented 10 months ago

Are there any updates on this bug? When will version 0.16.0 be released?

tayyiposmanoglu commented 8 months ago

up