OT-CONTAINER-KIT / helm-charts

A repository that will contain Helm charts following best practices and security practices.
https://ot-container-kit.github.io/helm-charts

Operator version 0.12.0 upgrade issues #97

Open revathyr13 opened 1 year ago

revathyr13 commented 1 year ago

Hello,

I am trying to upgrade the Redis operator from version 0.10.0 to 0.12.0 using the Helm chart. I applied the CRDs manually.

I am facing three issues with the upgrade.

Issue 1

Once the upgrade completed, the operator pods kept restarting with the error below for Redis deployments that already existed:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x140a57e]
goroutine 527 [running]:
redis-operator/k8sutils.generateRedisClusterContainerParams(0xc0010a0900, 0xc000794a50, 0xc000794a68)
    /workspace/k8sutils/redis-cluster.go:86 +0x1de
redis-operator/k8sutils.RedisClusterSTS.CreateRedisClusterSetup({{0x17272db, 0x6}, 0xc00122b2f0, 0x0, 0xc000794a50, 0xc000794a68}, 0xc0010a0900)
    /workspace/k8sutils/redis-cluster.go:158 +0x3fa
redis-operator/k8sutils.CreateRedisLeader(0xc0010a0900)
    /workspace/k8sutils/redis-cluster.go:109 +0xa9
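The trace above points at a nil pointer dereference while the operator builds container parameters. The source does not say which field was nil. Issue 2 notes that the restarts stopped once clusterVersion and securityContext were added, so one plausible reading is that an optional field absent from older manifests was dereferenced without a check. The following is a minimal sketch of that failure mode and a defensive read. The ClusterSpec type, the ClusterVersion pointer field, and the "v6" fallback are all hypothetical and are not taken from the operator's code.

```go
package main

import "fmt"

// ClusterSpec is a hypothetical stand-in for a custom resource spec.
// ClusterVersion is a pointer so that "not set" can be told apart from "".
type ClusterSpec struct {
	Name           string
	ClusterVersion *string // nil when the manifest omits the field
}

// clusterVersion reads the optional field defensively. Writing
// *spec.ClusterVersion directly would panic with a nil pointer
// dereference for manifests created before the field existed.
func clusterVersion(spec *ClusterSpec, fallback string) string {
	if spec == nil || spec.ClusterVersion == nil {
		return fallback
	}
	return *spec.ClusterVersion
}

func main() {
	// A resource created under the older operator: the field is absent.
	old := &ClusterSpec{Name: "redis-cluster-dev"}
	fmt.Println(clusterVersion(old, "v6"))

	// A resource that sets the field explicitly.
	v := "v7"
	updated := &ClusterSpec{Name: "redis-cluster-dev", ClusterVersion: &v}
	fmt.Println(clusterVersion(updated, "v6"))
}
```

If this reading is right, setting the new fields explicitly in existing manifests, as done in Issue 2, would sidestep the panic until the operator itself guards the access.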

Issue 2

To check whether this is a compatibility issue with the existing Redis cluster, I modified its manifest as follows and redeployed it.

Existing manifest

apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis-cluster-dev
spec:
  clusterSize: 5
  kubernetesConfig:
    image: 'opstree-redis:v6.2.5'
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 1
        memory: 7Gi
      limits:
        cpu: 2
        memory: 8Gi
    redisSecret:
      name: redis-auth
      key: password
  redisExporter:
    enabled: true
    image: 'opstree-redis-exporter:1.0'
  redisLeader:
    redisConfig:
      additionalRedisConfig: redis-external-config
  redisFollower:
    redisConfig:
      additionalRedisConfig: redis-external-config
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi 

Modified to below manifest

apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis-cluster-dev
spec:
  clusterSize: 5
  clusterVersion: v7
  securityContext:
    runAsUser: 1000
    fsGroup: 1000
  kubernetesConfig:
    image: 'opstree-redis:v6.2.5'
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 1
        memory: 7Gi
      limits:
        cpu: 2
        memory: 8Gi
    redisSecret:
      name: redis-auth
      key: password
  redisExporter:
    enabled: true
    image: 'opstree-redis-exporter:1.0'
  redisLeader:
    redisConfig:
      additionalRedisConfig: redis-external-config
  redisFollower:
    redisConfig:
      additionalRedisConfig: redis-external-config
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi

After that, I started getting the errors below. This time the operator was not restarting. The same issue occurred with operator version 0.13.0.

luster-dev","Request.StatefulSet.Name":"redis-cluster-dev-leader","patch":"{\"metadata\":{\"annotations\":{\"prometheus.io/port\":null,\"prometheus.io/scrape\":null,\"redis.opstreelabs.instance\":\"redis-cluster-dev\"}},\"spec\":{\"serviceName\":\"redis-cluster-dev-leader-headless\",\"template\":{\"metadata\":{\"annotations\":{\"redis.opstreelabs.in\":\"true\",\"redis.opstreelabs.instance\":\"redis-cluster-dev\"}},\"spec\":{\"$setElementOrder/containers\":[{\"name\":\"redis-cluster-dev-leader\"},{\"name\":\"redis-exporter\"}],\"containers\":[{\"livenessProbe\":{\"failureThreshold\":3,\"initialDelaySeconds\":1,\"periodSeconds\":10,\"timeoutSeconds\":1},\"name\":\"redis-cluster-dev-leader\",\"readinessProbe\":{\"failureThreshold\":3,\"initialDelaySeconds\":1,\"periodSeconds\":10,\"timeoutSeconds\":1}},{\"env\":[{\"name\":\"PERSISTENCE_ENABLED\",\"value\":\"true\"},{\"name\":\"REDIS_ADDR\",\"value\":\"redis://localhost:6379\"},{\"name\":\"REDIS_PASSWORD\",\"valueFrom\":{\"secretKeyRef\":{\"key\":\"password\",\"name\":\"redis-auth\"}}},{\"name\":\"SERVER_MODE\",\"value\":\"cluster\"},{\"name\":\"SETUP_MODE\",\"value\":\"cluster\"}],\"image\":\"harbor.kore.korewireless.com/revram/opstree-redis-exporter:1.0\",\"name\":\"redis-exporter\",\"resources\":{}}],\"securityContext\":{\"fsGroup\":1000,\"runAsUser\":1000}}},\"volumeClaimTemplates\":[{\"metadata\":{\"annotations\":{\"redis.opstreelabs.in\":\"true\",\"redis.opstreelabs.instance\":\"redis-cluster-dev\"},\"labels\":{\"app\":\"redis-cluster-dev-leader\",\"redis_setup_type\":\"cluster\",\"role\":\"leader\"},\"name\":\"redis-cluster-dev-leader\"},\"spec\":{\"accessModes\":[\"ReadWriteOnce\"],\"resources\":{\"requests\":{\"storage\":\"10Gi\"}},\"volumeMode\":\"Filesystem\"},\"status\":{\"phase\":\"Pending\"}}]}}"}
{"level":"error","ts":1684153093.9882743,"logger":"controller_redis","msg":"Redis stateful update failed","Request.StatefulSet.Namespace":"redis-cluster-dev","Request.StatefulSet.Name":"redis-cluster-dev-leader","error":"StatefulSet.apps \"redis-cluster-dev-leader\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden","stacktrace":"redis-operator/k8sutils.patchStatefulSet\n\t/workspace/k8sutils/statefulset.go:173\nredis-operator/k8sutils.CreateOrUpdateStateFul\n\t/workspace/k8sutils/statefulset.go:74\nredis-operator/k8sutils.RedisClusterSTS.CreateRedisClusterSetup\n\t/workspace/k8sutils/redis-cluster.go:153\nredis-operator/k8sutils.CreateRedisLeader\n\t/workspace/k8sutils/redis-cluster.go:109\nredis-operator/controllers.(*RedisClusterReconciler).Reconcile\n\t/workspace/controllers/rediscluster_controller.go:68\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":1684153093.9884331,"logger":"controller_redis","msg":"Cannot create statefulset for Redis","Request.StatefulSet.Namespace":"redis-cluster-dev","Request.StatefulSet.Name":"redis-cluster-dev-leader","Setup.Type":"leader","error":"StatefulSet.apps \"redis-cluster-dev-leader\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden","stacktrace":"redis-operator/k8sutils.CreateRedisLeader\n\t/workspace/k8sutils/redis-cluster.go:109\nredis-operator/controllers.(*RedisClusterReconciler).Reconcile\n\t/workspace/controllers/rediscluster_controller.go:68\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}
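The patch in the log touches serviceName and volumeClaimTemplates in addition to template, while the error message names only replicas, template, and updateStrategy as updatable. A minimal sketch of how one might check a patch against that list before sending it is shown below. The function forbiddenSpecFields and the allow-list variable are hypothetical helpers written for illustration, with the allow-list copied from the error text. They are not part of the operator or of the Kubernetes client libraries.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// mutableSpecFields holds the StatefulSet spec fields that the error
// message in the log names as updatable.
var mutableSpecFields = map[string]bool{
	"replicas":       true,
	"template":       true,
	"updateStrategy": true,
}

// forbiddenSpecFields returns the top-level spec keys of a patch that are
// not in mutableSpecFields, sorted so the result is stable.
func forbiddenSpecFields(patch []byte) ([]string, error) {
	var doc struct {
		Spec map[string]json.RawMessage `json:"spec"`
	}
	if err := json.Unmarshal(patch, &doc); err != nil {
		return nil, err
	}
	var bad []string
	for key := range doc.Spec {
		if !mutableSpecFields[key] {
			bad = append(bad, key)
		}
	}
	sort.Strings(bad)
	return bad, nil
}

func main() {
	// Abbreviated from the patch in the log: same spec keys, values shortened.
	patch := []byte(`{"spec":{"serviceName":"redis-cluster-dev-leader-headless","template":{},"volumeClaimTemplates":[]}}`)
	bad, err := forbiddenSpecFields(patch)
	if err != nil {
		fmt.Println("invalid patch:", err)
		return
	}
	fmt.Println(bad) // prints [serviceName volumeClaimTemplates]
}
```

For the patch in the log, this flags serviceName and volumeClaimTemplates, which suggests the newer operator generates a StatefulSet whose immutable fields differ from the one created under 0.10.0.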

Issue 3

I added a new namespace to the watch namespace setting in the 0.12.0 manifest and deployed the redis-cluster.yaml below. The operator then started logging the errors that follow:

apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis-dev
spec:
  clusterSize: 1
  clusterVersion: v7
  persistenceEnabled: true
  securityContext:
    runAsUser: 1000
    fsGroup: 1000
  kubernetesConfig:
    image: quay.io/opstree/redis:v7.0.5
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 101m
        memory: 128Mi
      limits:
        cpu: 101m
        memory: 128Mi
    redisSecret:
      name: redis-auth
      key: password
    # imagePullSecrets:
    #   - name: regcred
  redisExporter:
    enabled: true
    image: quay.io/opstree/redis-exporter:v1.44.0
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 100m
        memory: 128Mi
# Environment Variables for Redis Exporter
    # env:
    # - name: REDIS_EXPORTER_INCL_SYSTEM_METRICS
    #   value: "true"
    # - name: UI_PROPERTIES_FILE_NAME
    #   valueFrom:
    #     configMapKeyRef:
    #       name: game-demo
    #       key: ui_properties_file_name
    # - name: SECRET_USERNAME
    #   valueFrom:
    #     secretKeyRef:
    #       name: mysecret
    #       key: username
#  redisLeader:
#    redisConfig:
#      additionalRedisConfig: redis-external-config
#  redisFollower:
#    redisConfig:
#      additionalRedisConfig: redis-external-config
  storage:
    volumeClaimTemplate:
      spec:
        # storageClassName: standard
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
  # nodeSelector:
  #   kubernetes.io/hostname: minikube
  # priorityClassName:
  # Affinity:
  # Tolerations: []

Error

{"level":"info","ts":1684152510.4917114,"logger":"controller_redis","msg":"Pod Counted successfully","Request.RedisManager.Namespace":"redis-cluster-v7-dev","Request.RedisManager.Name":"redis-dev","Count":0,"Container Name":"redis-dev-leader"}
{"level":"error","ts":1684152510.880434,"logger":"controller_redis","msg":"Could not execute command","Request.RedisManager.Namespace":"redis-cluster-v7-dev","Request.RedisManager.Name":"redis-dev","Command":["redis-cli","--cluster","add-node","redis-dev-follower-0.redis-dev-follower-headless.redis-cluster-v7-dev.svc:6379","redis-dev-leader-0.redis-dev-leader-headless.redis-cluster-v7-dev.svc:6379","--cluster-slave","-a","XjhRKSMeRSuz2"],"Output":">>> Adding node redis-dev-follower-0.redis-dev-follower-headless.redis-cluster-v7-dev.svc:6379 to cluster redis-dev-leader-0.redis-dev-leader-headless.redis-cluster-v7-dev.svc:6379\n>>> Performing Cluster Check (using node redis-dev-leader-0.redis-dev-leader-headless.redis-cluster-v7-dev.svc:6379)\nM: a53cc99e263464fb6cefd7ee83ac2a6e28f7cbc4 redis-dev-leader-0.redis-dev-leader-headless.redis-cluster-v7-dev.svc:6379\n slots: (0 slots) master\n[OK] All nodes agree about slots configuration.\n>>> Check for open slots...\n>>> Check slots coverage...\n[ERR] Not all 16384 slots are covered by nodes.\n\n","Error":"Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.\n","error":"command terminated with exit code 1","stacktrace":"redis-operator/k8sutils.ExecuteRedisReplicationCommand\n\t/workspace/k8sutils/redis.go:180\nredis-operator/controllers.(*RedisClusterReconciler).Reconcile\n\t/workspace/controllers/rediscluster_controller.go:127\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}
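The cluster check in the log reports a single master that owns 0 slots, and the add-node command then fails because not all 16384 hash slots are covered. redis-cli --cluster create normally expects at least three masters, so clusterSize: 1 may be why no slots were ever assigned. That is an inference, not something the log confirms. The sketch below only illustrates the coverage rule itself. The SlotRange type, the uncoveredSlots function, and the three-way split are illustrative assumptions and do not come from the operator or from redis-cli.

```go
package main

import "fmt"

// totalSlots is the number of hash slots in a Redis Cluster, as reported
// in the error message in the log.
const totalSlots = 16384

// SlotRange is an inclusive range of hash slots owned by one master.
type SlotRange struct {
	Start, End int
}

// uncoveredSlots counts the hash slots that no master owns. A healthy
// cluster returns 0; anything else corresponds to the
// "Not all 16384 slots are covered by nodes" error.
func uncoveredSlots(ranges []SlotRange) int {
	covered := make([]bool, totalSlots)
	for _, r := range ranges {
		for s := r.Start; s <= r.End; s++ {
			if s >= 0 && s < totalSlots {
				covered[s] = true
			}
		}
	}
	missing := 0
	for _, c := range covered {
		if !c {
			missing++
		}
	}
	return missing
}

func main() {
	// The situation in the log: one master with no slots assigned.
	fmt.Println(uncoveredSlots(nil)) // prints 16384

	// An illustrative even split across three masters.
	three := []SlotRange{{0, 5460}, {5461, 10922}, {10923, 16383}}
	fmt.Println(uncoveredSlots(three)) // prints 0
}
```

Under this rule, a follower can only be added once the leaders together cover every slot, which is worth checking before looking for an operator bug.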

Can someone please check and let me know what changes are needed to fix these issues?