OT-CONTAINER-KIT / redis-operator

A Golang-based Redis operator that creates and oversees Redis standalone/cluster/replication/sentinel mode setups on top of Kubernetes.
https://ot-redis-operator.netlify.app/
Apache License 2.0

Using a password with sentinel causes +sdown #779

Open nathan-bowman opened 4 months ago

nathan-bowman commented 4 months ago

redis-operator version: quay.io/opstree/redis-operator:v0.15.1

Does this issue reproduce with the latest release? Untested, I believe that the latest tagged image is v0.15.0

What operating system and processor architecture are you using (kubectl version)?

kubectl version Output
Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5-eks-5e0fdde

What did you do?

kustomization.yaml

---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: redis

# https://github.com/OT-CONTAINER-KIT/helm-charts/tree/main/charts/redis-operator
helmCharts:
- name: redis-operator
  repo: https://ot-container-kit.github.io/helm-charts/
  # Check the operator version against image compatibility
  # https://github.com/OT-CONTAINER-KIT/redis-operator?tab=readme-ov-file#image-compatibility
  version: 0.15.9
  releaseName: redis-operator
  namespace: redis
  valuesFile: redis-operator-helm-values.yaml
  includeCRDs: true

resources:
# Configs
- ./configs/redis-configmap.yaml
- ./configs/redis-sentinel.yaml
- ./configs/redis-replication.yaml
- ./configs/redis-secrets-store.yaml
- ./configs/redis-secrets.yaml

redis-operator-helm-values.yaml

---
redisOperator:
  name: redis-operator
  imageName: quay.io/opstree/redis-operator
  imageTag: "v0.15.1"
  imagePullPolicy: Always

  certmanager:
    enabled: true

  certificate:
    name: serving-cert
    secretName: webhook-server-cert

  issuer:
    type: selfSigned
    name: redis-operator-issuer
    email: me@email.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretName: letsencrypt
    solver:
      enabled: true
      ingressClass: nginx

  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 500m
      memory: 500Mi

  replicas: 1

  serviceAccountName: redis-operator

  service:
    name: webhook-service
    namespace: redis

redis-replication.yaml

---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisReplication
metadata:
  name: redis-replication
spec:
  clusterSize: 3
  serviceAccountName: redis-serviceaccount
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  kubernetesConfig:
    # https://quay.io/repository/opstree/redis?tab=tags
    image: quay.io/opstree/redis:v7.0.12
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: '4'
        memory: 20Gi
      limits:
        cpu: '6'
        memory: 24Gi
    redisSecret:
      name: redis
      key: password
  storage:
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

redis-sentinel.yaml

---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisSentinel
metadata:
  name: redis-sentinel
spec:
  clusterSize: 3
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  redisSentinelConfig:
    redisReplicationName: redis-replication
  kubernetesConfig:
    # https://quay.io/repository/opstree/redis-sentinel?tab=tags
    image: quay.io/opstree/redis-sentinel:v7.0.12
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 101m
        memory: 128Mi
      limits:
        cpu: 101m
        memory: 128Mi
    redisSecret:
      name: redis
      key: password

This seems to be the same issue as: https://github.com/OT-CONTAINER-KIT/redis-operator/issues/637

The only way I could get sentinel to work was by adding this:

  redisSentinelConfig:
    additionalSentinelConfig: addtlsentinel

Which pointed to configmap addtlsentinel:

apiVersion: v1
data:
  redis-sentinel-additional.conf: |
    sentinel auth-pass myMaster 123412341234
kind: ConfigMap
metadata:
  name: addtlsentinel
  namespace: redis

...which gets picked up by: https://github.com/OT-CONTAINER-KIT/redis/blob/master/entrypoint-sentinel.sh#L12
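
For reference, here is a sketch of how the sentinel manifest looks with the workaround applied. It only combines the redis-sentinel.yaml spec above with the additionalSentinelConfig field (resource requests and limits are left out for brevity). The ConfigMap name addtlsentinel comes from the example above; the assumption that the value in its `sentinel auth-pass` line has to match the password stored in the `redis` Secret is mine, so treat this as an illustration rather than a confirmed fix.

```yaml
---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisSentinel
metadata:
  name: redis-sentinel
spec:
  clusterSize: 3
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  redisSentinelConfig:
    redisReplicationName: redis-replication
    # Workaround: name of the ConfigMap that holds redis-sentinel-additional.conf.
    # Assumption: the password in its "sentinel auth-pass" line must match the
    # value stored under the "password" key of the "redis" Secret below.
    additionalSentinelConfig: addtlsentinel
  kubernetesConfig:
    image: quay.io/opstree/redis-sentinel:v7.0.12
    imagePullPolicy: IfNotPresent
    redisSecret:
      name: redis
      key: password
```

Note that the password still ends up in clear text in the ConfigMap with this approach.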

Would this merge fix my issue?

arusa commented 3 months ago

I think you are right and this problem was fixed 3 months ago with this commit https://github.com/OT-CONTAINER-KIT/redis-operator/commit/8e8ded98acd13d3660d155bf0c510096121b669a

But there was no release in the last 3 months, so I'll have to try your workaround with additionalSentinelConfig. Thanks for sharing!

arusa commented 3 months ago

OMG, this issue is really bad. Even knowing your solution and searching the code for additionalSentinelConfig I'm not able to find out how this parameter is supposed to be used. There is absolutely no documentation and I can't find any code that shows that that string is actually used to load anything from a ConfigMap. And now I'm stuck again because I would have to manually add the cleartext password to the ConfigMap and can't use the one that's saved in a Secret?

I think I'll have to remove the redis password completely from the whole replication.

nathan-bowman commented 3 months ago

> OMG, this issue is really bad. Even knowing your solution and searching the code for additionalSentinelConfig I'm not able to find out how this parameter is supposed to be used. There is absolutely no documentation and I can't find any code that shows that that string is actually used to load anything from a ConfigMap. And now I'm stuck again because I would have to manually add the cleartext password to the ConfigMap and can't use the one that's saved in a Secret?
>
> I think I'll have to remove the redis password completely from the whole replication.

Yeah it's a bit of a pain to work around. I messaged their team on Slack to get traction with pushing the next release so we can have those changes available, but it's been 19 days so far with no movement: https://opstree.slack.com/archives/C05MBRB50JG/p1708529971281929

drivebyer commented 2 months ago

@nathan-bowman v0.16.0 has been released: https://github.com/OT-CONTAINER-KIT/redis-operator/releases/tag/v0.16.0

arusa commented 2 months ago

Is the problem solved? What does the solution look like?

alita1991 commented 2 months ago

Hi, I have the same issue with 0.15.1, the helm chart was not yet updated to use the latest version of the operator (0.16.0). Is there any plan to update the helm chart anytime soon? Also, the image is not published yet at quay.io/opstree/redis-operator

nathan-bowman commented 2 months ago

> Hi, I have the same issue with 0.15.1, the helm chart was not yet updated to use the latest version of the operator (0.16.0). Is there any plan to update the helm chart anytime soon? Also, the image is not published yet at quay.io/opstree/redis-operator

I'm still waiting on the Operator 0.16.0 Helm chart to get released as well...