nextcloud / helm

A community maintained helm chart for deploying Nextcloud on Kubernetes.
GNU Affero General Public License v3.0

Nextcloud seems to not be able to connect to redis with default settings #448

Open jvalskis opened 1 year ago

jvalskis commented 1 year ago

Describe your Issue

I have an issue with Nextcloud failing to connect to redis when it's enabled. Here's an excerpt from the logs:

{
  "reqId":"b8E9pguPPMd2bgHNOOj0",
  "level":3,
  "time":"2023-09-26T06:14:19+00:00",
  "remoteAddr":"10.42.5.1",
  "user":"--",
  "app":"PHP",
  "method":"GET",
  "url":"/status.php",
  "message":"session_start(): Failed to read session data: redis (path: tcp://nextcloud-redis-master:6379?auth=<redacted>) at /var/www/html/lib/private/Session/Internal.php#222",
  "userAgent":"kube-probe/1.27",
  "version":"27.1.1.0",
  "data":{"app":"PHP"}
}

Which is weird, because when I do nslookup, it seems to resolve fine:

$ nslookup nextcloud-redis-master
Server:         10.43.0.10
Address:        10.43.0.10#53

Name:   nextcloud-redis-master.cloud.svc.cluster.local
Address: 10.43.32.88

As a workaround, I hardcoded the "host" entry in redis.config.php to the fully qualified service name: 'host' => "nextcloud-redis-master.cloud.svc.cluster.local". Any ideas what might be wrong here?
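
For anyone applying the same workaround in a different namespace: the nslookup output above suggests the short name expands to `<service>.<namespace>.svc.<cluster domain>`. A minimal sketch of building that name follows; `service_fqdn` is a hypothetical helper, and `cluster.local` is only the common default cluster domain, so check your own cluster before relying on it.

```shell
#!/bin/sh
# Build the fully qualified in-cluster DNS name for a Service.
# Usage: service_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]
# CLUSTER_DOMAIN defaults to cluster.local (an assumption; clusters can override it).
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Service and namespace taken from the nslookup output in this post.
service_fqdn nextcloud-redis-master cloud
```

The result can then be used as the 'host' value in redis.config.php.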

Logs and Errors

Describe your Environment

nextcloud:
  defaultConfigs:
    redis.config.php: false
  configs:
    redis.config.php: |-
      <?php
      $CONFIG = array (
        'memcache.distributed' => '\OC\Memcache\Redis',
        'memcache.locking' => '\OC\Memcache\Redis',
        'redis' => array(
          'host' => "nextcloud-redis-master.cloud.svc.cluster.local",
          'port' => getenv('REDIS_HOST_PORT') ?: 6379,
          'password' => getenv('REDIS_HOST_PASSWORD'),
        ),
      );
redis:
    enabled: true
    usePassword: true
    auth:
      existingSecret: nextcloud-auth-config
      existingSecretPasswordKey: redis-password
    master:
      persistence:
        storageClass: longhorn-fast-retain
    replica:
      persistence:
        storageClass: longhorn-fast-retain

jessebot commented 11 months ago

I haven't had time to look into this fully, but wanted to give some more details to help others troubleshoot:

Here's where the full name is defined in our templates, specifically _helpers.tpl: https://github.com/nextcloud/helm/blob/0c6822fd688124124efb02a328f53b71cbdeb2a9/charts/nextcloud/templates/_helpers.tpl#L26-L32

and it gets referenced again here in our _helpers.tpl: https://github.com/nextcloud/helm/blob/0c6822fd688124124efb02a328f53b71cbdeb2a9/charts/nextcloud/templates/_helpers.tpl#L215-L232

@jvalskis when it is unable to connect, could you please log into the container and check the environment variable called REDIS_HOST? Should be something like this to get it:

# where NEXTCLOUD_POD is the actual name of your nextcloud pod
kubectl exec $NEXTCLOUD_POD -- env | grep REDIS_HOST

Kaurin commented 4 months ago

I can replicate the error on my k3s cluster, but the timestamps show that it happens within 60 seconds of the pod starting, so it could just be that redis isn't ready yet.

I logged in/logged out a few times and the error does not repeat.

OP error:

{
  "reqId": "6dFzKM6uQLlOijIHHBD0",
  "level": 3,
  "time": "2024-05-07T04:29:34+00:00",
  "remoteAddr": "10.42.0.1",
  "user": "--",
  "app": "PHP",
  "method": "GET",
  "url": "/status.php",
  "message": "session_start(): Failed to read session data: redis (path: tcp://nextcloud-troubleshoot-redis-master:6379?auth=changeme) at /var/www/html/lib/private/Session/Internal.php#213",
  "userAgent": "kube-probe/1.28",
  "version": "29.0.0.19",
  "data": {
    "app": "PHP"
  }
}

Another error I noticed:

{
  "reqId": "sGLjn5exo0J3ZAYR8Fde",
  "level": 3,
  "time": "2024-05-07T04:29:39+00:00",
  "remoteAddr": "10.42.0.1",
  "user": "--",
  "app": "PHP",
  "method": "GET",
  "url": "/status.php",
  "message": "session_start(): Redis connection not available at /var/www/html/lib/private/Session/Internal.php#213",
  "userAgent": "kube-probe/1.28",
  "version": "29.0.0.19",
  "data": {
    "app": "PHP"
  }
}

Again, within 60 seconds of the pods starting. Does not repeat even after multiple login/logout.

values.yml used:

redis:
    enabled: true

EDIT: I can replicate the issue when using secret-based auth. WIP

Kaurin commented 4 months ago

Welp. I don't know if the issue is the same as OP's, but initially my redis password contained quite a few special characters.

Before doing anything else, I changed that password to one without special characters. I uninstalled, cleaned up the PVCs and volumes, and installed again without changing values.yml, and it works fine.
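
The log lines earlier in the thread show the password being passed as an `auth=` query parameter in the session path, so characters such as `&`, `#`, or `+` could plausibly be misparsed there. That is an assumption, not something confirmed in this thread. A minimal sketch for generating a letters-and-digits-only password follows; `gen_alnum_password` is a hypothetical helper, and the commented `kubectl` line reuses the secret name and key from the values.yml in this comment.

```shell
#!/bin/sh
# Print a random password made only of A-Z, a-z and 0-9.
# Usage: gen_alnum_password [LENGTH]   (default 32)
# Reads a fixed 4096 random bytes and keeps the alphanumeric ones, so it is
# only suitable for lengths up to a few hundred characters.
gen_alnum_password() {
  length=${1:-32}
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${length}"
}

# Hypothetical usage (requires kubectl and access to the right namespace):
# kubectl create secret generic nextcloud-redis \
#   --from-literal=redisPw="$(gen_alnum_password 32)"
```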

Current, working values.yml:

redis:
    enabled: true
    usePassword: true
    auth:
      existingSecret: nextcloud-redis
      existingSecretPasswordKey: redisPw

Note that I still get OP's error message before redis is ready.
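
Since the errors reported here only show up in the first minute after the pods start, one way to rule out a plain startup race is to poll redis until it answers before looking at Nextcloud's logs. A minimal sketch in POSIX shell follows; `retry_until` is a hypothetical helper, and the pod name and `REDIS_PASSWORD` variable in the commented usage line are assumptions, not something the chart provides.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_until ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Hypothetical usage: wait up to about 60 seconds for redis to answer PING.
# retry_until 30 2 kubectl exec nextcloud-redis-master-0 -- \
#   redis-cli -a "$REDIS_PASSWORD" ping
```

If the error still appears after redis answers, the startup race is probably not the whole story.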

ensignavenger commented 1 month ago

Getting the same error but using Docker Compose instead of Kubernetes.

My log:

{"reqId":"CsmaTyEdjJhymt1vrdQM",
"level":3,
"time":"2024-08-16T16:19:16+00:00",
"remoteAddr":"myipaddress",
"user":"--",
"app":"PHP",
"method":"POST",
"url":"/",
"message":"session_start(): Failed to read session data: redis (path: tcp://valkey:6379?auth=cIQCrandomcharspassword) at /var/www/html/lib/private/Session/Internal.php#214",
"userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
"version":"29.0.4.1",
"data":{"app":"PHP"}}

My compose file:

services:
  postgres:
    image: postgres:16.4
    networks:
      - nextcloud
    expose:
      - "5433"
    volumes:
      - ${DOCKER_VOLUMES}/cloud/postgres/data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB_FILE=/run/secrets/postgres_db
      - POSTGRES_USER_FILE=/run/secrets/postgres_user
      - POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
    secrets:
      - postgres_password
      - postgres_user
      - postgres_db

  nextcloud:
    image: nextcloud:29.0.4
    expose:
      - 80
    networks:
      - proxy_web
      - nextcloud
    volumes:
      - ${DOCKER_VOLUMES}/cloud/nextcloud/html:/var/www/html
    environment:
      - POSTGRES_HOST=postgres
      - POSTGRES_DB_FILE=/run/secrets/postgres_db
      - POSTGRES_USER_FILE=/run/secrets/postgres_user
      - POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
      - REDIS_HOST=valkey
      - REDIS_HOST_PORT=6379
      - REDIS_HOST_PASSWORD_FILE=/run/secrets/valkey_password
    depends_on:
      - postgres
      - valkey
    secrets:
      - postgres_password
      - postgres_user
      - postgres_db
      - valkey_password
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy_web"
      - "traefik.http.routers.nextcloud.rule=Host(`mydomainname`)"
      - "traefik.http.routers.nextcloud.entrypoints=websecure"
      - "traefik.http.routers.nextcloud.tls.certresolver=mytlschallenge"
      - "traefik.http.services.nextcloud.loadBalancer.server.port=80"

  valkey: 
    # image: valkey/valkey:8.0
    image: redis
    expose: 
      - 6379
    networks:
      - nextcloud
    volumes:
      - ${DOCKER_VOLUMES}/cloud/valkey/data:/data

  utils:
    image: nicolaka/netshoot
    command: tail -f /dev/null
    networks:
      - nextcloud
      - proxy_web

secrets:
  nextcloud_admin_password:
    file: .secrets/nextcloud_admin_password.txt
  postgres_password:
    file: .secrets/postgres_password.txt
  postgres_db:
    file: .secrets/postgres_db.txt
  postgres_user:
    file: .secrets/postgres_user.txt 
  valkey_password:
    file: .secrets/valkey_password.txt 

networks:
  proxy_web:
    external: true
  nextcloud:

I use the netshoot container to verify that the docker network is working and that I can reach the redis container via the "valkey" hostname.

Before accessing the site, I give it plenty of time (several minutes) to ensure everything has started. I wipe the volumes after making any changes (like switching from valkey to redis) to set everything up fresh.

I verified my password does not contain any special characters (lowercase, uppercase letters and numbers only, no symbols).

Note: I initially set it up to use valkey, an open source fork of redis that should be pretty much "drop in" at this point, because redis is no longer open source software. However, after having issues, I swapped it out for the official redis image for troubleshooting.

ensignavenger commented 1 month ago

env | grep REDIS_HOST

I followed this suggestion on my setup and confirmed the environment variables are set:

REDIS_HOST_PORT=6379
REDIS_HOST_PASSWORD_FILE=/run/secrets/valkey_password
REDIS_HOST=valkey

jessebot commented 1 week ago

@ensignavenger I edited your comments just a bit for syntax highlighting and formatting, so I could read them better. Since you're using docker compose, could you please ask about this in the https://github.com/nextcloud/docker repo? This repo is for the helm chart only. Please feel free to cross link your issue here, so others can benefit from your legwork though.

I'm also not sure that valkey is 100% supported yet, but I am all here for it 🙏