nextcloud / helm

A community maintained helm chart for deploying Nextcloud on Kubernetes.
GNU Affero General Public License v3.0

No login if more than one replica - CHART nextcloud-4.5.2 - APP VERSION 27.1.3 #488

Closed supermario18b closed 7 months ago

supermario18b commented 7 months ago

Hi everyone,

I've installed Nextcloud via Helm. I use an external database (a PostgreSQL cluster set up with the Zalando Postgres operator). It works well with a single replica. If I set the replicas to 2 or more, I can't log in anymore, and I don't see any error.
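
A setup like this might be expressed in the chart values roughly as follows. This is only a sketch, assuming the chart's `replicaCount`, `internalDatabase`, and `externalDatabase` keys; the host and database name are hypothetical placeholders, and credentials are left out:

```yaml
# values.yaml (sketch; host and database name are placeholders)
replicaCount: 2          # login works with 1, fails with 2 or more

internalDatabase:
  enabled: false         # disable the bundled default database

externalDatabase:
  enabled: true
  type: postgresql
  host: nextcloud-db.example.svc:5432   # hypothetical service of the PostgreSQL cluster
  database: nextcloud                   # hypothetical database name
```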

Do you have any suggestions?

Thanks in advance,

supermario18b

jessebot commented 7 months ago

I don't think I've ever made this chart work with more than one replica, but I'm marking this as "help wanted" in case anyone else has, as I'd love to know how you did it. Could you check the logs of the pod and also set up the nextcloud.log file to get some debug info?

supermario18b commented 7 months ago

I set the loglevel to 0 and noticed that, with more than one replica, the login POST request is followed by a GET request:

"GET /login?direct=1&user=nextcloudadmin HTTP/1.1" 200

That GET request to the login page is not present when everything is working fine (one replica).
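
For anyone wanting to reproduce this, debug logging can probably be enabled through the chart values. The following is a sketch, assuming the chart's `nextcloud.configs` key (extra files placed in Nextcloud's config directory); the file name `logging.config.php` is an arbitrary choice:

```yaml
# values.yaml (sketch; the file name is arbitrary)
nextcloud:
  configs:
    logging.config.php: |-
      <?php
      $CONFIG = array (
        // 0 = debug, the most verbose Nextcloud log level
        'loglevel' => 0,
      );
```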

supermario18b commented 7 months ago

I've got it working with more replicas. I use Traefik as the load balancer and I've enabled sticky sessions for the Nextcloud IngressRoute:

```yaml
services:
  - kind: Service
    name: nextcloud
    port: 8080
    sticky:
      cookie: {}
```

provokateurin commented 7 months ago

Could you add some documentation about this? Then it will be easier for other people to get it working as well :)

supermario18b commented 7 months ago

How to install Nextcloud via Helm is already documented. I don't think everybody uses Traefik as a reverse proxy, and I've already posted the relevant part of the IngressRoute resource definition. Anyway, here is the complete IngressRoute definition:

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nextcloud-route
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`nextcloud.yourdomain.org`)
    kind: Rule
    services:
      - kind: Service
        name: nextcloud
        port: 8080
        sticky:
          cookie: {}
```

jessebot commented 7 months ago

I think @provokateurin was just suggesting you submit a PR to help others :) You could add a section here: https://github.com/nextcloud/helm/blob/main/charts/nextcloud/README.md#quick-links

And you could call it something like "two or more replicas with traefik", and then just add what you've learned here, including the complete IngressRoute you posted here for others that have the same issue.

Adding docs is totally optional, but something we encourage community members to do so that others can get rolling faster. Most of our docs have been created from things people have learned along the way :)

Also, I made minor edits to your posts for spacing and syntax highlighting. Here is an example of how to add syntax highlighting in markdown for issues:

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
```

supermario18b commented 7 months ago

I didn't understand what @provokateurin meant. I've created a pull request:

https://github.com/nextcloud/helm/pull/492

@jessebot and @provokateurin, please feel free to edit it. I'm not sure that I've set up the "quick link" correctly.

2fst4u commented 6 months ago

Omg, it's taken me days of research and troubleshooting to find this issue. You have had the exact same experience as I have so far. The Helm chart makes it sound like increasing replicas is simple and all you need is a PV with ReadWriteMany access, but it's more involved than that.

I found sticky sessions as an option too, but I don't think it's actually a solution. I consider it a workaround.
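
For setups that don't use Traefik, the same workaround could likely be applied on ingress-nginx with cookie affinity. This is an untested sketch, assuming the ingress-nginx controller and the chart's `ingress.enabled` and `ingress.annotations` keys; the cookie name is a hypothetical choice:

```yaml
# values.yaml (sketch; assumes ingress-nginx, cookie name is a placeholder)
ingress:
  enabled: true
  annotations:
    # pin each client to one pod via a cookie set by the ingress controller
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent"
    nginx.ingress.kubernetes.io/session-cookie-name: "nextcloud-affinity"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
```

Like the Traefik variant, this only keeps a client on the same pod. It does not share sessions between replicas.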

Shouldn't Redis be handling sessions? Is there something that needs to be configured in php.ini, as per the docs here:

https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/caching_configuration.html

"Using the Redis session handler

If you are using Redis for locking and/or caching, you may also wish to use Redis for session management. Redis can be used for centralized session management across multiple Nextcloud application servers, unlike the standard files handler. If you use the Redis handler, though, you MUST ensure that session locking is enabled. As of this writing, the Redis session handler does NOT enable session locking by default, which can lead to session corruption in some Nextcloud apps that make heavy use of session writes such as Talk. In addition, even when session locking is enabled, if the application fails to acquire a lock, the Redis session handler does not currently return an error. Adding the following settings in your php.ini file will prevent session corruption when using Redis as your session handler:"

```ini
redis.session.locking_enabled=1
redis.session.lock_retries=-1
redis.session.lock_wait_time=10000
```
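
In chart terms, these settings could presumably be supplied as an extra PHP ini file. The following is an untested sketch, assuming the chart's `nextcloud.phpConfigs` key; the file name, Redis host, and password are hypothetical placeholders, and the `session.save_handler` and `session.save_path` lines are an assumption about what is needed to point PHP sessions at Redis in the first place:

```yaml
# values.yaml (sketch; file name, host, and password are placeholders)
nextcloud:
  phpConfigs:
    redis-session.ini: |-
      ; assumption: store PHP sessions in Redis instead of local files
      session.save_handler = redis
      session.save_path = "tcp://nextcloud-redis-master:6379?auth=changeme"
      ; locking settings quoted from the Nextcloud docs above
      redis.session.locking_enabled = 1
      redis.session.lock_retries = -1
      redis.session.lock_wait_time = 10000
```

Whether this alone fixes logins with multiple replicas has not been confirmed in this thread.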

If this is the solution, should this issue be reopened so it can be tracked and tested? I can't make a PR because I'm not a coder, and I have no idea how to make the fix.

2fst4u commented 6 months ago

A similar resolution is here:

https://github.com/nextcloud/helm/issues/173#issuecomment-1063819313

But it only provides an example for a Redis cluster and doesn't seem to match what the Nextcloud docs say.

I tried adding a Redis PHP config file to the Helm values like they did, but using the settings from the docs, and it doesn't seem to have resolved the issue.