nextcloud / helm

A community maintained helm chart for deploying Nextcloud on Kubernetes.

nextcloud using hpa deadlock #230

Open lion24 opened 2 years ago

lion24 commented 2 years ago

Hi there,

I've set up a PVC with the "ReadWriteMany" accessMode for the Nextcloud data directory over NFS. The NFS share is exposed by TrueNAS, using the democratic-csi driver for Kubernetes: https://jonathangazeley.com/2021/01/05/using-truenas-to-provide-persistent-storage-for-kubernetes/

It seems that when configuring Nextcloud with the following values, using multiple replicas:

replicaCount: 3
hpa:
  cputhreshold: 60
  enabled: true
  maxPods: 10
  minPods: 3

there is a kind of deadlock between pods waiting for each other to release a lock file:

+ [ ! -f /var/www/html/nextcloud-init-sync.lock ]                                                                                                                                                                         
+ count=2
+ wait=20
+ echo Another process is initializing Nextcloud. Waiting 20 seconds...
+ sleep 20
Another process is initializing Nextcloud. Waiting 20 seconds...

The initialization never completes, and after a while the pods enter a CrashLoopBackOff state, which prevents Nextcloud from starting properly.

My understanding is that the first pod launched acquires the lock, preventing the other pods from syncing the html folder. Once this pod has finished syncing the html folder, the initialization is assumed to be complete. Could it be that this rsync task takes too long to finish and times out?
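
For reference, the behaviour in the log above corresponds roughly to the locking pattern below (a minimal sketch, not the actual image entrypoint; the retry count and wait times are assumptions):

lock=/var/www/html/nextcloud-init-sync.lock

if [ ! -f "$lock" ]; then
    touch "$lock"                                  # first pod: acquire the init lock
    rsync -a /usr/src/nextcloud/ /var/www/html/    # copy the Nextcloud sources onto the shared PVC
    rm -f "$lock"                                  # release the lock once the sync is done
else
    count=0
    while [ -f "$lock" ] && [ "$count" -lt 10 ]; do
        count=$((count + 1))
        echo "Another process is initializing Nextcloud. Waiting 20 seconds..."
        sleep 20
    done
fi

If the initializing pod is killed between the touch and the rm, every replica ends up stuck in the else branch.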

I don't know if that rings a bell for anyone?

Cheers.

lion24 commented 2 years ago

Yeah, I know what's going on. I increased the verbosity and added some more debug logs.

apps/files/js/dist/
apps/files/js/dist/main.js
        580,579 100%  362.98kB/s    0:00:01 (xfr#9506, ir-chk=1048/12720)
apps/files/js/dist/main.js.map
      2,060,484 100%  750.82kB/s    0:00:02 (xfr#9507, ir-chk=1047/12720)
apps/files/js/dist/personal-settings.js
<killed>

It seems that the rsync is taking too long to sync the content of /usr/src/nextcloud into /var/www/html and gets killed (by the orchestrator?) because the pods take too long to reach the ready state. This is probably because of the rsync + chown + NFS combo.
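
One way to check whether the pod is really being killed by a failing probe rather than crashing on its own (the namespace and pod name below are placeholders):

kubectl -n nextcloud describe pod <nextcloud-pod> | grep -iEA5 'liveness|readiness|startup'
kubectl -n nextcloud get events --sort-by=.metadata.creationTimestamp | tail -n 20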

I will look into how I can tune my NFS share on TrueNAS and post the results here.

jcoulter commented 2 years ago

I am running into the same error with a much simpler setup, on a VM I am trying to provision with Ansible. I had it running successfully with /var/www/html mounted over NFS from my NAS, and I broke it by trying to migrate my working docker-compose configuration to Ansible managing the Docker containers directly. I wonder if some file is locked in /var/www/html (I've blown away and reprovisioned just about everything except that mount and the MariaDB mount).

lion24 commented 2 years ago

@jcoulter Hello, yes the culprit is clearly the NFS share plus the large number of small files to sync. The first container spawned acquires a lock (by touching a lock file), and the others wait for it to finish its job by monitoring the presence of this file. The issue is that if the rsync does not finish in time, the container gets killed, the lock file is not cleaned up on the shared persistent storage, and hence the process never completes.
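
In that state the stale lock has to be removed by hand before any replica can make progress; something along these lines (namespace, pod name and NFS path are placeholders):

kubectl -n nextcloud exec <nextcloud-pod> -- rm -f /var/www/html/nextcloud-init-sync.lock
# or, directly on a host that has the share mounted:
rm -f /mnt/nextcloud-html/nextcloud-init-sync.lock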

Sending a single big file over the NFS share with rsync is really a breeze; there the bottleneck is clearly the SSD:

lionel@pve:~$ time rsync -rlDog --progress -v 5000M /mnt/pve/proxmox-fast/
sending incremental file list
5000M
  5,242,880,000 100%  618.20MB/s    0:00:08 (xfr#1, to-chk=0/1)

sent 5,244,160,101 bytes  received 35 bytes  616,960,016.00 bytes/sec
total size is 5,242,880,000  speedup is 1.00

real    0m8.420s
user    0m3.735s
sys     0m7.803s

The problem here is clearly the NFS + rsync + lots of small files combo. Since there are so many files, maybe rsync makes too many syscalls and does too much context switching, slowing down the copy process?
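
A quick way to test the small-files hypothesis on the same mount is to compare one big file against thousands of tiny ones (sizes, counts and destination paths below are just placeholders):

dd if=/dev/urandom of=/tmp/bigfile bs=1M count=1024          # one 1 GiB file
mkdir -p /tmp/smallfiles
for i in $(seq 1 10000); do
    head -c 4096 /dev/urandom > "/tmp/smallfiles/file_$i"    # 10k x 4 KiB files
done

time rsync -rlDog /tmp/bigfile     /mnt/pve/proxmox-fast/bench-big/
time rsync -rlDog /tmp/smallfiles/ /mnt/pve/proxmox-fast/bench-small/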

I'll keep digging down the rabbit hole 😀