Closed mikeyGlitz closed 3 years ago
I have also been trying to get this install to work with a PV and PVC, with no luck. If I do it without a PV and PVC it works, but as soon as I enable the PV it says the nextcloud directory isn't found, so I create the directory. Then it says `Error: failed to create subPath directory for volumeMount "nextcloud-data" of container "nextcloud"`. Does anyone have any ideas about this?
I am having the same issue. I also use `nfs-client` as the storageClass, which might cause this bug? IIRC I used a manually created PV some time back and it worked.
Have you figured out how to make this work?
Not sure if we are having the same issue, but I will detail my investigation so far into using `persistence.existingClaim`, in case it helps people progress in their own investigations, and/or in case the context helps someone more knowledgeable provide some help, as I have only worked with k8s for a year or so.
From what I could see, the container creation process errors out with:
```
Error: failed to start container "nextcloud": Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/var/lib/kubelet/pods/49c19090-14d6-4bee-b774-ca24b0ddd259/volume-subpaths/jun30third-nextcloud-data-pv/nextcloud/0\\\" to rootfs \\\"/var/lib/docker/overlay2/40dca10bcad3a57d61d35d40d0bd897f6d2322c3a5d9f615d2a90a38d7fe4cd5/merged\\\" at \\\"/var/lib/docker/overlay2/40dca10bcad3a57d61d35d40d0bd897f6d2322c3a5d9f615d2a90a38d7fe4cd5/merged/var/www\\\" caused \\\"no such file or directory\\\"\"": unknown
```
I've looked on the node at the time of the directory creation, and some things to note: `(.*)/merged` is created when the container is being spun up, but I could never see a `merged` directory inside it (although I didn't have the container ID beforehand, so I relied on `watch` commands and manually looking on the node; I can't guarantee it was never there, I just know I could never see it there).

The only lead I've found so far on why this might be happening is https://github.com/kubernetes/kubernetes/issues/61545#issuecomment-465887014 and the following comment's links: https://github.com/kubernetes/kubernetes/issues/61563#issuecomment-428364190. My guess is that this is related to the second issue in the last comment (i.e. https://github.com/kubernetes/kubernetes/issues/61545), given that the config mounts are nested inside the directory mount. However, given that the error is on subPath `/nextcloud/0` of the container (which I have verified is the `root` subPath), this might not be true, but it's my best lead so far.
I'm currently poking around by manually changing specifications to see if any configuration works (i.e. trying different variations of the mount-path nesting to see if I can get it to start up manually before figuring out how to correct the chart). In the meantime, if anyone else finds a solution and/or it seems I'm going down the wrong trail, please let me know!
Update: it is not the configmap causing this in my case, it's the nested mounts: https://github.com/helm/charts/blob/master/stable/nextcloud/templates/deployment.yaml#L289. Additionally, the problem only appears after the first restart (it seems the first time it can do the mounting, but once things get written to the volumes and the container restarts, the bind mounts fail for the new container with the above error). This problem might be specific to our storage class (we're using an RClone CSI driver which FUSE-mounts an S3 bucket) and different from yours; I haven't tried it with an NFS layer on top yet to confirm. This does seem to be different from what you're seeing, though... (sorry for hijacking your issue).
In case this comes up for anyone else: the current workaround is keeping only the root directory mount (which is enough to back up everything else, since the other directories are nested inside it), and that seems to fix the problem.
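To illustrate the workaround, here is a rough sketch of the change in the deployment's `volumeMounts` (the subPath names are taken from the directory listing later in this thread; treat the exact paths as an approximation of the chart template linked above, not a verbatim copy):

```
# Before: the root mount plus nested subPath mounts under the same volume
volumeMounts:
  - name: nextcloud-data
    mountPath: /var/www/
    subPath: root
  - name: nextcloud-data
    mountPath: /var/www/html/config
    subPath: config
  - name: nextcloud-data
    mountPath: /var/www/html/data
    subPath: data

# Workaround: keep only the root mount; config/ and data/
# live inside it on the same volume anyway
volumeMounts:
  - name: nextcloud-data
    mountPath: /var/www/
    subPath: root
```

Since all the subPaths come from the same PVC, dropping the nested mounts loses no data; it only removes the extra bind mounts that fail after a restart.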
Okay, I got it working! I am using an Open Media Vault NFS share for all of my persistent volumes. I set them up with the following settings and it now works without any issues when using the regular helm install, no extra steps required.
Settings for the NFS share:
It also works with `nfs-server-provisioner` (https://github.com/helm/charts/tree/master/stable/nfs-server-provisioner) with the expected values.
Specific values we're using:

```shell
helm install nfs-provisioner stable/nfs-server-provisioner \
  --namespace myns \
  --set persistence.enabled=true \
  --set persistence.storageClass="ebs" \
  --set persistence.size=100Gi \
  --set storageClass.create=true \
  --set storageClass.reclaimPolicy="Delete" \
  --set storageClass.allowVolumeExpansion=true
```
and this NextCloud values snippet:

```yaml
persistence:
  enabled: true
  storageClass: nfs
  accessMode: "ReadWriteMany"
```
I'll open a separate issue for the `existingClaim` problem.
> Okay, I got it working! I am using an Open Media Vault NFS share for all of my persistent volumes. I set them up with the following settings and it now works without any issues when using the regular helm install, no extra steps required. Settings for the NFS share: `rw,no_root_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000`
Tried changing the line in my `/etc/exports` accordingly and it didn't fix the problem.
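For reference, a full `/etc/exports` entry with those options would look roughly like this (the export path matches the `nfs.path` used below; the client subnet is a placeholder you'd replace with your cluster's network):

```
/mnt/external  172.16.0.0/24(rw,no_root_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000)
```

After editing, reload the exports with `exportfs -ra` on the NFS server so the new options take effect without restarting the service.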
Using the following snippet in `nfs-client-provisioner.values.yaml` (note that `path` belongs under the `nfs:` key):

```yaml
nfs:
  mountOptions:
    - nfsvers=4
  server: 172.16.0.1
  path: /mnt/external
```
I updated my nextcloud values with the new value `persistence.accessMode=ReadWriteMany`. That also didn't work.
I have the following directories in my volume:

```
drwxrwxrwx 9 root     root 4096 Jul  4 01:04 ./
drwxr-xr-x 7 root     root 4096 Jul  4 01:09 ../
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 config/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 custom_apps/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 data/
drwxrwxrwx 8 www-data root 4096 Jul  4 01:08 html/
drwxrwxrwx 4 root     root 4096 Jul  4 01:04 root/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 themes/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 tmp/
```
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Got the same problem. Tested with versions 17.0.0-apache and 19.0.1-apache. Also seeing that the dirs are `root:root`. When we deploy without a PVC, the installation works.
Using nfs-client-provisioner works, but the main problem is that the initial rsync takes around 5 minutes to complete (at least in my tests using GCP Filestore). You can look at the `entrypoint.sh` file:

```shell
rsync -rlDog --chown www-data:root --delete --exclude-from=/upgrade.exclude /usr/src/nextcloud/ /var/www/html/
```
If you disable the readiness and liveness probes in the values, it works.
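A minimal values snippet for that would look like the following (the stable/nextcloud chart exposes `enabled` flags for both probes; double-check the key names against your chart version):

```yaml
livenessProbe:
  enabled: false
readinessProbe:
  enabled: false
```

Once the initial install/rsync has finished, the probes can be re-enabled with a `helm upgrade` using the original values.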
```
❯ k logs nextcloud-5756597dbc-nhg5m
Initializing nextcloud 17.0.8.1 ...
Initializing finished
New nextcloud instance
Installing with PostgreSQL database
starting nextcloud installation
Nextcloud was successfully installed
setting trusted domains…
System config value trusted_domains => 1 set to string XXXXXXXX
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.192.149.41. Set the 'ServerName' directive globally to suppress this message
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.192.149.41. Set the 'ServerName' directive globally to suppress this message
[Tue Aug 11 08:54:50.097547 2020] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.38 (Debian) PHP/7.3.21 configured -- resuming normal operations
[Tue Aug 11 08:54:50.097621 2020] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
```
I've been trying some alternatives to that rsync, but since there are a lot of small files to copy, I haven't found any improvement. Any ideas?
The log looks stuck at `Initializing Nextcloud 17.0.7...` because the rsync process is extremely slow (for my local NFS it is about 1.5 MB/s; you can show the progress with `rsync --info=progress2`). Worse, the liveness probe will continuously fail and the pod finally ends up in CrashLoopBackOff.
Like jesussancheztellomm's workaround, I disable the liveness probe on first installation and re-enable it after the installation finishes.
Maybe we can refer to nextcloud/docker#968. It will not solve the problem of the slow NFS transfer speed (I still have no idea why it's so slow...), but a stateless application may remove the rsync step entirely.
@timtorChen I can confirm: when I disabled the liveness probe it took 11 minutes to sync. I also tried it with an S3 storage backend, and it took just seconds to sync.
So I looked deeper into my NFS setup, and we are exporting with `sync` instead of `async` because we don't want to lose any data. I didn't test it with an `async` export.
The nextcloud chart has migrated to a new repo. Can you please raise the issue over there? https://github.com/nextcloud/helm
Opened an incident over on the new repo. Tried to summarize some of the info from this discussion.
This issue is being automatically closed due to inactivity.
Describe the bug
When the helm chart is bringing up NextCloud, the application does not get past the log message
Version of Helm and Kubernetes:
```
helm: v3.2.1
kubernetes: v1.18.4+k3s1
```
Which chart:
stable/nextcloud
What happened:
- Namespace is created
- Helm creates the persistent-volume-claim
- Helm instantiates MariaDB using the bitnami/mariadb chart
- Helm instantiates the Nextcloud container
- Nextcloud container starts
- Nextcloud container does not get past
What you expected to happen:
Nextcloud was supposed to finish initialization. Nextcloud files were supposed to be copied with the correct permissions to the PVC.
How to reproduce it (as minimally and precisely as possible):
Initialize helm with the following:
values.yaml