rancher / os

Tiny Linux distro that runs the entire OS as Docker containers
https://rancher.com/docs/os/v1.x/en/
Apache License 2.0

Problems Mounting NFS volumes #2456

Open MaxDiOrio opened 6 years ago

MaxDiOrio commented 6 years ago

RancherOS v1.4.0, amd64, kernel 4.14.32-rancher2, running on VMware.

Sometimes NFS volumes mount perfectly fine, other times...

MountVolume.SetUp failed for volume "pvc-dba9311c-a7c4-11e8-b39a-00505685234f" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f /opt/rke/var/lib/kubelet/pods/46471c56-a7cc-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f
Output: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
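The error above says that rpc.statd is not running. A minimal sketch for checking that on the node, assuming a Linux host with /proc mounted; the function name `statd_status` is made up for illustration, and a check run inside a container will only see that container's processes:

```shell
# Report whether a process named rpc.statd is visible in /proc.
# /proc/<pid>/comm holds the short process name, one per file.
statd_status() {
  # -x: match the whole line, -F: treat the pattern as a fixed string,
  # -s: stay quiet about processes that exit while we scan, -q: no output.
  if grep -qsxF 'rpc.statd' /proc/[0-9]*/comm; then
    echo "running"
  else
    echo "not running"
  fi
}

statd_status
```

If this prints "not running", that matches the message from mount.nfs; it does not by itself explain why statd is missing after a reboot.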

If I set a mount option of nolock on the volume:

MountVolume.SetUp failed for volume "pvc-b9b65f81-a7cf-11e8-b39a-00505685234f" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o nolock la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-nginx1-nfs-pvc-b9b65f81-a7cf-11e8-b39a-00505685234f /opt/rke/var/lib/kubelet/pods/a9d7d0ad-a7d5-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-b9b65f81-a7cf-11e8-b39a-00505685234f
Output: mount.nfs: access denied by server while mounting la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-nginx1-nfs-pvc-b9b65f81-a7cf-11e8-b39a-00505685234f

This is after rebooting the worker node as well. Two other pods with PVCs on the same NFS export are working well.

I was able to successfully bring up the PVC (read/write many) on the pod once. I then tried to scale out and add a node, then got the above error on the second pod. After trying to scale back, it hung on deleting the pod - I had to manually kill it. I then rebooted the worker node and got the first error immediately upon restart.

Doing a mount and grepping for the volume shows nothing mounted.
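The check described above can be sketched roughly as follows, assuming a Linux node where /proc/mounts is readable; the helper name `is_mounted` is made up for illustration, and the PVC name is the one from the error message:

```shell
# Return 0 if any entry in the kernel mount table contains the given string.
is_mounted() {
  # -F: fixed-string match, -q: no output, only an exit status.
  grep -qF -- "$1" /proc/mounts
}

# Look for the PVC from the failing pod.
if is_mounted "pvc-b9b65f81-a7cf-11e8-b39a-00505685234f"; then
  echo "mounted"
else
  echo "not mounted"
fi
```

Reading /proc/mounts gives the same information as piping `mount` through `grep`, without depending on the output format of the `mount` binary.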

Trying to mount it from the command line fails, either with permission denied or with:

[rancher@la-1tkube-w2 docker]$ sudo mount -t nfs -o nolock la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-nginx1-nfs-pvc-b9b65f81-a7cf-11e8-b39a-00505685234f /opt/rke/var/lib/kubelet/pods/a9d7d0ad-a7d5-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-b9b65f81-a7cf-11e8-b39a-00505685234f
mount: mounting la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-nginx1-nfs-pvc-b9b65f81-a7cf-11e8-b39a-00505685234f on /opt/rke/var/lib/kubelet/pods/a9d7d0ad-a7d5-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-b9b65f81-a7cf-11e8-b39a-00505685234f failed: No such file or directory

Any ideas? This doesn't seem like a usable option for persistent storage right now.

rootwuj commented 6 years ago

@niusmallnan I reproduced this issue.

  1. I launched RancherOS 1.4.0 on AWS and used Rancher 2.0.8 to add a cluster.
  2. Using NFS as the storage, I created a StatefulSet or Deployment, and it worked fine.
  3. Then I rebooted the node. The pod status became Error and the error message above appeared. Scaling out did not succeed, but I could still create a new StatefulSet or Deployment successfully.

I did the same thing on an Ubuntu system, and rebooting had no effect there.

niusmallnan commented 6 years ago

@rootwuj Is there still a problem after switching to the Ubuntu console? I suspect it is caused by the default console cleaning up data after reboot.

rootwuj commented 6 years ago

@niusmallnan Yes, I am testing on the Ubuntu console, and I still hit this issue when I reboot the node.

mshivanna commented 5 years ago

I have been facing the same issue on Proxmox VMs.

kingsd041 commented 5 years ago

Hi @mshivanna, I can't reproduce it. Can you give more detailed steps?