gluster / gluster-csi-driver

DEPRECATED: Gluster Container Storage Interface (CSI) driver

PVC Mount to app pod is failing #123

Closed: rmadaka closed this issue 5 years ago

rmadaka commented 5 years ago

Describe the bug
PVC mount is failing on app pods, both with and without brick-mux enabled.

Steps to reproduce the behavior:

-> Created a 1GB PVC; the PVC was created and bound successfully.
-> Then tried to create an app pod with the PVC mounted (a sketch of the manifests follows).
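
For reference, a minimal sketch of the two objects involved, applied as a heredoc from the master node. The storage class name glusterfs-csi, the image, and the mount path are assumptions, not taken from the original report; the pod name redis2 and the volume name gcscsi come from the output below:

[vagrant@kube1 ~]$ kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcscsi
spec:
  storageClassName: glusterfs-csi   # assumed name of the class backed by this CSI driver
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi                  # the 1GB PVC from the steps above
---
apiVersion: v1
kind: Pod
metadata:
  name: redis2
spec:
  containers:
  - name: redis2
    image: redis                    # assumed image
    volumeMounts:
    - name: gcscsi
      mountPath: /data              # assumed mount path
  volumes:
  - name: gcscsi
    persistentVolumeClaim:
      claimName: gcscsi             # assumed claim name
EOF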

[vagrant@kube1 ~]$ kubectl get pods
NAME     READY   STATUS              RESTARTS   AGE
redis2   0/1     ContainerCreating   0          19m

-> Then checked the app pod with kubectl describe; the error messages below were found.

  Type     Reason                  Age    From                     Message
  ----     ------                  ----   ----                     -------
  Normal   Scheduled               8m46s  default-scheduler        Successfully assigned default/redis2 to kube3
  Normal   SuccessfulAttachVolume  8m45s  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4"
  Warning  FailedMount             8m42s  kubelet, kube3           MountVolume.SetUp failed for volume "pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4" : rpc error: code = Internal desc = mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t glusterfs gluster-kube1-0.glusterd2.gcs:pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4 /var/lib/kubelet/pods/87a58f40-02b3-11e9-b39f-5254006ba3b4/volumes/kubernetes.io~csi/pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4/mount
Output: WARNING: getfattr not found, certain checks will be skipped..
Mount failed. Please check the log file for more details.
  Warning  FailedMount  2m11s (x3 over 6m42s)  kubelet, kube3  Unable to mount volumes for pod "redis2_default(87a58f40-02b3-11e9-b39f-5254006ba3b4)": timeout expired waiting for volumes to attach or mount for pod "default"/"redis2". list of unmounted volumes=[gcscsi]. list of unattached volumes=[gcscsi default-token-qgwkr]
  Warning  FailedMount  28s (x11 over 8m41s)   kubelet, kube3  MountVolume.SetUp failed for volume "pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4" : stat /var/lib/kubelet/pods/87a58f40-02b3-11e9-b39f-5254006ba3b4/volumes/kubernetes.io~csi/pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4/mount: transport endpoint is not connected
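
The same mount can be attempted by hand to take the kubelet out of the picture, using the exact arguments from the FailedMount event above (a debugging sketch, not from the original report; the /mnt/test target and running it inside the nodeplugin container are assumptions):

[root@csi-nodeplugin-glusterfsplugin-fbskt /]# mkdir -p /mnt/test
[root@csi-nodeplugin-glusterfsplugin-fbskt /]# mount -t glusterfs gluster-kube1-0.glusterd2.gcs:pvc-257a65b7-02b3-11e9-b39f-5254006ba3b4 /mnt/test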

Actual results

Same events as above: the pod stays in ContainerCreating and the FailedMount messages repeat until the mount times out.

Expected behavior
The PVC should mount successfully to the app pod.

amarts commented 5 years ago

@Madhu-1 I remember we had this bug before... we needed to install the getfattr tool, right? What else?
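
On CentOS the getfattr binary ships in the attr package, so a quick check inside the nodeplugin container would look roughly like this (a sketch, not from the original thread):

[root@csi-nodeplugin-glusterfsplugin-fbskt /]# rpm -q attr || yum -y install attr
[root@csi-nodeplugin-glusterfsplugin-fbskt /]# which getfattr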

Madhu-1 commented 5 years ago

@amarts I am not sure this is related to the getfattr tool; even with that tool missing we were able to mount volumes earlier. We need to check what causes the "transport endpoint is not connected" error, which typically means the fuse client process died after the mount point was set up.
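
The glusterfs fuse client writes a per-mount log under /var/log/glusterfs/ in whichever container runs the mount (here the nodeplugin container), named after the mount path with slashes replaced by dashes; a sketch for pulling it (the glob below is illustrative, not an exact file name from this cluster):

[root@csi-nodeplugin-glusterfsplugin-fbskt /]# ls /var/log/glusterfs/
[root@csi-nodeplugin-glusterfsplugin-fbskt /]# tail -n 50 /var/log/glusterfs/var-lib-kubelet-pods-*-mount.log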

Madhu-1 commented 5 years ago

Mount logs from the gluster nodeplugin:

[2018-12-18 10:27:57.413639] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.5 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=gluster-kube1-0.glusterd2.gcs --volfile-id=pvc-5b927ac6-02aa-11e9-b39f-5254006ba3b4 /var/lib/kubelet/pods/950e2124-02af-11e9-b39f-5254006ba3b4/volumes/kubernetes.io~csi/pvc-5b927ac6-02aa-11e9-b39f-5254006ba3b4/mount)
[2018-12-18 10:27:57.435476] W [MSGID: 101012] [common-utils.c:3139:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info [No such file or directory]
[2018-12-18 10:27:57.435588] W [MSGID: 101081] [common-utils.c:3176:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
[2018-12-18 10:27:57.436312] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-12-18 10:27:57.450551] I [glusterfsd-mgmt.c:1923:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: gluster-kube1-0.glusterd2.gcs:24007 gluster-kube3-0.glusterd2.gcs:24007 gluster-kube2-0.glusterd2.gcs:24007
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2018-12-18 10:27:57
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.5
/lib64/libglusterfs.so.0(+0x25940)[0x7f8abbf14940]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f8abbf1e894]
/lib64/libc.so.6(+0x362f0)[0x7f8aba5792f0]
/lib64/libc.so.6(+0xcce88)[0x7f8aba60fe88]
/lib64/libc.so.6(fnmatch+0x63)[0x7f8aba611563]
/lib64/libglusterfs.so.0(xlator_volume_option_get_list+0x42)[0x7f8abbf6d7d2]
/lib64/libglusterfs.so.0(+0x7e85f)[0x7f8abbf6d85f]
/lib64/libglusterfs.so.0(dict_foreach_match+0x77)[0x7f8abbf0b557]
/lib64/libglusterfs.so.0(dict_foreach+0x18)[0x7f8abbf0b708]
/lib64/libglusterfs.so.0(xlator_options_validate_list+0x3f)[0x7f8abbf6da2f]
/lib64/libglusterfs.so.0(xlator_options_validate+0x41)[0x7f8abbf6dab1]
/lib64/libglusterfs.so.0(+0x5f999)[0x7f8abbf4e999]
/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x28)[0x7f8abbf4f358]
/usr/sbin/glusterfs(glusterfs_process_volfp+0x119)[0x56359c42c639]
/usr/sbin/glusterfs(mgmt_getspec_cbk+0x6da)[0x56359c432fba]
/lib64/libgfrpc.so.0(+0xec20)[0x7f8abbce1c20]
/lib64/libgfrpc.so.0(+0xefb3)[0x7f8abbce1fb3]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f8abbcdde93]
/usr/lib64/glusterfs/4.1.5/rpc-transport/socket.so(+0x7636)[0x7f8ab6df1636]
/usr/lib64/glusterfs/4.1.5/rpc-transport/socket.so(+0xa107)[0x7f8ab6df4107]
/lib64/libglusterfs.so.0(+0x890b4)[0x7f8abbf780b4]
/lib64/libpthread.so.0(+0x7e25)[0x7f8abad78e25]
/lib64/libc.so.6(clone+0x6d)[0x7f8aba641bad]
---------
[root@csi-nodeplugin-glusterfsplugin-fbskt /]# rpm -qa |grep gluster
centos-release-gluster41-1.0-3.el7.centos.noarch
glusterfs-libs-4.1.5-1.el7.x86_64
glusterfs-4.1.5-1.el7.x86_64
glusterfs-client-xlators-4.1.5-1.el7.x86_64
glusterfs-fuse-4.1.5-1.el7.x86_64

RCA: this issue is caused by the older glusterfs-fuse client (4.1.5) in the nodeplugin image. The backtrace shows it crashing (signal 11) in xlator option validation while processing the volfile served by the newer glusterd2, which leaves the mount point with "transport endpoint is not connected". The CSI driver needs to be updated to use the nightly-built gluster packages.
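
A hypothetical shape of that fix inside the nodeplugin image build: drop the GA 4.1 repo and pull the client packages from a nightly repo instead (the repo file and URL below are placeholders, not the actual repo used in the fix):

yum -y remove centos-release-gluster41
curl -o /etc/yum.repos.d/glusterfs-nightly.repo http://example.com/glusterfs-nightly.repo   # placeholder URL
yum -y install glusterfs-fuse glusterfs glusterfs-libs glusterfs-client-xlators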

@amarts @aravindavk Thanks for helping to find the actual issue.