Closed ook closed 4 years ago
FYI, I upgraded glusterfs-client to glusterfs 8 on another node (ubuntu18.04 glusterfs-client=8.0-ubuntu1~bionic9) and the error doesn't occur: it's fixed into glusterfs8
I recommand to backport the fix into Gluster7.
I have tried to reproduce the issue somehow i am not able to reproduce it, i have used below command to mount the volume. mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.70.42.246:10.70.41.154:10.70.42.63 10.70.41.154:/testvol /mnt Can you please share complete mount log file while the client was crashing.
Here the mountlog:
Please note that the error E [glusterfsd.c:834:gf_remember_backup_volfile_server] 0-glusterfs: failed to set volfile server: File exists
appears only on STDERR and not into the mountlog.
From the logs it is clear client was crashing because the script(/usr/sbin/mount.glusterfs) was not able to parsed the arguments and volfile-server value is NULL while calling a process, i am not able to find any change recently merged in this script.Can you please share the script available on your system while the client was crashing? --process-name fuse --volfile-server= --volfile-server= --volfile-server= --volfile-server= --volfile-server= --volfile-id=hek_default_irssi_c18fe342-a546-4f9e-93ca-fcc5988f5b99 --fuse-mountopts=auto_unmount /var/lib/kubelet/pods/946bd6ab-5323-40cd-9057-d553d92ce707/volumes/
or you can compare yourself the differences between the script glusterfs(7/8) version
Here the script (from /sbin/mount.glusterfs) (extension txt added to bypass GH restriction)
I confirm than dropping /sbin/mount.glusterfs
from v8 in place of v7 remove the error. Checking it solves the problem even at k8s level.
It seems this is the patch (https://review.gluster.org/#/c/glusterfs/+/22884/) that we changed the server_ip parsed command but the patch is not merged in release-7. Can you try to replace the server_ip command in your release-7 mount script and see the behavior. server_ip=$(echo "$volfile_loc" | sed -n 's/([a-zA-Z0-9:%,.-]):./\1/p');
I tried by adding the comma and it didn't remove the error. That's another part of the script that "solve" the issue.
May be this-one https://review.gluster.org/#/c/glusterfs/+/24157/
Daf*q!
Sorry I made a mistake and didn't check the return value: with mount.gluster from v8, it fails (error 1) and need to set log level to debug to get something into mount log file:
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2020-08-13 11:16:56
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.5
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x237aa)[0x7fce624ee7aa]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2f7)[0x7fce624f8517]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060)[0x7fce60b40060]
/lib/x86_64-linux-gnu/libc.so.6(+0xbf944)[0x7fce60bcc944]
/lib/x86_64-linux-gnu/libc.so.6(fnmatch+0x61)[0x7fce60bcde41]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(xlator_volume_option_get_list+0x35)[0x7fce62541905]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7698e)[0x7fce6254198e]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_foreach_match+0x87)[0x7fce624e5b97]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_foreach+0x18)[0x7fce624e5d78]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(xlator_options_validate_list+0x3f)[0x7fce62541b4f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(xlator_options_validate+0x39)[0x7fce62541bc9]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x5ab99)[0x7fce62525b99]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_activate+0x24)[0x7fce625264d4]
/usr/sbin/glusterfs(glusterfs_process_volfp+0x100)[0x562bfd89ff40]
/usr/sbin/glusterfs(mgmt_getspec_cbk+0x6bd)[0x562bfd8a689d]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe110)[0x7fce622bd110]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe44f)[0x7fce622bd44f]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fce622b9863]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/rpc-transport/socket.so(+0x6cfb)[0x7fce5d3bacfb]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/rpc-transport/socket.so(+0x9605)[0x7fce5d3bd605]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8052e)[0x7fce6254b52e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7fce6134d4a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fce60bf5d0f]
---------
So script from v8 doesn't fix the problem on v7.
For the time being you can mount the volume after call /usr/sbin/glusterfs directly like this /usr/sbin/glusterfs --log-level=INFO --log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-46fbdf01-92ec-4a4f-acd3-328c1c5ed9d6/irssi-74bd8cc7d5-2xrqz-glusterfs.log --fuse-mountopts=auto_unmount --process-name fuse --volfile-server=IP1 --volfile-server=IP2 --volfile-server=IP3 --volfile-server=IP4 --volfile-id=hek_default_irssi_c18fe342-a546-4f9e-93ca-fcc5988f5b99 --fuse-mountopts=auto_unmount /var/lib/kubelet/pods/946bd6ab-5323-40cd-9057-d553d92ce707/volumes/ kubernetes.io~glusterfs/pvc-46fbdf01-92ec-4a4f-acd3-328c1c5ed9d6, In my environment the script is working fine.I will ask more info to troubleshoot it.
Thank you. I'll try to make the script spit out your snippet.
Now I have no problem mounting volume by hand (via /sbin/mount.glusterfs or directly via mount)
From the logs it is clear client was crashing because the script(/usr/sbin/mount.glusterfs)
On my side I just understood that k8s is not using the /sbin/mount.glusterfs script at all :( (I update to echo something and exit 42 /sbin/mount.glusterfs : I have the very same log in k8s) Do you know which command the glusterfs plugin run from inside the kubelet volume manager ?
I think this(https://github.com/gluster/gluster-kubernetes) would be helpful for you.
Thank you, I used that project before it became inactive: it's far too outdated now (stuck to glusterfs3). I was looking for https://github.com/kubernetes/kubernetes/blame/323f34858de18b862d43c40b2cced65ad8e24052/pkg/volume/glusterfs/glusterfs.go#L318 I'll post update here once a solution get found.
So I work around that problem by using project https://github.com/kadalu/kadalu which from my point of view deprecates project https://github.com/gluster/gluster-kubernetes :
Thank you for your help @mohit84 . Closing this one.
Description of problem:
When mounting a glusterfs volume from an IP X, with backup-volfile-servers option containing X, an error is emitted, even if the volume is correctly mounted. Logs display a crash.
Consequence: in a k8s cluster with glusterfs as pvc provider, it leads k8s consider PV mount a failure, and pod can't start.
The exact command to reproduce the issue:
Please note: IP X is used both in mount source and backup-volfile-servers. If you remove IP X from backup-volfile-servers, no error will occur. The mount command is generated by k8s pod operations.
The full output of the command that failed:
From mount logfile:
Expected results:
Volume mounted, without error or crash (backup-volfile-servers IP should be ignored when already in the mount source).
Additional info:
Gluster server 7.7 Gluster client 7.7
- The output of the
gluster volume info
command:- The operating system / glusterfs version:
Glusterfs server (all nodes): Centos7 / Glusterfs 7.7 Client: Ubuntu 16.04 / Glusterfs-client 7.7-ubuntu1~xenial2