jcresp21 closed this issue 3 years ago
Thank you for your contributions. This issue has not had any activity in the last ~6 months, so we are marking it as stale. It will be closed in 2 weeks if no one responds with a comment here.
Closing this issue as there has been no update since my last comment. If the issue is still valid, feel free to reopen it.
Description of problem: The volume was originally a replica 2 with an arbiter, with each brick on a different node. As part of reinstalling one of the nodes (it was upgraded to Ubuntu Bionic), the volume was shrunk to a single brick, and the arbiter also had to be removed. After the node was reinstalled, a new brick from it was added back to the volume, along with an arbiter on a different node. This caused the VMs with qcow2 files on the Gluster volume to remount all their filesystems read-only. The VMs had to be rebooted; after that, the problem stopped.
The exact commands to reproduce the issue: -> Deletion of the brick:
kill -15 <brick_process>
gluster volume remove-brick gv1 replica 1 gluster-2.xcade.net:/mnt/gv_gu2/newbrick gluster-1.xcade.net:/mnt/arbiter/arbiter_brick
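For context, the removal step above can be sketched as the sequence below. This is a hedged reconstruction, not the reporter's exact session: the `pgrep` pattern is an assumption about how the brick PID was found, the volume name `gv2` is taken from the `gluster volume info` output further down (the command above says `gv1`, which looks like a typo), and `force` is the documented way to reduce the replica count with `remove-brick`.

```shell
# Stop the brick process first (pgrep pattern is an assumption;
# the reporter only states 'kill -15 <brick_process>').
kill -15 "$(pgrep -f '/mnt/gv_gu2/newbrick')"

# Reduce from replica 2 + arbiter to a single brick. Reducing the
# replica count requires 'force' in the gluster CLI; volume name gv2
# is taken from the 'gluster volume info' output below.
gluster volume remove-brick gv2 replica 1 \
    gluster-2.xcade.net:/mnt/gv_gu2/newbrick \
    gluster-1.xcade.net:/mnt/arbiter/arbiter_brick force
```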
After the brick was deleted, I can see the following error in the logs, but the VMs kept working without problems:
E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x138)[0x7f3f7bf44848] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7bda)[0x7f3f79867bda] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7d35)[0x7f3f79867d35] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f3f7b6ae6ba] (--> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f3f7b3e441d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
-> Re-adding the bricks:
gluster volume add-brick gv2 replica 3 arbiter 1 gluster-2.xcade.net:/mnt/gv_gu2/newbrick gluster-1.xcade.net:/mnt/arbiter/arbiter_brick
The command did not report any failure; it returned SUCCESS.
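As a hedged aside not in the original report: after `add-brick` succeeds, the new brick and the arbiter are empty and must be healed from the surviving brick, so checking heal progress before putting load back on the volume could look like the following. These are standard gluster CLI commands; the volume name matches the `gluster volume info` output below.

```shell
# Confirm all three bricks are online (processes and ports).
gluster volume status gv2

# List entries still pending self-heal, per brick.
gluster volume heal gv2 info

# Summary count of entries pending heal on each brick.
gluster volume heal gv2 statistics heal-count
```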
- The output of the gluster volume info command:

Volume Name: gv2
Type: Replicate
Volume ID: 8cd2932f-44e4-421c-acec-69de2001f247
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster-3.xcade.net:/mnt/gv_gu2/newbrick
Brick2: gluster-2.xcade.net:/mnt/gv_gu2/newbrick
Brick3: gluster-1.xcade.net:/mnt/arbiter/arbiter_brick (arbiter)
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet

- The operating system / glusterfs version:
2 nodes running Ubuntu Xenial 16.04
1 node (Gluster-2) running Ubuntu Bionic 18.04
All nodes running the latest GlusterFS version, 7.5
- Logs of operation -
[2020-05-12 14:31:22.031090] I [glusterfsd.c:2594:daemonize] 0-glusterfs: Pid of current running process is 7479
[2020-05-12 14:31:22.039603] I [MSGID: 114020] [client.c:2436:notify] 0-gv2-client-1: parent translators are ready, attempting connect on transport
[2020-05-12 14:31:22.040260] I [MSGID: 114020] [client.c:2436:notify] 0-gv2-client-2: parent translators are ready, attempting connect on transport
[2020-05-12 14:31:22.040507] I [MSGID: 114020] [client.c:2436:notify] 0-gv2-client-3: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume gv2-client-1
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host gluster-gu3.xcade.net
  5:     option remote-subvolume /mnt/gv_gu2/newbrick
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option transport.socket.ssl-enabled off
  9:     option transport.tcp-user-timeout 0
 10:     option transport.socket.keepalive-time 20
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-count 9
 13:     option send-gids true
 14: end-volume
 15:
 16: volume gv2-client-2
 17:     type protocol/client
 18:     option ping-timeout 42
 19:     option remote-host gluster-gu2.xcade.net
 20:     option remote-subvolume /mnt/gv_gu2/newbrick
 21:     option transport-type socket
 22:     option transport.address-family inet
 23:     option transport.socket.ssl-enabled off
 24:     option transport.tcp-user-timeout 0
 25:     option transport.socket.keepalive-time 20
 26:     option transport.socket.keepalive-interval 2
 27:     option transport.socket.keepalive-count 9
 28:     option send-gids true
 29: end-volume
 30:
 31: volume gv2-client-3
 32:     type protocol/client
 33:     option ping-timeout 42
 34:     option remote-host gluster-gu1.xcade.net
 35:     option remote-subvolume /mnt/arbiter/arbiter_brick
 36:     option transport-type socket
 37:     option transport.address-family inet
 38:     option transport.socket.ssl-enabled off
 39:     option transport.tcp-user-timeout 0
 40:     option transport.socket.keepalive-time 20
 41:     option transport.socket.keepalive-interval 2
 42:     option transport.socket.keepalive-count 9
 43:     option send-gids true
 44: end-volume
 45:
 46: volume gv2-replicate-0
 47:     type cluster/replicate
 48:     option afr-pending-xattr gv2-client-1,gv2-client-2,gv2-client-3
 49:     option arbiter-count 1
 50:     option use-compound-fops off
 51:     subvolumes gv2-client-1 gv2-client-2 gv2-client-3
 52: end-volume
 53:
 54: volume gv2-dht
 55:     type cluster/distribute
 56:     option lock-migration off
 57:     option force-migration off
 58:     subvolumes gv2-replicate-0
 59: end-volume
 60:
 61: volume gv2-utime
 62:     type features/utime
 63:     option noatime on
 64:     subvolumes gv2-dht
 65: end-volume
 66:
 67: volume gv2-write-behind
 68:     type performance/write-behind
 69:     subvolumes gv2-utime
 70: end-volume
 71:
 72: volume gv2-read-ahead
 73:     type performance/read-ahead
 74:     subvolumes gv2-write-behind
 75: end-volume
 76:
 77: volume gv2-readdir-ahead
 78:     type performance/readdir-ahead
 79:     option parallel-readdir off
 80:     option rda-request-size 131072
 81:     option rda-cache-limit 10MB
 82:     subvolumes gv2-read-ahead
 83: end-volume
 84:
 85: volume gv2-io-cache
 86:     type performance/io-cache
 87:     subvolumes gv2-readdir-ahead
 88: end-volume
 89:
 90: volume gv2-open-behind
 91:     type performance/open-behind
 92:     subvolumes gv2-io-cache
 93: end-volume
 94:
 95: volume gv2-quick-read
 96:     type performance/quick-read
 97:     subvolumes gv2-open-behind
 98: end-volume
 99:
100: volume gv2-md-cache
101:     type performance/md-cache
102:     subvolumes gv2-quick-read
103: end-volume
104:
105: volume gv2
106:     type debug/io-stats
107:     option log-level INFO
108:     option threads 16
109:     option latency-measurement off
110:     option count-fop-hits off
111:     option global-threading off
112:     subvolumes gv2-md-cache
113: end-volume
114:
115: volume meta-autoload
116:     type meta
117:     subvolumes gv2
118: end-volume
119:
+------------------------------------------------------------------------------+
[2020-05-12 14:31:22.041236] I [MSGID: 101190] [event-epoll.c:682:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2020-05-12 14:31:22.041326] I [MSGID: 101190] [event-epoll.c:682:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2020-05-12 14:31:22.041644] E [MSGID: 114058] [client-handshake.c:1455:client_query_portmap_cbk] 0-gv2-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2020-05-12 14:31:22.041686] I [socket.c:865:__socket_shutdown] 0-gv2-client-2: intentional socket shutdown(12)
[2020-05-12 14:31:22.041710] I [rpc-clnt.c:1963:rpc_clnt_reconfig] 0-gv2-client-1: changing port to 49153 (from 0)
[2020-05-12 14:31:22.041723] I [MSGID: 114018] [client.c:2347:client_rpc_notify] 0-gv2-client-2: disconnected from gv2-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2020-05-12 14:31:22.041729] I [socket.c:865:__socket_shutdown] 0-gv2-client-1: intentional socket shutdown(11)
[2020-05-12 14:31:22.041759] E [MSGID: 108006] [afr-common.c:5360:__afr_handle_child_down_event] 0-gv2-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2020-05-12 14:31:22.041986] E [MSGID: 114058] [client-handshake.c:1455:client_query_portmap_cbk] 0-gv2-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2020-05-12 14:31:22.042026] I [socket.c:865:__socket_shutdown] 0-gv2-client-3: intentional socket shutdown(13)
[2020-05-12 14:31:22.042050] I [MSGID: 114018] [client.c:2347:client_rpc_notify] 0-gv2-client-3: disconnected from gv2-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2020-05-12 14:31:22.042069] E [MSGID: 108006] [afr-common.c:5360:__afr_handle_child_down_event] 0-gv2-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2020-05-12 14:31:22.042564] I [MSGID: 114057] [client-handshake.c:1375:select_server_supported_programs] 0-gv2-client-1: Using Program GlusterFS 4.x v1, Num (1298437), Version (400)
[2020-05-12 14:31:22.042748] W [dict.c:999:str_to_data] (-->/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/protocol/client.so(+0x354a1) [0x7f33ca9594a1] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set_str+0x16) [0x7f33d0349a46] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(str_to_data+0x60) [0x7f33d0346560] ) 0-dict: value is NULL [Invalid argument]
[2020-05-12 14:31:22.042769] I [MSGID: 114006] [client-handshake.c:1236:client_setvolume] 0-gv2-client-1: failed to set process-name in handshake msg
[2020-05-12 14:31:22.043551] I [MSGID: 114046] [client-handshake.c:1105:client_setvolume_cbk] 0-gv2-client-1: Connected to gv2-client-1, attached to remote volume '/mnt/gv_gu2/newbrick'.
[2020-05-12 14:31:22.043581] I [MSGID: 108005] [afr-common.c:5283:__afr_handle_child_up_event] 0-gv2-replicate-0: Subvolume 'gv2-client-1' came back up; going online.
[2020-05-12 14:31:22.044846] I [fuse-bridge.c:5166:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2020-05-12 14:31:22.044876] I [fuse-bridge.c:5777:fuse_graph_sync] 0-fuse: switched to graph 0
[2020-05-12 14:31:22.045714] E [fuse-bridge.c:5235:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2020-05-12 14:31:22.047146] I [MSGID: 0] [afr-inode-write.c:1239:_afr_handle_empty_brick] 0-gv2-replicate-0: New brick is : gv2-client-2
[2020-05-12 14:31:22.048076] I [MSGID: 108039] [afr-inode-write.c:1064:afr_emptyb_set_pending_changelog_cbk] 0-gv2-replicate-0: Set of pending xattr succeeded on gv2-client-1.
[2020-05-12 14:31:22.074927] I [fuse-bridge.c:6083:fuse_thread_proc] 0-fuse: initiating unmount of /tmp/mntSKnRMt
[2020-05-12 14:31:22.049184] I [MSGID: 108039] [afr-inode-write.c:1064:afr_emptyb_set_pending_changelog_cbk] 0-gv2-replicate-0: Set of pending xattr succeeded on gv2-client-1.
[2020-05-12 14:31:22.075142] W [glusterfsd.c:1596:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f33cfaba6db] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xfd) [0x55b585b6febd] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55b585b6fd04] ) 0-: received signum (15), shutting down
[2020-05-12 14:31:22.075173] I [fuse-bridge.c:6898:fini] 0-fuse: Unmounting '/tmp/mntSKnRMt'.
[2020-05-12 14:31:22.075191] I [fuse-bridge.c:6903:fini] 0-fuse: Closing fuse connection to '/tmp/mntSKnRMt'.