gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

Gluster 11.1 doesn't appear to be updating port-map to clients, causing a myriad of issues #4274

Closed edrock200 closed 4 months ago

edrock200 commented 5 months ago

Description of problem: Gluster 11.1 does not appear to be updating the port map. If a brick crashes and is restarted, nothing seems to connect to it.

The exact command to reproduce the issue: Kill a brick, then restart the glusterd service to bring it back online.

The full output of the command that failed: Not a command, but brick logs show entries like this:

[2023-12-04 21:18:20.889997 +0000] W [socket.c:754:__socket_rwv] 0-tcp.media-server: readv on ipaddress:48253 failed (No data available)

However, gluster volume status does not show that port in use by any brick.
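The comparison described above (ports named in the brick-log warnings versus the ports that volume status reports) can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the set of known brick ports has to be supplied by hand from the volume status output, and the address in a readv warning may well be the remote peer's endpoint rather than the brick's own listening port.

```python
import re

# Matches the address in messages such as
# "readv on ipaddress:48253 failed (No data available)".
READV_RE = re.compile(r"readv on (?P<host>\S+):(?P<port>\d+) failed")


def ports_in_readv_warnings(log_text):
    """Return the set of ports named in 'readv on host:port failed' messages."""
    return {int(m.group("port")) for m in READV_RE.finditer(log_text)}


def unknown_ports(log_text, known_brick_ports):
    """Return ports seen in the log that are absent from the supplied brick ports.

    known_brick_ports is an assumption of this sketch: the caller copies the
    port numbers out of the volume status output.
    """
    return sorted(ports_in_readv_warnings(log_text) - set(known_brick_ports))
```

For example, passing the warning quoted above together with a hand-collected set of brick ports returns the ports that appear only in the log.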

I suspect this is causing several of the side effects I'm seeing, such as stuck heals and GFIDs not translating to paths in gluster volume heal volume info.

Clients show entries like:

[2023-12-04 21:34:46.516397 +0000] W [MSGID: 122040] [ec-common.c:1262:ec_prepare_update_cbk] 0-media-disperse-30: Failed to get size and version : FOP : 'XATTROP' failed on '/path/to/file' with gfid 5dda8738-87d1-46b1-85cc-28b8bdc8e730. Parent FOP: SETATTR [Input/output error]

[2023-12-04 21:34:46.516542 +0000] E [MSGID: 109031] [dht-linkfile.c:212:dht_linkfile_setattr_cbk] 0-media-dht: failed to perform setattr [{path='/path/to/file'}, {gfid=00000000-0000-0000-0000-000000000000}, {errno=5}, {error=Input/output error}]
[2023-12-04 21:34:46.516591 +0000] E [MSGID: 122034] [ec-common.c:662:ec_log_insufficient_vol] 0-media-disperse-30: Insufficient available children for this request: Have : 0, Need : 4 : Child UP : 11111 Mask: 00000, Healing : 00000 : FOP : 'XATTROP' failed on
[2023-12-05 00:04:06.685605 +0000] D [MSGID: 0] [client-rpc-fops_v2.c:565:client4_0_rmdir_cbk] 0-stack-trace: stack-address: 0x560de4334018, media-client-108 returned -1 [Directory not empty]
[2023-12-05 00:04:06.685617 +0000] E [MSGID: 122034] [ec-common.c:662:ec_log_insufficient_vol] 0-media-disperse-4: Insufficient available children for this request: Have : 0, Need : 4 : Child UP : 11111 Mask: 00000, Healing : 00000 : FOP : 'XATTROP' failed on '/temp/trash2' with gfid a26829d1-2804-41a1-a5ca-212c2521122c. Parent FOP: RMDIR
[2023-12-05 00:04:06.685629 +0000] E [MSGID: 122037] [ec-common.c:2348:ec_update_size_version_done] 0-media-disperse-4: Failed to update version and size. FOP : 'XATTROP' failed on '/temp/trash2' with gfid a26829d1-2804-41a1-a5ca-212c2521122c. Parent FOP: RMDIR [Input/output error]
[2023-12-04 20:56:01.160988 +0000] W [MSGID: 122053] [ec-common.c:331:ec_check_status] 0-media-disperse-11: Operation failed on 1 of 5 subvolumes.(up=11111, mask=11110, remaining=00000, good=11110, bad=00001,(Least significant bit represents first client/brick of subvol), FOP : 'OPENDIR' failed on "/path/to/dir' with gfid c9edbaa5-18b7-4aa2-8923-c559ea11074e. Parent FOP: No Parent)
[2023-12-05 00:03:18.140051 +0000] D [MSGID: 0] [dht-common.c:1480:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: mds xattr trusted.glusterfs.dht.mds is not present on media-disperse-5(gfid = c4050882-410d-4fab-85a7-4c15716b1983)
[2023-12-05 00:03:18.140157 +0000] D [MSGID: 0] [dht-common.c:1384:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: lookup on media-disperse-6 returned with op_ret = 0, op_errno = 0
[2023-12-05 00:03:18.140177 +0000] D [MSGID: 0] [dht-common.c:1480:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: mds xattr trusted.glusterfs.dht.mds is not present on media-disperse-6(gfid = c4050882-410d-4fab-85a7-4c15716b1983)
[2023-12-05 00:03:18.140462 +0000] D [MSGID: 0] [dht-common.c:1384:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: lookup on media-disperse-2 returned with op_ret = 0, op_errno = 0
[2023-12-05 00:03:18.140482 +0000] D [MSGID: 0] [dht-common.c:1480:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: mds xattr trusted.glusterfs.dht.mds is not present on media-disperse-2(gfid = c4050882-410d-4fab-85a7-4c15716b1983)
[2023-12-05 00:03:18.140583 +0000] D [MSGID: 0] [dht-common.c:1384:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: lookup on media-disperse-16 returned with op_ret = 0, op_errno = 0
[2023-12-05 00:03:18.140630 +0000] D [MSGID: 0] [dht-common.c:1384:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: lookup on media-disperse-1 returned with op_ret = 0, op_errno = 0
[2023-12-05 00:03:18.140650 +0000] D [MSGID: 0] [dht-common.c:1480:dht_lookup_dir_cbk] 0-media-dht: /temp/trash2/test2.1.1.1: mds xattr trusted.glusterfs.dht.mds is not present on media-disperse-1(gfid = c4050882-410d-4fab-85a7-4c15716b1983)

With loglevel debug on client mount, some additional entries:

[2023-12-05 00:04:06.683340 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:560:client4_0_rmdir_cbk] 0-media-client-23: remote operation failed. [{errno=39}, {error=Directory not empty}]
[2023-12-05 00:04:06.683379 +0000] D [MSGID: 0] [client-rpc-fops_v2.c:565:client4_0_rmdir_cbk] 0-stack-trace: stack-address: 0x560de40c0018, media-client-23 returned -1 [Directory not empty]
[2023-12-05 00:04:06.683416 +0000] D [MSGID: 0] [ec-combine.c:913:ec_combine_check] 0-media-disperse-4: Mismatching return code in answers of 'RMDIR': -1 <-> 0
[2023-12-05 00:04:06.685605 +0000] D [MSGID: 0] [client-rpc-fops_v2.c:565:client4_0_rmdir_cbk] 0-stack-trace: stack-address: 0x560de4334018, media-client-108 returned -1 [Directory not empty]
[2023-12-05 00:04:06.685617 +0000] E [MSGID: 122034] [ec-common.c:662:ec_log_insufficient_vol] 0-media-disperse-4: Insufficient available children for this request: Have : 0, Need : 4 : Child UP : 11111 Mask: 00000, Healing : 00000 : FOP : 'XATTROP' failed on '/temp/trash2' with gfid a26829d1-2804-41a1-a5ca-212c2521122c. Parent FOP: RMDIR
[2023-12-05 00:04:06.685629 +0000] E [MSGID: 122037] [ec-common.c:2348:ec_update_size_version_done] 0-media-disperse-4: Failed to update version and size. FOP : 'XATTROP' failed on '/temp/trash2' with gfid a26829d1-2804-41a1-a5ca-212c2521122c. Parent FOP: RMDIR [Input/output error]
[2023-12-05 00:04:06.685955 +0000] D [MSGID: 0] [client-rpc-fops_v2.c:565:client4_0_rmdir_cbk] 0-stack-trace: stack-address: 0x560de434a818, media-client-134 returned -1 [Directory not empty]
[2023-12-05 00:04:06.685953 +0000] D [dict.c:1054:data_to_uint32] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_foreach_match+0x7b) [0x7ff704b4877b] -->/usr/lib/x86_64-linux-gnu/glusterfs/11.1/xlator/cluster/disperse.so(+0x36478) [0x7ff6fcdd7478] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(data_to_uint32+0x15e) [0x7ff704b47f1e] ) 0-dict: key null, unsigned integer type asked, has integer type [Invalid argument]
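The disperse (ec) warnings above report per-brick state as bit strings such as up=11111 and bad=00001, and the message itself notes that the least significant bit represents the first client/brick of the subvolume. A minimal sketch of reading those masks, assuming only that convention (the helper names are hypothetical, not part of Gluster):

```python
import re

# Picks out the bit-string fields of an ec_check_status message,
# e.g. "up=11111, mask=11110, remaining=00000, good=11110, bad=00001".
MASK_RE = re.compile(r"\b(up|mask|remaining|good|bad)=([01]+)")


def bricks_from_mask(mask):
    """Return 0-based brick positions whose bit is set.

    The rightmost character is treated as brick 0, following the
    "least significant bit represents first client/brick" note in the log.
    """
    return [i for i, bit in enumerate(reversed(mask)) if bit == "1"]


def decode_ec_status(message):
    """Map each mask name found in the message to its list of brick positions."""
    return {name: bricks_from_mask(bits) for name, bits in MASK_RE.findall(message)}
```

Applied to the OPENDIR warning above, the bad mask 00001 decodes to position 0, i.e. the first brick of that subvolume.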

Expected results: All clients and servers can see all bricks and port updates automatically

Mandatory info: - The output of the gluster volume info command: mediainfo (2).txt

- The output of the gluster volume status command: mediastatus.txt

- The output of the gluster volume heal command:

mediaheal.txt

- Provide logs present on following locations of client and server nodes - /var/log/glusterfs/: The link below has a tar of all logs. We have a lot of clients, so I attached a sampling of 3 clients. mnt-gluster or mnt-unionfs.log is the log for the volume in question. You'll notice client "client3" has a log that is 22 GB in size; that is how many errors it's generating, and this is in less than 1 week.

https://drive.google.com/file/d/13Sp02QAOsb5_YmxEL1v0BS_2Ixc2U2pY/view?usp=sharing

- Is there any crash? Provide the backtrace and coredump

Additional info: The only solution at present appears to be to stop the volume completely and then start it again, which is obviously not ideal. io_uring was turned on recently; I have not yet tried turning it off to see whether it could be a cause. The cluster was recently updated to 11.1 and the op-version to 110000.

- The operating system / glusterfs version: Ubuntu 18.04 on some nodes, Ubuntu 22.04 on others. All nodes run gluster 11.1.


edrock200 commented 5 months ago

Disregard. After digging in further today, it appears the Ubuntu firewall had a corrupt rule set. Resetting it on one node in the affected subvolume instantly cleared up just about every error listed above. Now running a fix-layout and a full volume heal.

edrock200 commented 5 months ago

I guess I spoke too soon. Although I've resolved the errors, it appears that gluster is still not updating ports if a brick is restarted. If I look at the brick files located in /var/lib/glusterd/vols/volname/bricks, all hosts/bricks show this for the listen port: listen-port=0. Is this accurate or a bug? In addition, all bricks show brick-fsid=0 except for one, which shows brick-fsid=66309.
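To compare these values across bricks, something along the following lines could be used. It is a sketch that assumes the files under /var/lib/glusterd/vols/volname/bricks are plain key=value text, as the listen-port=0 and brick-fsid=0 lines quoted above suggest; the function names are hypothetical, and whether listen-port=0 is expected in these files is not answered in this thread.

```python
from pathlib import Path


def parse_brick_info(text):
    """Parse key=value lines into a dict, ignoring blank or malformed lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip()
    return info


def scan_brick_dir(directory):
    """Return {file name: (listen-port, brick-fsid)} for every file in directory.

    Missing keys are reported as None rather than guessed.
    """
    result = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            info = parse_brick_info(path.read_text())
            result[path.name] = (info.get("listen-port"), info.get("brick-fsid"))
    return result
```

Calling scan_brick_dir on the bricks directory of each node gives a quick side-by-side view of which bricks report a zero listen port or fsid.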

Brick logs are flooded with this: 0-tcp.ssd-server: readv on ip:port failed (No data available). In addition, the port listed is not the actual listening port.

edrock200 commented 5 months ago

Just to add some additional info: looking at the client logs, the peer vol update broadcast to clients is missing one node, which just so happens to be the first node of the subvolume giving me grief.

However, if I run gluster peer status or gluster pool list from any of the SAN nodes, including the missing one, all nodes show as connected.

edrock200 commented 5 months ago

Closing again. I force-killed glusterfsd on the node the clients were pointed to and relaunched it. Port map updates seem to work properly again. The odd part is that yesterday I stopped the entire volume and started it, and that didn't resolve it. I thought stopping the volume stops the glusterfsd processes as well, but maybe not.
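One way to check that last point is to list any glusterfsd processes still running after a volume stop. A rough sketch, assuming a Linux ps that accepts -eo pid,args; the helper names are hypothetical:

```python
import subprocess


def glusterfsd_processes(ps_output):
    """Return (pid, command line) pairs for glusterfsd in `ps -eo pid,args` text.

    Only the executable name is compared, so glusterd itself and commands
    that merely mention glusterfsd as an argument are not matched.
    """
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # header row or malformed line
        pid, args = parts[0], parts[1].strip()
        executable = args.split()[0]
        if executable.rsplit("/", 1)[-1] == "glusterfsd":
            found.append((int(pid), args))
    return found


def running_glusterfsd():
    """Run ps on this node and return the glusterfsd processes it reports."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return glusterfsd_processes(out)
```

If running_glusterfsd() still returns entries on a node after the volume has been stopped, those brick processes survived the stop.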

edrock200 commented 4 months ago

Closing