gluster / glusterdocs

This repo contains the source of official Gluster documentation rendered at https://docs.gluster.org
MIT License

Thin-Arbiter-Volumes.md has inaccurate and incomplete information. #621

Open alphabet5 opened 3 years ago

alphabet5 commented 3 years ago
amarts commented 3 years ago

@Sheetalpamecha @aspandey @itisravi @karthik-us can you please have a look when you get a chance... thanks.

alphabet5 commented 3 years ago

After some more digging, there is this: https://review.gluster.org/#/c/glusterfs/+/20056/

Which has a script to assist with configuring a service for the thin-arbiter process, as well as a template .vol file at glusterfs/extras/thin-arbiter/thin-arbiter/thin-arbiter.vol

It appears that you can't run a thin-arbiter on a node that is also running glusterd by default. (I was trying to test out 1 node from cluster2 being a thin-arbiter for cluster1)

A couple of things that I can't seem to find:

It looks like the arbiter for 8.3 is not the same op-version as glusterfs?

# gluster peer probe arbiter
peer probe: failed: Peer arbiter does not support required op-version

itisravi commented 3 years ago

@alphabet5

alphabet5 commented 3 years ago

Thanks @itisravi. Is there a way to verify the status of the arbiter? If the arbiter is online, but unreachable from the cluster, how would I know?

I don't really want to take a brick offline to see if the arbiter still allows writes to the other brick. Is there another way to verify the arbiter status?

amarts commented 3 years ago

telnet <thin-arbiter-node> 24007

Ctrl-]
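The same reachability check can be scripted instead of typed into telnet. The sketch below assumes `bash` and coreutils `timeout` are installed; the function name `port_reachable` is made up for illustration. A success here only shows that something accepts TCP connections on the port, not that the thin-arbiter is serving the volume.

```shell
# port_reachable HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be
# opened within the timeout, non-zero otherwise.
port_reachable() {
    host=$1
    port=$2
    wait_s=${3:-3}
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout(1) bounds
    # the attempt. Inside the bash -c script, $0 is the host and $1 the port.
    timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `port_reachable arbiter 24007 && echo reachable` mirrors the telnet command above.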

alphabet5 commented 3 years ago

@amarts how does this verify that the arbiter is working?

# telnet arbiter 24007
Trying 192.168.1.254...
Connected to arbiter.
Escape character is '^]'.
^]

If I look at logs for the arbiter, it seems as though it might not be working, and I don't see how telnetting to the arbiter verifies its operational status.

[2020-12-15 15:22:31.615878] E [MSGID: 115001] [server-handshake.c:584:server_setvolume] 0-ta-server: Cannot authenticate client from CTX_ID:fe5e65be-0254-4e46-8a5c-fe7b8e453459-GRAPH_ID:0-PID:1495-HOST:server2-PC_NAME:gvolume0-ta-2-RECON_NO:-75028 8.3 because brick is not attached in graph [No such file or directory]

Even if you verify the service status:

root@arbiter:~# systemctl status thin-arbiter
● thin-arbiter.service - GlusterFS, Thin-arbiter process to maintain quorum f>
     Loaded: loaded (/etc/systemd/system/thin-arbiter.service; enabled; vendo>
     Active: active (running) since Mon 2020-12-14 18:26:05 UTC; 20h ago
   Main PID: 9872 (glusterfsd)
     Memory: 1.0G
     CGroup: /system.slice/thin-arbiter.service
             └─9872 /usr/sbin/glusterfsd -N --volfile-id ta -f /mnt/brick1/gv>

Dec 14 18:26:05 arbiter systemd[1]: Started GlusterFS, Thin-arbiter process t>
lines 1-9/9 (END)

It doesn't validate that the thin-arbiter is operational.

alphabet5 commented 3 years ago

To clarify; I'm thinking all of this information would be useful to have in Thin-Arbiter-Volumes.md.

If you want me to take a stab at a pull request, let me know.

I also haven't found an example for using setup-thin-arbiter.sh yet. I'm guessing something like cd /mnt/dir/thin-arbiter-dir && sudo /?/?/?/setup-thin-arbiter.sh

itisravi commented 3 years ago

If the arbiter is online, but unreachable from the cluster, how would I know

It needs to be reachable only from the (fuse) clients and not the cluster. So if a client is not connected to any of the bricks, including the TA brick, the fuse mount logs will have messages like disconnected from distrep-client-0, etc. Conversely, upon an established connection, you will see Connected to distrep-client-0, etc. in the logs.
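A rough way to automate that log check is sketched below. The function name `ta_link_state` is hypothetical, the message wording is taken from the description above rather than from the GlusterFS source, and the log path is an assumption (fuse mount logs usually live under /var/log/glusterfs/, named after the mount point). The client name is treated as a regular expression, so unusual characters may need escaping.

```shell
# ta_link_state LOGFILE CLIENT
# Hypothetical helper: prints "connected", "disconnected", or "unknown",
# based on the most recent matching line for CLIENT in a fuse mount log.
ta_link_state() {
    logfile=$1
    client=$2
    # Keep only the newest connect/disconnect message for this client.
    last=$(grep -iE "(connected to|disconnected from) ${client}" "$logfile" 2>/dev/null | tail -n 1)
    case "$last" in
        *[Dd]isconnected\ from*) echo disconnected ;;
        *[Cc]onnected\ to*)      echo connected ;;
        *)                       echo unknown ;;
    esac
}
```

For the volume in this thread, the thin-arbiter client appears in the logs as gvolume0-ta-2, so a call would look like `ta_link_state /var/log/glusterfs/<mount-log>.log gvolume0-ta-2`, with the log file name filled in for your mount.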

If you want me to take a stab at a pull request, let me know.

Sure go ahead.

I also haven't found an example for using setup-thin-arbiter.sh yet

Slide 23 of https://archive.fosdem.org/2020/schedule/event/sds_gluster_thin_arbiter/attachments/slides/4110/export/events/attachments/sds_gluster_thin_arbiter/slides/4110/gluster_thin_arbiter_fosdem_2020.pdf has an embedded demo, check it out!

alphabet5 commented 3 years ago

Is it possible to remove a thin-arbiter brick?

gluster volume remove-brick gvolume0 replica 2 thin-arbiter 1 arbiter:/mnt/brick1/gvolume0/thin-arbiter.vol force
wrong brick type: thin-arbiter, use <HOSTNAME>:<export-dir-abs-path>

Usage:
volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force>

alphabet5 commented 3 years ago

Per that slide deck, support for add/replace brick is on the todo list yet.

TODO

itisravi commented 3 years ago

on the todo list yet.

Yes @Sheetalpamecha is working on this via https://github.com/gluster/glusterfs/issues/1528

polachz-nxp commented 1 year ago

I fought with Thin Arbiter last night too. The MD file doesn't give ANY information about VOLUME_FILE: how to get it or how to create it. The command to create a volume with a thin arbiter is still inaccurate.

And even though I did my best to configure Thin-Arbiter correctly, I have no idea how to verify whether it works. And because there is no way to reconfigure the volume (https://github.com/gluster/glusterfs/issues/1528 is dead for now), any future fix means getting all the data out of the volume and re-creating it.

I think that this part of the documentation needs significant improvements...

polachz commented 1 year ago

I finally found a way to get GlusterFS Thin Arbiter up and running. Here is my how-to:

https://polach.me/posts/howto-setup-glusterfs-thin-arbiter-at-homelab/

Maybe it can save others some time...