Tendrl / commons

Common code usable by all Tendrl components
http://www.tendrl.org
GNU Lesser General Public License v2.1

Tendrl needs to support shrinking a Gluster cluster while it is managed #806

Open shtripat opened 6 years ago

shtripat commented 6 years ago

The current mechanism for shrinking a cluster in Tendrl is not seamless. The user has to un-manage the cluster, shrink it manually using the gluster CLI, and then import the cluster again.

Ideally, Tendrl should allow a cluster to be shrunk while it is being managed.

shtripat commented 6 years ago

@r0h4n @julienlim @mcarrano @nthomas-redhat @brainfunked please add your comments

julienlim commented 6 years ago

@shtripat @r0h4n @julienlim @mcarrano @nthomas-redhat @brainfunked

When a cluster that is under Tendrl management is shrunk, a user should not have to do anything from a Tendrl UI perspective. IMHO, the only thing a user should need to do is stop, disable, and remove the Tendrl agents, and then remove the node from the Gluster cluster.

Tendrl should automatically detect the change in the cluster state, update etcd accordingly, remove any related dashboard entries in Grafana, and somehow deal with the node's telemetry data: either remove it or put it in some kind of "stasis"-like state so that it is not orphaned and can be brought back and associated with a new node (if that makes sense) in the future. If there is no clean way to remove the dashboard entries and telemetry data in Grafana, then we should provide well-documented procedures on how to do so, along with steps to back up and restore in case the user runs into problems.

Another potential option is to add a Remove Host option to the Tendrl UI, but that gets quite complicated, since we would then have to deal with data migration as well as removing Tendrl and Gluster from the node. This is probably more than we want to take on at the moment.

shtripat commented 6 years ago

@julienlim yes, I also agree with the first option: auto-detect a peer removal done from the CLI and seamlessly take care of removing all of its etcd and graphite data, etc.
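
To make the auto-detection idea above a little more concrete, here is a minimal sketch of one way a removed peer could be spotted: compare the set of peers recorded earlier with the set reported now, and treat anything missing as removed. This is not Tendrl's actual implementation. The function names `parse_pool_list` and `find_removed_peers` are hypothetical, the UUIDs and hostnames are made up, and the parser assumes the tabular `gluster pool list` output (a header row followed by UUID, Hostname, and State columns). How the removed peers would then be cleaned out of etcd, Grafana, and graphite is deliberately left out.

```python
# Sketch only: hypothetical helper names, made-up sample data.
# Assumes `gluster pool list`-style text: a header row, then
# whitespace-separated UUID / Hostname / State columns.

def parse_pool_list(output):
    """Return {uuid: hostname} parsed from `gluster pool list`-style text."""
    peers = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            peers[fields[0]] = fields[1]
    return peers


def find_removed_peers(known_peers, current_peers):
    """Return {uuid: hostname} for peers that were known but are now gone."""
    return {
        uuid: host
        for uuid, host in known_peers.items()
        if uuid not in current_peers
    }


# Peer list as recorded before the shrink (hypothetical data).
SAMPLE_BEFORE = """UUID        Hostname            State
aaaa-1111   node1.example.com   Connected
bbbb-2222   node2.example.com   Connected
cccc-3333   localhost           Connected
"""

# Peer list after node2 was detached from the CLI (hypothetical data).
SAMPLE_AFTER = """UUID        Hostname            State
aaaa-1111   node1.example.com   Connected
cccc-3333   localhost           Connected
"""

removed = find_removed_peers(
    parse_pool_list(SAMPLE_BEFORE),
    parse_pool_list(SAMPLE_AFTER),
)
print(removed)
```

Keying the comparison on the peer UUID rather than the hostname is a design choice in this sketch, since the local node is typically reported as `localhost` and hostnames can differ depending on which node is queried.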