nixelsolutions / rancher-glusterfs-server

GlusterFS cluster for use with Rancher Server

Auto discovery without extra service like SSH #4

Open disaster37 opened 8 years ago

disaster37 commented 8 years ago

Hi,

I have written a new GlusterFS image dedicated to working on Rancher. It is based on your work.

You can try it at https://hub.docker.com/r/webcenter/rancher-glusterfs-server; maybe it's better?

I use the DNS discovery service and the API discovery service included in Rancher to create the cluster and extend it automatically.

I will document everything ASAP.

nixelsolutions commented 8 years ago

Good job @disaster37!! Have you made any changes to the clustering logic in the init.py script? I think you've copied what I did, but in Python; the clustering mechanism (peering, joining and so on) seems the same, right?

I ask this because there is a problem with the dig | sort implementation: when a new container is added to the cluster, if the new container has an IP that sorts into the first position, it will think it is the primary node, start a new cluster and never join the existing one. Because of this, scaling sometimes does not work.

I think a good solution would be to use the metadata service and make this call:

http://rancher-metadata/latest/self/service, so we can make the container with ID=[STACK]_[SERVICE_NAME]_1 the first one to initialize the cluster.
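Something along these lines (just a sketch, not your actual init.py; the create_index field is my assumption about what the metadata service exposes for a container):

```python
# Sketch: elect the bootstrap node via the Rancher metadata service instead
# of `dig | sort`. The create_index field is an assumption about the API.
import urllib2

METADATA = 'http://rancher-metadata/latest'

def is_bootstrap_node():
    # The first container created for the service ([STACK]_[SERVICE_NAME]_1)
    # is the one that initializes the cluster.
    create_index = urllib2.urlopen(METADATA + '/self/container/create_index').read()
    return create_index.strip() == '1'
```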

Also, maybe it's better to use this on line 79 of your init.py script (change the IP to the rancher-metadata alias): my_ip = urllib2.urlopen('http://rancher-metadata/latest/self/container/primary_ip').read()

What do you think?

Thanks!

disaster37 commented 8 years ago

It's not the same logic as yours. I assume the dig command returns the IPs in random order.

Each container applies the same logic:

  1. the container gets all the IPs that make up the current service via 'dig SERVICE_NAME +short'
  2. the container gets its Rancher IP via http://169.254.169.250/latest/self/container/primary_ip (you are right, I will change it to http://rancher-metadata/latest/self/container/primary_ip)
  3. the container checks whether it is already in the cluster

If it is already in the cluster:
  4.1. it compares the number of peers in the cluster with the number of IPs in the service
  4.2. if the numbers are the same, it does nothing
  4.3. if they are not the same, it checks that the number of potential peers is compatible with the replica policy (number of peers modulo number of replicas)
  4.4. if not compatible, it does nothing
  4.5. if compatible, it probes the new peers into gluster and extends all volumes with them

If it is not yet in the cluster:
  5.1. it checks whether a gluster cluster already exists on a remote host (i.e. gluster --remote-host=IP peer status)
  5.2. if a cluster already exists, it waits for an existing member to probe it
  5.3. if no cluster exists yet, it checks whether it is the master (the master has the lowest IP)
  5.4. if it is a slave, it waits for the master to probe it
  5.5. if it is the master, it probes all the IPs into the new gluster and creates each volume with all peers
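In pseudo-Python, the loop looks roughly like this (a minimal sketch, not the real init.py: the helper names and the peer bookkeeping are simplified assumptions, only the dig and gluster CLI calls are the real ones):

```python
# Minimal sketch of the reconciliation logic described above.
import subprocess

def service_ips(service_name):
    # 1. all IPs behind the Rancher service DNS name
    out = subprocess.check_output(['dig', service_name, '+short'])
    return [ip for ip in out.split() if ip]

def remote_cluster_exists(ip):
    # 5.1: `gluster --remote-host=IP peer status` answers if a cluster exists there
    return subprocess.call(['gluster', '--remote-host=%s' % ip, 'peer', 'status']) == 0

def reconcile(my_ip, ips, replica_count, cluster_peers):
    # cluster_peers: IPs already in the trusted pool, or None if not joined yet
    if cluster_peers is not None:
        # 4.x: extend the cluster only if the new count fits the replica policy
        if len(ips) != len(cluster_peers) and len(ips) % replica_count == 0:
            for ip in set(ips) - set(cluster_peers):
                subprocess.check_call(['gluster', 'peer', 'probe', ip])
            # ...then extend every volume with the new bricks (omitted)
    else:
        others = [ip for ip in ips if ip != my_ip]
        # 5.1/5.2: if a cluster already exists, wait for a member to probe us
        if any(remote_cluster_exists(ip) for ip in others):
            return
        # 5.3-5.5: the "master" (lowest IP) bootstraps the cluster
        if my_ip == min(ips):
            for ip in others:
                subprocess.check_call(['gluster', 'peer', 'probe', ip])
            # ...then create each volume with all peers (omitted)
```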

This new image supports the creation of multiple volumes, and you can use quota and stripe.
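For example, something like this (a sketch: the volume name, brick paths and quota size are placeholder assumptions; the gluster volume create/quota subcommands are the standard ones):

```python
# Sketch: create a striped/replicated volume and set a quota on it.
import subprocess

bricks = ['10.0.0.1:/data/vol1', '10.0.0.2:/data/vol1',
          '10.0.0.3:/data/vol1', '10.0.0.4:/data/vol1']
# 'force' allows bricks on the root partition, which is common in containers
subprocess.check_call(['gluster', 'volume', 'create', 'vol1',
                       'stripe', '2', 'replica', '2'] + bricks + ['force'])
subprocess.check_call(['gluster', 'volume', 'start', 'vol1'])
subprocess.check_call(['gluster', 'volume', 'quota', 'vol1', 'enable'])
subprocess.check_call(['gluster', 'volume', 'quota', 'vol1',
                       'limit-usage', '/', '10GB'])
```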

disaster37 commented 8 years ago

Each container runs the same logic every 5 minutes.

nixelsolutions commented 8 years ago

Thanks, I will test it. Hope you don't mind if I copy you ;)

BTW: you have implemented the improvements I had been meaning to do, thank you for that!