Using heketi-cli you must first "remove" the node or devices. For example, run `heketi-cli node disable NODEID` followed by `heketi-cli node remove NODEID`. This will remove all the bricks from the given node. Once the devices are emptied you can then use `heketi-cli device delete` and `heketi-cli node delete` to fully remove the items. Note that in order to remove the node there must be sufficient space on the other nodes' devices. If there is not, you need to add a new good node before running node remove.
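A minimal sketch of that sequence, where NODEID and DEVICEID are placeholders for the real IDs you get from `heketi-cli node list` and `heketi-cli node info`:

# Stop heketi from placing new bricks on the node
$ heketi-cli node disable NODEID
# Evacuate all bricks from the node's devices (needs free space on the other nodes)
$ heketi-cli node remove NODEID
# Once the devices are empty, delete them and finally the node itself
$ heketi-cli device delete DEVICEID
$ heketi-cli node delete NODEID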
As for the rest of your steps, I don't fully understand the need to generate a new topology file and re-run gluster-kubernetes. I would simply add nodes and devices with heketi-cli (use `heketi-cli node add` and `heketi-cli device add` respectively).
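For example, something along these lines should register a replacement node and its block device (the hostname, IP, cluster ID and device path below are placeholders; check `heketi-cli node add --help` for the exact flags in your heketi version):

# Register the new node with heketi
$ heketi-cli node add --zone=2 --cluster=CLUSTERID \
    --management-host-name=gluster-4 --storage-host-name=172.50.0.200
# Register the node's raw block device so heketi can carve bricks out of it
$ heketi-cli device add --name=/dev/cinder/gluster --node=NEWNODEID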
Thank you for your answer. I had already tried to remove the node but it did not work, the bricks cannot be removed. There is plenty of space on the other nodes' devices though. There is something I must be missing.
About the rest of the steps, the cluster is composed of three nodes. When a node is lost physically, I would like to replace it with another node (same hostname, different IP).
> Thank you for your answer. I had already tried to remove the node but it did not work, the bricks cannot be removed. There is plenty of space on the other nodes' devices though. There is something I must be missing.
In that case you might just be hitting a bug or some other subtle issue rather than an architectural problem. Please feel free to post the exact error message you got and any relevant logging from heketi and we will see what we can do about helping you debug this issue.
> About the rest of the steps, the cluster is composed of three nodes. When a node is lost physically, I would like to replace it with another node (same hostname, different IP).
OK. In that case I will reiterate that IMO the right approach is to use the heketi-cli to add the new replacement node and devices. Don't bother updating the topology... its only real use is loading the initial cluster. To get the glusterfs pod running on the node simply add the appropriate labels to the node and the daemonset should do the rest.
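For example, assuming your daemonset uses the storagenode=glusterfs selector shown later in this thread, labeling the new node (the node name here is a placeholder) should be enough for a glusterfs pod to be scheduled on it:

# Label the replacement node so the glusterfs daemonset schedules a pod there
$ kubectl label node gluster-4 storagenode=glusterfs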
OK. Here is some config info and the error message.
Before a node crashes:
$ openstack server list | grep gluster
gluster-1 | ACTIVE | EXTENDED=172.50.0.194; IP=192.168.0.221
gluster-2 | ACTIVE | EXTENDED=172.50.0.187; IP=192.168.0.225
gluster-3 | ACTIVE | EXTENDED=172.50.0.195; IP=192.168.0.222
$ openstack volume list | grep gluster
gluster-1-volume-docker | in-use | 20 | Attached to gluster-1 on /dev/sdc
gluster-1-volume-glusterfs | in-use | 30 | Attached to gluster-1 on /dev/sdb
gluster-2-volume-glusterfs | in-use | 30 | Attached to gluster-2 on /dev/sdb
gluster-2-volume-docker | in-use | 20 | Attached to gluster-2 on /dev/sdc
gluster-3-volume-glusterfs | in-use | 30 | Attached to gluster-3 on /dev/sdb
gluster-3-volume-docker | in-use | 20 | Attached to gluster-3 on /dev/sdc
Storage Class:
apiVersion: v1
items:
- apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"storage-test-gluster"},"parameters":{"resturl":"http://172.20.32.30:8080","volumetype":"replicate:3"},"provisioner":"kubernetes.io/glusterfs"}
    creationTimestamp: 2018-10-19T12:42:27Z
    name: storage-test-gluster
    resourceVersion: "1563397"
    selfLink: /apis/storage.k8s.io/v1/storageclasses/storage-test-gluster
    uid: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  parameters:
    resturl: http://172.20.32.30:8080
    volumetype: replicate:3
  provisioner: kubernetes.io/glusterfs
  reclaimPolicy: Delete
  volumeBindingMode: Immediate
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Info about K8s cluster state:
$ kubectl -n kube-system get no,ds,po
node/gluster-1 Ready,SchedulingDisabled
node/gluster-2 Ready,SchedulingDisabled
node/gluster-3 Ready,SchedulingDisabled
daemonset.extensions/glusterfs 3 3 3 3 3 storagenode=glusterfs 19m
pod/glusterfs-7dtw6 1/1 Running
pod/glusterfs-hdh6z 1/1 Running
pod/glusterfs-s2mj4 1/1 Running
pod/heketi-5bd5d976b6-9tcqb 1/1 Running
$ heketi-cli cluster list
Clusters:
Id:fd2c0e1c86a459f5498e489bd226fe5d [file][block]
$ heketi-cli cluster info fd2c0e1c86a459f5498e489bd226fe5d
Nodes:
0c0be6bac1705c0685dd016cb843f261
5a1b616651842fccb7d8a30e10c51423
6b970fb24c73d9eea3bbef4c0dfa9a21
Volumes:
18c5483a18baa26b248e68ba07c248aa
e46ba70f01f01678cc7a576ccf14281a
Block: true
$ heketi-cli node info 0c0be6bac1705c0685dd016cb843f261
Node Id: 0c0be6bac1705c0685dd016cb843f261
State: online
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Zone: 2
Management Hostname: gluster-1
Storage Hostname: 172.50.0.194
Devices:
Id:d68c6b1a6a128c35251d214501a8e84d Name:/dev/cinder/gluster State:online Size (GiB):29 Used (GiB):3 Free (GiB):26 Bricks:2
$ heketi-cli volume list
Id:18c5483a18baa26b248e68ba07c248aa Cluster:fd2c0e1c86a459f5498e489bd226fe5d Name:heketidbstorage
Id:e46ba70f01f01678cc7a576ccf14281a Cluster:fd2c0e1c86a459f5498e489bd226fe5d Name:vol_e46ba70f01f01678cc7a576ccf14281a
$ heketi-cli volume info e46ba70f01f01678cc7a576ccf14281a
Name: vol_e46ba70f01f01678cc7a576ccf14281a
Size: 1
Volume Id: e46ba70f01f01678cc7a576ccf14281a
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Mount: 172.50.0.194:vol_e46ba70f01f01678cc7a576ccf14281a
Mount Options: backup-volfile-servers=172.50.0.187,172.50.0.195
Block: false
Free Size: 0
Reserved Size: 0
Block Hosting Restriction: (none)
Block Volumes: []
Durability Type: replicate
Distributed+Replica: 3
Now a disaster occurs: server gluster-1 and its attached volumes are gone for good.
$ openstack server list | grep gluster
gluster-2 | ACTIVE | EXTENDED=172.50.0.187; IP=192.168.0.225
gluster-3 | ACTIVE | EXTENDED=172.50.0.195; IP=192.168.0.222
$ openstack volume list | grep gluster
gluster-2-volume-glusterfs | in-use | 30 | Attached to gluster-2 on /dev/sdb
gluster-2-volume-docker | in-use | 20 | Attached to gluster-2 on /dev/sdc
gluster-3-volume-glusterfs | in-use | 30 | Attached to gluster-3 on /dev/sdb
gluster-3-volume-docker | in-use | 20 | Attached to gluster-3 on /dev/sdc
$ kubectl -n kube-system get no,ds,po | grep gluster
node/gluster-1 NotReady,SchedulingDisabled
node/gluster-2 Ready,SchedulingDisabled
node/gluster-3 Ready,SchedulingDisabled
daemonset.extensions/glusterfs 3 3 2 3 2 storagenode=glusterfs
pod/glusterfs-7dtw6 1/1 Running
pod/glusterfs-hdh6z 1/1 Running
pod/glusterfs-s2mj4 1/1 NodeLost
Heketi still thinks the dead node is online:
$ heketi-cli node info 0c0be6bac1705c0685dd016cb843f261
Node Id: 0c0be6bac1705c0685dd016cb843f261
State: online
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Zone: 2
Management Hostname: gluster-1
Storage Hostname: 172.50.0.194
Devices:
Id:d68c6b1a6a128c35251d214501a8e84d Name:/dev/cinder/gluster State:online Size (GiB):29 Used (GiB):3 Free (GiB):26 Bricks:2
So now I want to repair the cluster by first removing the dead node with heketi-cli:
$ heketi-cli node disable 0c0be6bac1705c0685dd016cb843f261
Node 0c0be6bac1705c0685dd016cb843f261 is now offline
$ heketi-cli node remove 0c0be6bac1705c0685dd016cb843f261
**Error: Failed to remove device, error: No Replacement was found for resource requested to be removed
command terminated with exit code 255**
$ heketi-cli topology info
File: true
Block: true
Volumes:
Name: heketidbstorage
Size: 2
Id: 18c5483a18baa26b248e68ba07c248aa
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Mount: 172.50.0.194:heketidbstorage
Mount Options: backup-volfile-servers=172.50.0.187,172.50.0.195
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Bricks:
Id: 62245415b93b84072d78e28b77bc0a05
Path: /var/lib/heketi/mounts/vg_d68c6b1a6a128c35251d214501a8e84d/brick_62245415b93b84072d78e28b77bc0a05/brick
Size (GiB): 2
Node: 0c0be6bac1705c0685dd016cb843f261
Device: d68c6b1a6a128c35251d214501a8e84d
Id: 9bf29464f080b2b06c0e31cf86037d53
Path: /var/lib/heketi/mounts/vg_c3ea2fc610caed5d8372f4930ede201f/brick_9bf29464f080b2b06c0e31cf86037d53/brick
Size (GiB): 2
Node: 6b970fb24c73d9eea3bbef4c0dfa9a21
Device: c3ea2fc610caed5d8372f4930ede201f
Id: ffc14c4ec54e696d52cd194f30438870
Path: /var/lib/heketi/mounts/vg_f8876f11a1a6e8043ea082339b6dd2df/brick_ffc14c4ec54e696d52cd194f30438870/brick
Size (GiB): 2
Node: 5a1b616651842fccb7d8a30e10c51423
Device: f8876f11a1a6e8043ea082339b6dd2df
Name: vol_e46ba70f01f01678cc7a576ccf14281a
Size: 1
Id: e46ba70f01f01678cc7a576ccf14281a
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Mount: 172.50.0.194:vol_e46ba70f01f01678cc7a576ccf14281a
Mount Options: backup-volfile-servers=172.50.0.187,172.50.0.195
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Bricks:
Id: 6a311aef7b3d43256998f64df8c3fe9a
Path: /var/lib/heketi/mounts/vg_f8876f11a1a6e8043ea082339b6dd2df/brick_6a311aef7b3d43256998f64df8c3fe9a/brick
Size (GiB): 1
Node: 5a1b616651842fccb7d8a30e10c51423
Device: f8876f11a1a6e8043ea082339b6dd2df
Id: d572b10a7533bce138675fe664b50ebc
Path: /var/lib/heketi/mounts/vg_c3ea2fc610caed5d8372f4930ede201f/brick_d572b10a7533bce138675fe664b50ebc/brick
Size (GiB): 1
Node: 6b970fb24c73d9eea3bbef4c0dfa9a21
Device: c3ea2fc610caed5d8372f4930ede201f
Id: f3ceded3920a2666fb5c82e8e629dd27
Path: /var/lib/heketi/mounts/vg_d68c6b1a6a128c35251d214501a8e84d/brick_f3ceded3920a2666fb5c82e8e629dd27/brick
Size (GiB): 1
Node: 0c0be6bac1705c0685dd016cb843f261
Device: d68c6b1a6a128c35251d214501a8e84d
Nodes:
Node Id: 0c0be6bac1705c0685dd016cb843f261
State: offline
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Zone: 2
Management Hostnames: gluster-1
Storage Hostnames: 172.50.0.194
Devices:
Id:d68c6b1a6a128c35251d214501a8e84d Name:/dev/cinder/gluster State:online Size (GiB):29 Used (GiB):3 Free (GiB):26
Bricks:
Id:62245415b93b84072d78e28b77bc0a05 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_d68c6b1a6a128c35251d214501a8e84d/brick_62245415b93b84072d78e28b77bc0a05/brick
Id:f3ceded3920a2666fb5c82e8e629dd27 Size (GiB):1 Path: /var/lib/heketi/mounts/vg_d68c6b1a6a128c35251d214501a8e84d/brick_f3ceded3920a2666fb5c82e8e629dd27/brick
Node Id: 5a1b616651842fccb7d8a30e10c51423
State: online
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Zone: 1
Management Hostnames: gluster-2
Storage Hostnames: 172.50.0.187
Devices:
Id:f8876f11a1a6e8043ea082339b6dd2df Name:/dev/cinder/gluster State:online Size (GiB):29 Used (GiB):3 Free (GiB):26
Bricks:
Id:6a311aef7b3d43256998f64df8c3fe9a Size (GiB):1 Path: /var/lib/heketi/mounts/vg_f8876f11a1a6e8043ea082339b6dd2df/brick_6a311aef7b3d43256998f64df8c3fe9a/brick
Id:ffc14c4ec54e696d52cd194f30438870 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_f8876f11a1a6e8043ea082339b6dd2df/brick_ffc14c4ec54e696d52cd194f30438870/brick
Node Id: 6b970fb24c73d9eea3bbef4c0dfa9a21
State: online
Cluster Id: fd2c0e1c86a459f5498e489bd226fe5d
Zone: 2
Management Hostnames: gluster-3
Storage Hostnames: 172.50.0.195
Devices:
Id:c3ea2fc610caed5d8372f4930ede201f Name:/dev/cinder/gluster State:online Size (GiB):29 Used (GiB):3 Free (GiB):26
Bricks:
Id:9bf29464f080b2b06c0e31cf86037d53 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_c3ea2fc610caed5d8372f4930ede201f/brick_9bf29464f080b2b06c0e31cf86037d53/brick
Id:d572b10a7533bce138675fe664b50ebc Size (GiB):1 Path: /var/lib/heketi/mounts/vg_c3ea2fc610caed5d8372f4930ede201f/brick_d572b10a7533bce138675fe664b50ebc/brick
I tried the same scenario with 4 nodes and a replica 3 and it worked:
heketi-cli node remove b036c5b98773e0309a8ce94925b46228
Node b036c5b98773e0309a8ce94925b46228 is now removed
But still, this is not a viable solution: an entire zone can also disappear (with 2 nodes in it)... is there a workaround?
I still struggle to add a new node.
With replica set to 3 and 4 nodes available, deletion of the dead node now works:
$ heketi-cli node disable b036c5b98773e0309a8ce94925b46228
Node b036c5b98773e0309a8ce94925b46228 is now disabled
$ heketi-cli node remove b036c5b98773e0309a8ce94925b46228
Node b036c5b98773e0309a8ce94925b46228 is now removed
$ heketi-cli device delete 1706deeec069c88fa959fbded7ade4ca
Device 1706deeec069c88fa959fbded7ade4ca deleted
$ heketi-cli node delete b036c5b98773e0309a8ce94925b46228
Node b036c5b98773e0309a8ce94925b46228 deleted
$ heketi-cli node list
Id:247159a12305c992eb0b042a23344dce Cluster:96f3890df2950b405a8dcf9b7c064ed9
Id:d4d46f2d1eb20bc64f5361bd1acd8b4e Cluster:96f3890df2950b405a8dcf9b7c064ed9
Id:e67edfce50189dfac1968ad64608b2c1 Cluster:96f3890df2950b405a8dcf9b7c064ed9
Since replica was set to 3, the Gluster cluster still works properly. But I cannot lose another node! So I need to add a worker node to my GlusterFS cluster asap.
The strategy is to add a new host (named gluster-2 like the previous one) and configure it so that it can join the GlusterFS cluster.
But as soon as I add the new host with the same hostname as the dead one (gluster-2) - before any further configuration - some of the gluster pods enter CrashLoopBackOff.
pod/glusterfs-4jzcw 0/1 CrashLoopBackOff 0 2d
pod/glusterfs-hlg2c 1/1 Running 1 2d
pod/glusterfs-kfb8k 1/1 NodeLost 0 2d
pod/glusterfs-z2r5s 0/1 CrashLoopBackOff 0 2d
> Error: Failed to remove device, error: No Replacement was found for resource requested to be removed
> With replica set to 3 and 4 nodes available, deletion of the dead node now works:
Right. So when I wrote, "Note that in order to remove the node there must be sufficient space on the other nodes' devices. If there is not, you need to add a new good node before running node remove." I should have expanded on that more. Replica 3 volumes require that each brick of the replica set be on 3 different nodes. If you only have a 3 node cluster you will not be able to replace any node in that cluster until you add in a replacement node first. A four node cluster effectively gives you a hot spare in this scenario.
> The strategy is to add a new host (named gluster-2 like the previous one) and configure it so that it can join the GlusterFS cluster.
I don't think that will work. You need to name it something new like "gluster-5" with its own unique IPs. gluster (and heketi) use the host's networking, and thus you can't reuse any existing identifiers until the old node is completely purged from the cluster first.
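Putting the pieces together, a rough recovery sequence for a three node cluster could look like the sketch below (node name, IP, cluster/node/device IDs and device path are all placeholders, not values from your cluster): add the freshly named node first so the bricks have somewhere to go, then evacuate and purge the dead node.

# 1. Label the new, uniquely named host so the glusterfs daemonset runs a pod on it
$ kubectl label node gluster-5 storagenode=glusterfs
# 2. Register the new node and its device with heketi
$ heketi-cli node add --zone=2 --cluster=CLUSTERID \
    --management-host-name=gluster-5 --storage-host-name=172.50.0.200
$ heketi-cli device add --name=/dev/cinder/gluster --node=NEWNODEID
# 3. With a replacement available, the dead node can now be evacuated and deleted
$ heketi-cli node disable DEADNODEID
$ heketi-cli node remove DEADNODEID
$ heketi-cli device delete DEADDEVICEID
$ heketi-cli node delete DEADNODEID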
> But still, this is not a viable solution: an entire zone can also disappear (with 2 nodes in it)... is there a workaround?
If you have more than one node in a single zone, then I can't think of one that uses the tools within heketi. The problem is that even if you add more than one node per zone, heketi's algorithm for placing bricks does not respect zones strongly enough to guarantee that it won't place two of a replica 3 set's bricks in the same zone (we're aware of this issue but it's difficult to fix directly). You'd have to have exactly three zones with one gluster node each. I'm ignoring any potential gluster solutions that are not supported by heketi, because I am primarily familiar with the heketi code itself. That doesn't mean you shouldn't read up on possibilities like backup/georeplication/etc.
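For illustration, a zone-per-node layout would be built by giving each node its own zone when registering it with heketi (all names, IPs and the cluster ID below are placeholders):

# Exactly three zones with one gluster node each, so a replica 3 set always spans all zones
$ heketi-cli node add --zone=1 --cluster=CLUSTERID --management-host-name=gluster-a --storage-host-name=10.0.1.10
$ heketi-cli node add --zone=2 --cluster=CLUSTERID --management-host-name=gluster-b --storage-host-name=10.0.2.10
$ heketi-cli node add --zone=3 --cluster=CLUSTERID --management-host-name=gluster-c --storage-host-name=10.0.3.10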
Thank you very much for your help. The issue can be closed.
Hi,
After losing one (or more) physical gluster nodes, I tried to repair the cluster with the following steps:
I am stuck at step one, since it is impossible to delete a node that contains devices with heketi-cli, and impossible to delete devices that contain bricks.
What should I do?