timbrd opened this issue 7 years ago
There is ongoing discussion in the heketi project about how it should behave when one or more nodes are in a failure state. So yes, you've hit a hot-button issue. :) I'll let @raghavendra-talur or @MohamedAshiqrh comment further on this one.
@timbrd Hi. For now, every volume created through a PVC is a replica-3 volume. In Kubernetes 1.6 and Origin 1.6 we have added the volume type option, with which you can specify replica: 2, so only two nodes need to be up and running.
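For reference, a replica-2 StorageClass might look like the sketch below. The class name and `resturl` are placeholders, not values from this thread, and the exact `apiVersion` depends on your cluster version.

```sh
# Sketch only: name and resturl are placeholders. Requires Kubernetes/Origin >= 1.6.
cat <<'EOF' | kubectl create -f -
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: glusterfs-replica2
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.example.com:8080"   # placeholder heketi endpoint
  volumetype: "replicate:2"                   # replica-2 instead of the default replica-3
EOF
```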
I have encountered the same issue.
Is there any workaround to temporarily remove the failed mount and start over? For example, I would like to remove all of these LVM volumes on the device sdb and start over:
```
$ lsblk
NAME                                                                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb                                                                                 8:16   0  3.7T  0 disk
├─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88_tmeta   252:0    0   12M  0 lvm
│ └─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88-tpool 252:2    0    2G  0 lvm
│   ├─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88     252:3    0    2G  0 lvm
│   └─vg_54fd064328efff4c9addedcc02ddad63-brick_5abf7b2064014b8dbadadc500b76fb88  252:4    0    2G  0 lvm
└─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88_tdata   252:1    0    2G  0 lvm
  └─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88-tpool 252:2    0    2G  0 lvm
    ├─vg_54fd064328efff4c9addedcc02ddad63-tp_5abf7b2064014b8dbadadc500b76fb88     252:3    0    2G  0 lvm
    └─vg_54fd064328efff4c9addedcc02ddad63-brick_5abf7b2064014b8dbadadc500b76fb88  252:4    0    2G  0 lvm
```
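One possible cleanup is sketched below. This is not an official heketi procedure, and it is destructive: only run it if the bricks on sdb are abandoned and heketi's database no longer references the device.

```sh
# DESTRUCTIVE sketch: permanently destroys all bricks on the device.
# The VG name is taken from the lsblk output above; adjust for your host.
VG=vg_54fd064328efff4c9addedcc02ddad63

vgremove -y "$VG"    # remove the volume group and every LV/thin pool inside it
pvremove /dev/sdb    # drop the LVM physical-volume label from the disk
wipefs -a /dev/sdb   # clear any leftover signatures so the disk can be re-added
```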
I have a 3-node OpenShift cluster and have installed gluster-kubernetes on it. It works flawlessly unless a node fails. I shut down one of the 3 machines, but OpenShift/Kubernetes does not notice that one of the gluster pods is no longer reachable. The existing volumes are still mounted in the pods and can be accessed without problems; however, when a new PVC is created, heketi tries to place a brick on the failed node. That obviously does not work, so heketi cancels the creation of the new volume.
Is this reasonable behavior for heketi? IMO, since 2 of the 3 gluster pods are still active, it should still be able to create a new volume.
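For completeness, here is a sketch of the kind of PVC that triggers this. With the default replica-3 volume type, provisioning needs bricks on all three nodes, whereas the replica-2 StorageClass sketched earlier only needs two reachable nodes. The claim and class names are placeholders.

```sh
# Hypothetical PVC; the storage-class name matches the earlier sketch.
cat <<'EOF' | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: glusterfs-replica2
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
```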