gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

Creating Volume FAILS (no space) when a Gluster node crashes #522

Closed Webgardener closed 5 years ago

Webgardener commented 5 years ago

Hi.

I set up a three-node GlusterFS cluster. Gluster volumes are managed by Heketi. Everything works fine (volume creation, file replication...) until a Gluster node crashes. After that, volume creation fails.

Here is what happens when a Gluster node is down (2 nodes remaining):

  1. Heketi still thinks that the failed node is part of the cluster:

[root@heketi-59976dcdb7-64wbx /]# heketi-cli node list 989192b157113e3b892fbed15ef3916e
Id:11795825ae27aa7f38f6d496755dc26c Cluster:989192b157113e3b892fbed15ef3916e
Id:73b4ad4c78ed937577718908e123ee15 Cluster:989192b157113e3b892fbed15ef3916e
Id:bb1ad190bf01963e84519788b69c2239 Cluster:989192b157113e3b892fbed15ef3916e

Then, creating a new PVC fails:

persistentvolumeclaim/test2-claimname-gluster-pvc **Pending** test-gluster-sc 12m

$ kubectl -n catalyse describe pvc test2-claimname-gluster-pvc
Warning ProvisioningFailed 2m19s (x41 over 12m) persistentvolume-controller Failed to provision volume with StorageClass "test-gluster-sc": failed to create volume: failed to create volume: Failed to allocate new volume: No space

It seems that Gluster with Heketi is not yet fault tolerant.

nixpanic commented 5 years ago

This is expected behaviour. A Gluster Volume with replica-3 needs three storage servers to be available. If only two are reachable by Heketi, creating the volume will not be possible.

If you want to be resilient against losing one storage server, you will need to set up a fourth one. That way there are still three available when one goes down, and Heketi is smart enough to create the volume on the remaining three.
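
The counting argument above can be sketched in a few lines of Python. This is a toy model only, not Heketi's actual allocator: the function names and the node names are hypothetical, and it assumes the only constraint is that each of the replica bricks must land on a different reachable node (free space per device is ignored).

```python
# Toy model of the replica-placement constraint. NOT Heketi's real allocator.
# All names here (functions, node names) are made up for illustration.

def can_place_replica_volume(online_nodes, replica=3):
    """Return True if each replica brick can go on a distinct online node."""
    return len(set(online_nodes)) >= replica


def after_failure(nodes, failed):
    """Return the nodes that are still reachable once `failed` is down."""
    return [n for n in nodes if n != failed]


three_node_cluster = ["node-a", "node-b", "node-c"]
four_node_cluster = three_node_cluster + ["node-d"]

# Three nodes, one down: only two candidates remain for three bricks.
print(can_place_replica_volume(after_failure(three_node_cluster, "node-a")))

# Four nodes, one down: three candidates remain, so placement is possible.
print(can_place_replica_volume(after_failure(four_node_cluster, "node-a")))
```

Under these assumptions the first check prints `False` and the second prints `True`, which matches the behaviour described: with replica-3, a three-node cluster cannot accept new volumes while one node is unreachable, whereas a four-node cluster can.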

Webgardener commented 5 years ago

Thank you.