piraeusdatastore / piraeus-ha-controller

High Availability Controller for stateful workloads using storage provisioned by Piraeus
Apache License 2.0
15 stars 8 forks source link

Add an option to ignore Pods with additional types of storage #18

Closed WanzenBug closed 2 years ago

WanzenBug commented 2 years ago

nit: Shouldn't we add an option to ignore other types of storage. Eg. if iscsi is used we can't grantee successful fencing for them.

_Originally posted by @kvaps in https://github.com/piraeusdatastore/helm-charts/pull/13#discussion_r926784754_

kvaps commented 2 years ago

Imagine hypothetical case, when pod has two devices connected:

+----------+
|          |<---[drbd device]
|   pod    |
|          |<---[iscsi device]
+----------+

Then node where this pod is running losing network connection. The ha-controller forcefully removes this pod, and it's starting working on another node.

Then old node suddenly recovers network connection:

In fact, DRBD is protected because of quorum, but iSCSI is not, so node can continue writing on this device because it has filesystem still mounted.

WanzenBug commented 2 years ago

I'm wondering if this is already "solved", in the sense that the HA Controller doesn't remove the VolumeAttachment for other drivers. That should prevent a new Pod from attaching to all volumes, so it would never start.

kvaps commented 2 years ago

If I remember correct the in-tree storage drivers, eg. nfs, iscsi and ceph (deprecated in favor ceph-csi) do not use Kubernetes volumeattachments mechanism

WanzenBug commented 2 years ago

Perhaps the solution then is to check if a CSI driver or in-tree driver is used, and if it's in-tree refusing failover?

kvaps commented 2 years ago

Some CSI drivers does not require volumeattachments as well, eg nfs-csi: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/v4.1.0/csi-nfs-driverinfo.yaml#L7

This way I think it's quite complicated to implement this check. I think it's better to give an option to let people decide.

--evict-mixed-volume-types=<true|false>

which is not enabled by default

Of course we should ignore some known types, like ConfigMap, Secret, downwardAPI, git (currently deprecated) and so on. However I'm not sure about hostPath and local

WanzenBug commented 2 years ago

Looks like yours is the best solution.

In general, we can "ignore" all volumes which are read-only I'd say

WanzenBug commented 2 years ago

Initial implementation merged. We can certainly make the current implementation smarter: I did not go through all build-in storage types and check their read-only state, for example. But it should be good enough for now.