openebs / lvm-localpv

Dynamically provision stateful, persistent, node-local volumes and filesystems for Kubernetes, integrated with an LVM2 data storage stack on the backend.
Apache License 2.0

Is there a way to mount an LV onto two pods? #281

Closed. iPenx closed this issue 7 months ago.

iPenx commented 8 months ago

In the big data ecosystem it is very common for two pods in the same task to need to share a storage volume.

Is there a way for OpenEBS LVM LocalPV to mount an LV into two pods on the same node?

Abhinandan-Purkait commented 8 months ago

@iPenx Do you want a volume to be used by two different pods? I don't think that is a recommended use case, because two applications writing to the same volume can cause data corruption. I will try it out and post the steps here if it's feasible.

ianroberts commented 8 months ago

Two pods on the same node, yes - that is supposed to be allowed by ReadWriteOnce. If you want to restrict mounting to one pod rather than one node, you would use the new ReadWriteOncePod access mode instead.
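
For illustration, a claim that opts into the stricter single-pod behaviour just swaps the access mode. This is a minimal sketch with made-up names (claim, class and size are placeholders, not from this cluster), and ReadWriteOncePod needs a reasonably recent Kubernetes release:

# Sketch only: a PVC that restricts mounting to a single pod via ReadWriteOncePod.
# Name, storage class and size are illustrative.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-pod-data
spec:
  storageClassName: my-lvm-class
  accessModes:
    - ReadWriteOncePod
  resources:
    requests:
      storage: 10Gi
EOF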

Abhinandan-Purkait commented 8 months ago

@ianroberts IIUC @iPenx wants two pods on the same node to access the volume, which is indeed ReadWriteOnce. Have you tried that with LVM Local PV? Did it work for you, assuming the data-safety part is handled by the application?

ianroberts commented 8 months ago

No, it doesn't work for me either. I've got three pods, all forced to the same node by podAffinity rules, all trying to mount the same PVC:

      volumes:
      - name: shared-pv
        persistentVolumeClaim:
          claimName: coordination-service-data

The PVC is an RWO claim using my LVM Local PV storage class; it has been successfully bound to a PV, which is mounted into the first of the three pods to start. But the other two are stuck Pending, and syslog on the relevant node is full of:

E0126 20:21:31.456258  134147 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/csi/local.csi.openebs.io^pvc-92856173-8283-48b2-a064-984992b1e2f2 podName: nodeName:}" failed. No retries permitted until 2024-01-26 20:23:33.456232155 +0000 UTC m=+971103.443124428 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "pvc-92856173-8283-48b2-a064-984992b1e2f2" (UniqueName: "kubernetes.io/csi/local.csi.openebs.io^pvc-92856173-8283-48b2-a064-984992b1e2f2") pod "coordination-service-worker-767db9b9c5-5tbsk" (UID: "2c893a1d-db28-4b46-a7f4-3a0f6ce3ac1d") : rpc error: code = Internal desc = verifyMount: device already mounted at [/var/snap/microk8s/common/var/lib/kubelet/pods/d9e4820d-f270-42fc-8514-209b7ef68b90/volumes/kubernetes.io~csi/pvc-92856173-8283-48b2-a064-984992b1e2f2/mount]

Abhinandan-Purkait commented 8 months ago

@ianroberts How about using the volume in Block mode? Would that help here?

ianroberts commented 8 months ago

Block mode would just give me a raw block device rather than a mounted filesystem.
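
For reference, this is roughly what consuming the volume in Block mode would look like (a sketch with illustrative names; the claim has to request volumeMode: Block), i.e. the container is handed a device node rather than a mounted filesystem:

# Sketch only: Block-mode PVC plus a pod that attaches the raw device.
# All names and the device path are illustrative.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-data
spec:
  storageClassName: my-lvm-class
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeDevices:            # raw device; the kubelet does not mount a filesystem
    - name: data
      devicePath: /dev/xvda
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-data
EOF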

I'm essentially trying to use a single PV as a shared filesystem between containers in multiple pods, in the same way as you might use an emptyDir to exchange data between two containers in the same pod. This is supposed to work per the Kubernetes definition of RWO:

ReadWriteOnce

the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node. For single pod access, please see ReadWriteOncePod.

ianroberts commented 8 months ago

Ah, apparently there's a shared setting that can be set at the StorageClass level. I'll see if that makes any difference, but it's odd that it's not enabled by default.

Abhinandan-Purkait commented 8 months ago

@ianroberts Let us know if that worked for you. There's a reason to keep that disabled by default, which is to ensure data safety. That said, we might want to make it the default, since this is a valid use case and ReadWriteOncePod now exists to cover the single-pod restriction.

ianroberts commented 8 months ago

Sadly it's not possible to edit the parameters of an existing StorageClass, and I don't have any spare devices to make a new LVM volgroup to test a new class on this cluster. For testing purposes I've got up and running by using cStor for the multiple-pods RWO volume, and that works fine; I'll look into the Local PV Hostpath provisioner for the longer term.

Abhinandan-Purkait commented 8 months ago

@ianroberts StorageClass parameters cannot be edited once created, but you can always create a new storage class, or recreate (delete and create) the existing one. A storage class is just a config for volume creation.
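
For example, one way to turn on the shared parameter without touching existing volumes is to recreate the class under the same name. A rough sketch (class and file names are illustrative):

# Sketch only: recreate a StorageClass with shared: "yes" added under .parameters.
# PVs/PVCs already provisioned from the old class are not affected.
kubectl get storageclass my-lvm-class -o yaml > my-lvm-class.yaml
# Edit my-lvm-class.yaml: add   shared: "yes"   under parameters, and drop
# read-only fields such as uid, resourceVersion and creationTimestamp.
kubectl delete storageclass my-lvm-class
kubectl apply -f my-lvm-class.yaml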

ianroberts commented 8 months ago

A storage class is just a config for volume creation.

Ah, ok, this is a production cluster and I was worried that deleting the SC would affect the existing volumes that I very much don’t want to lose!

abhilashshetty04 commented 7 months ago

@ianroberts, the SC doesn't have any bearing on already provisioned PVCs. It is only read when a PVC that refers to it is created.

ianroberts commented 7 months ago

Indeed - encouraged by the previous comments, I deleted and re-created the SC with the shared setting enabled, and that has worked correctly.

So I guess my original issue could be closed as “not planned”, unless

… we might want to make it the default, since this is a valid use case and ReadWriteOncePod now exists to cover the single-pod restriction.

iPenx commented 7 months ago

@ianroberts IIUC @iPenx wants two pods on the same node to access the volume, which is indeed ReadWriteOnce. Have you tried that with LVM Local PV? Did it work for you, assuming the data-safety part is handled by the application?

Yes, I have tried two pods and one PVC:

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: localpv-data
spec:
  storageClassName: localpv-lvm-ephemeral
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

---
apiVersion: v1
kind: Pod
metadata:
  name: pod-1
spec:
  terminationGracePeriodSeconds: 1
  containers:
  - name: busybox
    image: busybox
    command:
    - cat
    args:
    - "-n"
    volumeMounts:
       - mountPath: /data
         name: data
    tty: true
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: localpv-data

---
apiVersion: v1
kind: Pod
metadata:
  name: pod-2
spec:
  terminationGracePeriodSeconds: 1
  containers:
  - name: busybox
    image: busybox
    command:
    - cat
    volumeMounts:
       - mountPath: /data
         name: data
    tty: true
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: localpv-data

The StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: localpv-lvm-ephemeral
parameters:
  storage: lvm
  volgroup: vg-localpv
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: k8s.io/hostname
    values:
    - xxx

and one of the pods gets a FailedMount event:

kubectl get event

58s         Warning   FailedMount             pod/pod-1                            MountVolume.SetUp failed for volume "pvc-25dd8e5c-fd39-4735-b6c9-8f0bafc43d05" : rpc error: code = Internal desc = verifyMount: device already mounted at [/var/lib/kubelet/pods/2eca2f78-bd5e-40c2-bfc8-75135e2e2585/volumes/kubernetes.io~csi/pvc-25dd8e5c-fd39-4735-b6c9-8f0bafc43d05/mount]

I think that after the LV is mounted for the first running pod, the next pod gets "device already mounted" because of ReadWriteOnce mode: in this mode, the lvm-localpv agent mounts the LV on a pod's private path, like "/var/lib/kubelet/pods/2eca2f78-bd5e-40c2-bfc8-75135e2e2585/volumes/kubernetes.io~csi/pvc-25dd8e5c-fd39-4735-b6c9-8f0bafc43d05/mount" in the error above.
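
A rough way to confirm this on the node is to look up the existing mount for the LV (device and PV names below are illustrative; lvm-localpv typically names the LV after the PV, e.g. pvc-<uuid>, in the configured volgroup):

# Sketch only: inspect where the LV is already mounted on the node.
# Adjust the volgroup and PV name to your cluster.
findmnt --source /dev/vg-localpv/pvc-25dd8e5c-fd39-4735-b6c9-8f0bafc43d05
# or simply search all mounts for the PV name:
findmnt | grep pvc-25dd8e5c-fd39-4735-b6c9-8f0bafc43d05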

ianroberts commented 7 months ago

@iPenx yes, we've established that you need to set shared: "yes" in the StorageClass parameters section in order to get proper ReadWriteOnce semantics (mounting the volume into multiple pods on the same node). Without the shared option, the LVM Local PV class treats ReadWriteOnce as if it were ReadWriteOncePod.

iPenx commented 7 months ago

@ianroberts Thank you for replying so quickly. Let me try it.

iPenx commented 7 months ago

It worked. Thank you all.

I added shared: "yes" to the parameters:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: localpv-lvm-ephemeral
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  storage: lvm
  volgroup: vg-localpv
  shared: "yes"
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: k8s.io/hostname
    values:
    - xxx

I closed this issue.