openebs / lvm-localpv

Dynamically provision Stateful Persistent Node-Local Volumes & Filesystems for Kubernetes that is integrated with a backend LVM2 data storage stack.
Apache License 2.0

When creating lvmvolume without check whether lvmnode matches the vgpattern #174

Open zwForrest opened 2 years ago

zwForrest commented 2 years ago

What steps did you take and what happened: The Kubernetes cluster has 3 nodes: node A has volume group vg_hdd, node B has volume group vg_hdd, and node C has volume group vg_ssd. All nodes have the same topology label.

PVC YAML

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lvm-ssd
  namespace: openebs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: openebs-lvmpv-ssd

StorageClass YAML

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv-ssd
parameters:
  storage: lvm
  vgpattern: vg_ssd*
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

After creating the PVC, the lvmvolume may be created on node A or node B, because the scheduling algorithm does not take volume group information into account when matching. Creation of the lvmvolume then fails.

What did you expect to happen:

Volume group information, not only topology information, needs to be considered when creating lvmvolumes. After scheduling, the volume group information should be checked.

pawanpraka1 commented 2 years ago

After creating the PVC, the lvmvolume may be created on nodeA or nodeB,

@zwForrest, if node A and node B don't have a volume group matching vg_ssd*, the volume should never be created on those nodes. Can you check the volume groups available on those nodes? Also, note that a temporary volume object may be created with the status Pending/Failed, but it will be deleted automatically without transitioning to the Ready state.

I would recommend using delayed binding so that Kubernetes can use the storage information when making the scheduling decision. You can set volumeBindingMode: WaitForFirstConsumer in the StorageClass to enable that.
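
A minimal sketch of that change, assuming the StorageClass from the original report is otherwise kept as-is and only volumeBindingMode differs:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv-ssd
parameters:
  storage: lvm
  vgpattern: vg_ssd*
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
# Delay volume binding until a pod that uses the PVC is scheduled
volumeBindingMode: WaitForFirstConsumer
```

With this mode the PVC stays Pending until a pod references it. volumeBindingMode cannot be changed on an existing StorageClass, so the StorageClass would need to be deleted and recreated.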

dsharma-dc commented 3 months ago

@abhilashshetty04 Please take a look at this one as well.

abhilashshetty04 commented 1 month ago

@zwForrest, we have created an enhancement ticket: https://github.com/openebs/lvm-localpv/issues/312. It also has some additional suggestions. Please take a look.