openebs / lvm-localpv

Dynamically provision Stateful Persistent Node-Local Volumes & Filesystems for Kubernetes that is integrated with a backend LVM2 data storage stack.
Apache License 2.0

PV fails to schedule when storage class has a single-value node label affinity #275

Closed: alexanderxc closed this issue 4 weeks ago

alexanderxc commented 7 months ago

What steps did you take and what happened: Standard deployment on a k3s cluster via either the manifests or the Helm chart (in the default namespace), with a StorageClass deployed using the following configuration:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv
allowVolumeExpansion: true
parameters:
  storage: "lvm"
  volgroup: "openebs-vg"
provisioner: local.csi.openebs.io
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
    - key: storage-device
      values:
        - lvm

All 3 nodes in the cluster are labelled with storage-device=lvm and have the volume group correctly set up.
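For reference, the labels and volume groups were set up roughly like this (node names here are placeholders for my three nodes):

kubectl label node node-1 node-2 node-3 storage-device=lvm --overwrite
kubectl get nodes -L storage-device   # confirm the label is present on all 3 nodes
# and on each node:
sudo vgs openebs-vg                   # confirm the volume group exists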

What did you expect to happen: I expected the PVs to be scheduled and created correctly; instead, the PVs failed to be scheduled due to a node availability issue.

Anything else you would like to add: Unfortunately I changed configurations before gathering additional data.

This needs further investigation.

Environment:

abhilashshetty04 commented 5 months ago

Hi @alexanderxc, regarding "PV are correctly scheduled and created, instead PV failed to be scheduled for availability issue": can you please elaborate on this? Do you mean scheduling works when volumeBindingMode is Immediate?

If yes, can you change it back to Immediate and share the logs from the CSI pod?
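Something along these lines should do; the namespace and pod names depend on how lvm-localpv was installed, so treat them as placeholders:

kubectl -n openebs get pods -o wide                              # locate the lvm-localpv controller and node pods
kubectl -n openebs logs <lvm-controller-pod> --all-containers
kubectl -n openebs logs <lvm-node-pod-on-affected-node> --all-containers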

alexanderxc commented 5 months ago

Hi @abhilashshetty04, it has been a while and I cannot confirm 100% (I cannot test right now as the cluster is in use by another project), but to clarify the above behavior:

As mentioned, it has been a while and I had to proceed with the workaround. But I can attach here the configuration that generates the error in my case. Please note that I was running k3s on bare metal and each volume group is backed by a local SSD.

Mongo RS:

apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo
  namespace: db
spec:
  members: 3
  type: ReplicaSet
  version: "7.0.4"
  security:
    authentication:
      modes: ["SCRAM-SHA-1"]
  users:
    - name: admin
      db: admin
      passwordSecretRef: # a reference to the secret that will be used to generate the user's password
        name: admin-password
      roles:
        - name: clusterAdmin
          db: admin
        - name: userAdminAnyDatabase
          db: admin
      scramCredentialsSecretName: admin-scram
  statefulSet:
    spec:
      selector:
        matchLabels:
          app: mongodb
      template:
        metadata:
          labels:
            app: mongodb
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - mongodb
                  topologyKey: kubernetes.io/hostname
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: openebs-lvmpv
            resources:
              requests:
                storage: 20G
        - metadata:
            name: logs-volume
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: openebs-lvmpv
            resources:
              requests:
                storage: 10G

StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv
allowVolumeExpansion: true
parameters:
  storage: "lvm"
  volgroup: "openebs-vg"
  fsType: xfs
provisioner: local.csi.openebs.io
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
    - key: storage-device
      values:
        - lvm

Each node was of course labeled with storage-device=lvm.

Also, please note that the allowedTopologies configuration mentioned in the original report worked correctly and allowed scheduling of the MongoDB RS.
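For completeness, the topology keys that the node plugin actually advertises can be inspected on the CSINode objects; if storage-device is not listed there, an allowedTopologies entry on that key cannot be satisfied (node name is a placeholder):

kubectl get csinode <node-name> -o yaml
# under spec.drivers, the local.csi.openebs.io entry lists the topologyKeys the node agent registered;
# if storage-device is not among them, the PV cannot be placed on that node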

dsharma-dc commented 1 month ago

@alexanderxc if you want to use a custom label, the CSI LVM node agent first needs to be made aware of the topology key, since it is not one of the default keys. Once the DaemonSet knows about the key, provisioning will work. Please refer to the documentation here: https://openebs.io/docs/user-guides/local-storage-user-guide/local-pv-lvm/lvm-configuration for how to edit the DaemonSet to add a new topology key. After the custom key is added to the known topology keys, provisioning works as expected.
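As a rough sketch of the kind of edit that page describes (the DaemonSet name and the exact env wiring vary by release and install method, so check the linked docs rather than copying this verbatim):

kubectl edit daemonset -n openebs openebs-lvm-node   # DaemonSet name is an assumption; it differs between manifest and Helm installs

# in the node agent container, append the custom key to the allowed topology keys, e.g.:
        env:
          - name: ALLOWED_TOPOLOGIES
            value: "kubernetes.io/hostname,storage-device"

The DaemonSet rollout restarts the node pods, after which the key is advertised and the PVs bind.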

alexanderxc commented 4 weeks ago

Thank you. I totally missed the LVM documentation page about the agent topology key. It indeed works well. The issue can be closed.