sergelogvinov / proxmox-csi-plugin

Proxmox CSI Plugin
Apache License 2.0

Unable to get storage status #200

Closed vehagn closed 3 months ago

vehagn commented 4 months ago

Thank you for creating this CSI plugin, it was just what I was looking for!

Bug Report

Unable to get capacity of volumes

I0602 10:54:08.501588       1 controller.go:496] GetCapacity: region=homelab, zone=abel, storageName=data
E0602 10:54:14.607396       1 controller.go:513] GetCapacity: failed to get storage status: 400 Parameter verification failed.
I0602 10:55:02.344724       1 controller.go:484] GetCapacity: called with args {"accessible_topology":{"segments":

I'm unsure of how to debug this issue and would greatly appreciate any pointers on how to resolve it.

Description

I'm running Proxmox 8.2.2 and Kubernetes on a Debian 12 VM.


I had trouble getting Proxmox CCM to work, so I labelled my node manually:

kubectl label no k8s-ctrl-01 topology.kubernetes.io/region=homelab
kubectl label no k8s-ctrl-01 topology.kubernetes.io/zone=abel
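The manually applied labels can be double-checked before installing the plugin; a small sketch (node name `k8s-ctrl-01` as above):

```shell
# Confirm the topology labels the CSI controller matches against.
NODE="k8s-ctrl-01"
kubectl get node "$NODE" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/region} {.metadata.labels.topology\.kubernetes\.io/zone}{"\n"}'
```

The zone label must equal the Proxmox node name, since the plugin uses it to address the node in the Proxmox API.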

and installed Proxmox CSI plugin from https://raw.githubusercontent.com/sergelogvinov/proxmox-csi-plugin/v0.6.1/docs/deploy/proxmox-csi-plugin-release.yml

with the following config

apiVersion: v1
kind: Secret
metadata:
  name: proxmox-csi-plugin
  namespace: csi-proxmox
stringData:
  config.yaml: |
    clusters:
    - url: https://192.168.1.62:8006/api2/json
      insecure: true
      token_id: "kubernetes-csi@pve!csi"
      token_secret: "<secret>"
      region: homelab

I figured I had to create a cluster out of my single Proxmox node for this to work, so I did. (Is this necessary?)


I created the following StorageClass

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-zfs
allowVolumeExpansion: true
parameters:
  csi.storage.k8s.io/fstype: zfs
  storage: data
provisioner: csi.proxmox.sinextra.dev
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

but I'm unsure whether the `storage` name here should be the same as the Proxmox volume's.
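As it turns out (see the follow-up below), the `storage` parameter must be an existing Proxmox storage ID, not a storage type. Assuming shell access to the Proxmox host, the valid IDs can be listed:

```shell
# List the storage pools Proxmox knows about; the StorageClass
# "storage" parameter must match an entry in the first column
# (e.g. local, local-zfs).
STORAGE="data"   # value used in the StorageClass above
pvesm status
# Check whether that exact ID exists:
pvesm status | awk 'NR>1 {print $1}' | grep -x "$STORAGE"
```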


Trying to run a variant of your example pod

apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: pve-csi
spec:
  tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
  containers:
    - name: alpine
      image: alpine
      command: ["sleep","6000"]
      volumeMounts:
        - name: pvc
          mountPath: /mnt
  terminationGracePeriodSeconds: 1
  volumes:
    - name: pvc
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              type: pvc-volume
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: local-zfs
            resources:
              requests:
                storage: 1Gi

it fails to schedule due to insufficient free storage:

FailedScheduling  109s  default-scheduler  0/1 nodes are available:
 waiting for ephemeral volume controller to create the persistentvolumeclaim "test-pvc".
 preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.                                                                                                         
Warning  FailedScheduling  107s  default-scheduler  0/1 nodes are available: 
1 node(s) did not have enough free storage. preemption: 0/1 nodes are available: 
1 Preemption is not helpful for scheduling.  

Logs

Controller:

❯ k logs -n csi-proxmox proxmox-csi-plugin-controller-58986cf656-mc72v -c proxmox-csi-plugin-controller
I0602 10:54:01.633620       1 main.go:52] Driver version 0.4.0, GitVersion v0.6.1, GitCommit ac1ef92
I0602 10:54:01.633808       1 main.go:53] Driver CSI Spec version: 1.9.0
I0602 10:54:01.634003       1 merged_client_builder.go:163] Using in-cluster namespace
I0602 10:54:01.634156       1 merged_client_builder.go:121] Using in-cluster configuration
I0602 10:54:01.635918       1 main.go:113] Listening for connection on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0602 10:54:01.869224       1 identity.go:81] Probe: called
I0602 10:54:01.871502       1 identity.go:38] GetPluginInfo: called
I0602 10:54:01.873012       1 identity.go:48] GetPluginCapabilities: called
I0602 10:54:01.875233       1 controller.go:276] ControllerGetCapabilities: called with args {}
I0602 10:54:02.143752       1 identity.go:81] Probe: called
I0602 10:54:02.145600       1 identity.go:38] GetPluginInfo: called
I0602 10:54:02.147215       1 identity.go:48] GetPluginCapabilities: called
I0602 10:54:02.149159       1 controller.go:276] ControllerGetCapabilities: called with args {}
I0602 10:54:02.344830       1 controller.go:484] GetCapacity: called with args {"accessible_topology":{"segments":{"topology.kubernetes.io/region":"homelab","topology.kubernetes.io/zone":"abel"}},"parameters":{"csi.storage.k8s.io/fstype":"zfs","storage":"data"},"volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{}}]}
I0602 10:54:02.345008       1 controller.go:496] GetCapacity: region=homelab, zone=abel, storageName=data
I0602 10:54:02.462630       1 identity.go:81] Probe: called
I0602 10:54:02.464741       1 identity.go:38] GetPluginInfo: called
I0602 10:54:02.466623       1 identity.go:48] GetPluginCapabilities: called
I0602 10:54:02.468569       1 controller.go:276] ControllerGetCapabilities: called with args {}
I0602 10:54:02.469726       1 controller.go:276] ControllerGetCapabilities: called with args {}
I0602 10:54:02.724141       1 identity.go:38] GetPluginInfo: called
E0602 10:54:08.492138       1 controller.go:513] GetCapacity: failed to get storage status: 400 Parameter verification failed.
I0602 10:54:08.501505       1 controller.go:484] GetCapacity: called with args {"accessible_topology":{"segments":{"topology.kubernetes.io/region":"homelab","topology.kubernetes.io/zone":"abel"}},"parameters":{"csi.storage.k8s.io/fstype":"zfs","storage":"data"},"volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{}}]}
I0602 10:54:08.501588       1 controller.go:496] GetCapacity: region=homelab, zone=abel, storageName=data
E0602 10:54:14.607396       1 controller.go:513] GetCapacity: failed to get storage status: 400 Parameter verification failed.
I0602 10:55:02.344724       1 controller.go:484] GetCapacity: called with args {"accessible_topology":{"segments":{"topology.kubernetes.io/region":"homelab","topology.kubernetes.io/zone":"abel"}},"parameters":{"csi.storage.k8s.io/fstype":"zfs","storage":"data"},"volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{}}]}
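The 400 indicates the Proxmox API itself is rejecting the request. `GetCapacity` queries the storage status endpoint for the given node and storage; a hypothetical manual reproduction with the same cluster values (`<secret>` left elided):

```shell
# Sketch: call the Proxmox storage status endpoint directly.
# An invalid storage ID (as turned out to be the case here)
# produces an error response like the one in the logs.
PVE="https://192.168.1.62:8006"
NODE="abel"      # Proxmox node name (the zone label)
STORAGE="data"   # StorageClass "storage" parameter
curl -ks -H 'Authorization: PVEAPIToken=kubernetes-csi@pve!csi=<secret>' \
  "$PVE/api2/json/nodes/$NODE/storage/$STORAGE/status"
```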

Node:

❯ k logs -n csi-proxmox proxmox-csi-plugin-node-7xdwl -c proxmox-csi-plugin-node
I0602 10:54:01.383586       1 main.go:54] Driver version 0.4.0, GitVersion v0.6.1, GitCommit ac1ef92
I0602 10:54:01.383932       1 main.go:55] Driver CSI Spec version: 1.9.0
I0602 10:54:01.383950       1 main.go:83] Building kube configs for running in cluster...
I0602 10:54:01.396491       1 mount_linux.go:282] Detected umount with safe 'not mounted' behavior
I0602 10:54:01.396822       1 main.go:140] Listening for connection on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0602 10:54:01.621169       1 identity.go:38] GetPluginInfo: called
I0602 10:54:01.844661       1 identity.go:38] GetPluginInfo: called
I0602 10:54:02.069367       1 node.go:532] NodeGetInfo: called with args {}
I0602 10:54:02.069836       1 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: proxmox-csi-node/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <masked>" 'https://10.96.0.1:443/api/v1/nodes/k8s-ctrl-01'
I0602 10:54:02.071330       1 round_trippers.go:510] HTTP Trace: Dial to tcp:10.96.0.1:443 succeed
I0602 10:54:02.085960       1 round_trippers.go:553] GET https://10.96.0.1:443/api/v1/nodes/k8s-ctrl-01 200 OK in 16 milliseconds
I0602 10:54:02.086014       1 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 1 ms TLSHandshake 7 ms ServerProcessing 5 ms Duration 16 ms
I0602 10:54:02.086024       1 round_trippers.go:577] Response Headers:
I0602 10:54:02.086037       1 round_trippers.go:580]     Content-Type: application/json
I0602 10:54:02.086044       1 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: 524a3a7e-aef2-4db2-ad47-04e7e4a52893
I0602 10:54:02.086049       1 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: 43a8cdaf-7d68-437e-a340-a2dbef62d7e6
I0602 10:54:02.086054       1 round_trippers.go:580]     Date: Sun, 02 Jun 2024 10:54:02 GMT
I0602 10:54:02.086058       1 round_trippers.go:580]     Audit-Id: 0f6437ba-fc63-46ed-82b8-217daddbecaf
I0602 10:54:02.086061       1 round_trippers.go:580]     Cache-Control: no-cache, private
I0602 10:54:02.086605       1 request.go:1212] Response Body: {"kind":"Node","apiVersion":"v1","metadata":{"name":"k8s-ctrl-01","uid":"6fdb0a6c-f129-41eb-8c83-31b2a2728357","resourceVersion":"430618","creationTimestamp":"2024-06-s...}":{" [truncated 9226 chars]


vehagn commented 4 months ago

I think I managed to fix it myself. The StorageClass was wrong.

I changed it to

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: proxmox-data
allowVolumeExpansion: true
parameters:
  # I misunderstood this to be the backing volume's filesystem,
  # not the desired filesystem for the new volume.
  csi.storage.k8s.io/fstype: xfs
  # Changed this to match an existing storage ID. I misunderstood this to be
  # the type of storage used, not the ID of a storage device.
  storage: local-zfs
  cache: writethrough
  ssd: "true"
mountOptions:
  - noatime
provisioner: csi.proxmox.sinextra.dev
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
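With the corrected class (and `storageClassName: proxmox-data` in the pod's claim template), provisioning can be confirmed quickly. Assuming the example pod manifest is saved as `test-pod.yaml` (hypothetical filename):

```shell
# Re-apply the example pod and watch the ephemeral PVC bind.
NS="pve-csi"
kubectl apply -f test-pod.yaml
kubectl get pvc -n "$NS" -w
# Once Bound, the Proxmox side should show a new disk image
# on the local-zfs storage.
```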
vehagn commented 3 months ago

Closing this as I was able to resolve the issue by reading the documentation more thoroughly as explained in my previous comment.