Seagate / seagate-exos-x-csi

Seagate EXOS X CSI Driver and dynamic provisioner for containerized environments.
Apache License 2.0

initiating ISCSI connection step takes ages #105

Open Raoul555 opened 7 months ago

Raoul555 commented 7 months ago

Describe the bug

Mounting a volume hosted by a Dell PowerVault disk array in Kubernetes pods takes more than 1 minute when using the seagate-exos-x-csi driver.

To Reproduce

Create a storageClass:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
provisioner: csi-exos-x.seagate.com # Check pkg/driver.go, required for the plugin to recognize this storage class as handled by itself.
volumeBindingMode: Immediate # Prefer this value to avoid unschedulable pods (https://kubernetes.io/docs/concepts/storage/storage-classes/#volume-binding-mode)
allowVolumeExpansion: true
metadata:
  name: dell-storage # Choose the name that fits the best with your StorageClass.
parameters:
  # Secrets name and namespace; they can be the same for the provisioner, controller-publish and controller-expand sections.
  csi.storage.k8s.io/provisioner-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/provisioner-secret-namespace: seagate
  csi.storage.k8s.io/controller-publish-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/controller-publish-secret-namespace: seagate
  csi.storage.k8s.io/controller-expand-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/controller-expand-secret-namespace: seagate
  csi.storage.k8s.io/fstype: ext4 # Desired filesystem
  pool: A # Pool to use on the IQN to provision volumes
  volPrefix: tools
  storageProtocol: iscsi # The storage interface (iscsi, fc, sas) being used for storage i/o
```

Then create a pod with a persistent volume:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-test
spec:
  accessModes:
```

The pod waits for its persistent volume to be mounted, but the PV takes more than 1 minute to become available.

Logs of one of seagate-exos-x-csi-node-server pod:

I0320 14:53:39.483341       1 driver.go:125] === [ROUTINE REQUEST] [0] /csi.v1.Node/NodePublishVolume (49730e675962) <0s> ===
I0320 14:53:39.483348       1 driver.go:132] === [ROUTINE START] [1] /csi.v1.Node/NodePublishVolume (49730e675962) <661ns> ===
I0320 14:53:39.483365       1 node.go:192] "NodePublishVolume call" volumeName="pro_52897734d4e9ed81d0b20c5ac87"
I0320 14:53:39.483391       1 iscsiNode.go:65] "iSCSI connection info:" iqn="iqn.1988-11.com.dell:01.array.bc305b5dd35b" portals=["10.14.11.201","10.14.11.202","10.14.11.203","10.14.12.201","10.14.12.202","10.14.12.203"]
I0320 14:53:39.483401       1 iscsiNode.go:68] "LUN:" lun=13
I0320 14:53:39.483407       1 iscsiNode.go:70] "initiating ISCSI connection..."
I0320 14:53:54.909789       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:08.048697       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:22.159686       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:26.466144       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:26.851810       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:32.982583       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:33.975557       1 node.go:96] >>> /csi.v1.Identity/Probe
I0320 14:54:34.831015       1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
I0320 14:54:53.217063       1 iscsiNode.go:128] "attached device:" path="/dev/dm-0"
I0320 14:54:53.218059       1 iscsiNode.go:159] "saving ISCSI connection info" connectorInfoPath="/var/run/csi-exos-x.seagate.com/iscsi-pro_52897734d4e9ed81d0b20c5ac87.json"
I0320 14:54:53.226942       1 storageService.go:239] Creating ext4 filesystem on device /dev/dm-0
I0320 14:54:53.243779       1 storageService.go:333] isVolumeInUse: findmnt /dev/dm-0, err=exit status 1
I0320 14:54:53.243801       1 storageService.go:149] Checking filesystem (e2fsck -n /dev/dm-0) [Publish]
I0320 14:54:53.253809       1 storageService.go:283] "successfully mounted volume" targetPath="/var/lib/kubelet/pods/a8cc4167-7d7a-4350-a2c0-203f3ab82941/volumes/kubernetes.io~csi/pvc-fbc4e528-9773-4d4e-9ed8-1d0b20c5ac87/mount"
I0320 14:54:53.253831       1 driver.go:136] === [ROUTINE END] [0] /csi.v1.Node/NodePublishVolume (49730e675962) <1m13.77048279s> ===

Description of the created pv:

Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: csi-exos-x.seagate.com
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-exos-x-seagate-com]
StorageClass:    hub-saas-storage
Status:          Bound
Claim:           product/claim-test-1
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        10Mi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi-exos-x.seagate.com
    FSType:            ext4
    VolumeHandle:      pro_1f30c09448884c5ffc3702c5f7e##iscsi##600c0ff0006e11815105fb6501000000
    ReadOnly:          false
    VolumeAttributes:      iqn=iqn.1988-11.com.dell:01.array.bc305b5dd35b
                           pool=A
                           portals=10.14.11.201,10.14.11.202,10.14.11.203,10.14.12.201,10.14.12.202,10.14.12.203
                           storage.kubernetes.io/csiProvisionerIdentity=1710850235982-8081-csi-exos-x.seagate.com
                           storageProtocol=iscsi
                           volPrefix=prod
Events:                <none>

Expected behavior

The PV should be available nearly immediately.

Storage System (please complete the following information):

Environment:

Additional context

My k8s cluster is installed on 3 servers. The server-to-disk-array links are point-to-point connections, i.e. each server is physically directly connected to the storage.

seagate-chris commented 7 months ago

I suspect the problem is related to the IP addresses you're using. Can you change the iSCSI IP addresses so the iSCSI ports don't all appear to be on the same 2 subnets? In other words, define 6 unique subnets, one for each point-to-point connection, so that if you try on host A to ping the iSCSI ports connected to host B, you'll get a "No route to host" error immediately. I think each host is trying to connect to all 6 iSCSI ports and will retry for a while on each port because they all appear to be on locally-attached subnets.
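For illustration, a renumbering along those lines could look like the sketch below; the interface names and the new /30 subnets are purely hypothetical (not taken from this report) and the matching changes would also be needed on the array ports:

```sh
# Hypothetical renumbering: one /30 subnet per point-to-point link.
# Server A, link 1 to the array:
ip addr add 10.14.21.2/30 dev ens1f0   # the array port on this link would become 10.14.21.1/30
# Server A, link 2 to the array:
ip addr add 10.14.22.2/30 dev ens1f1   # the array port on this link would become 10.14.22.1/30
# Use different /30s for the links on servers B and C, so that no host sees
# another host's array ports as being on a locally-attached subnet.
```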

Raoul555 commented 7 months ago

Well, I've tried, and indeed, without a default route, I get a "No route to host" immediately when node A pings an iSCSI port of node B.

But I need to set a default route. In that case the ICMP ping messages are routed via the default route, and so do not fail with "No route to host".

Does the CSI driver rely only on a ping test to determine which ports it can use?

So, I don't have a solution so far...

seagate-chris commented 7 months ago

Oops, I should've seen that coming. To fix that, you can use special routes like these (on server A, using your original B and C IPs as an example):

ip route add unreachable 10.14.11.202/32 # block B's iSCSI ports
ip route add unreachable 10.14.12.202/32
ip route add unreachable 10.14.11.203/32 # block C's iSCSI ports
ip route add unreachable 10.14.12.203/32

This will cause attempts by A to reach these ports to fail immediately rather than time out after some delay, without requiring you to delete your default route. Of course, you'll need to do something similar on B and C.
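Assuming the same addressing pattern (A's iSCSI ports on the .201 addresses and C's on the .203 addresses), the corresponding sketch for server B would be:

```sh
# On server B: block A's and C's iSCSI ports (addresses assumed from the pattern above).
ip route add unreachable 10.14.11.201/32
ip route add unreachable 10.14.12.201/32
ip route add unreachable 10.14.11.203/32
ip route add unreachable 10.14.12.203/32
# Server C would block the .201 and .202 addresses in the same way.
```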

The CSI driver doesn't care about which ports are reachable per se, but the iSCSI initiator automatically discovers all six ports and the ports that are unreachable will slow things down.
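If you want to see which portals the initiator has discovered and which sessions it actually established, the standard open-iscsi tooling on the node can show that, for example:

```sh
iscsiadm -m node          # all discovered portal/target records
iscsiadm -m session       # active sessions; ideally only the locally reachable portals
iscsiadm -m session -P 3  # per-session detail, including the attached SCSI devices
```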

If this doesn't help (or doesn't help enough) please attach a complete set of host logs (node and controller logs, and kernel messages).

Raoul555 commented 7 months ago

Ah, yes, that works. Now I'm able to provision 16 volumes in 4 minutes.

Does this duration for volume creation seem to be what is expected? Or can it be even better?

On another topic, the volumes are not deleted on the disk array side when I delete the Kubernetes PV and PVC. Is that normal?

David-T-White commented 5 months ago

Hello, volume creation time can be influenced by many factors but 16 volumes in 4 minutes seems like reasonably expected performance.

> On another topic, the volumes are not deleted on the disk array side when I delete the Kubernetes PV and PVC. Is that normal?

CSI-managed volumes are expected to be visible on the array side. If they persist after the PV and PVC have been deleted from the cluster, that would not be normal behavior.
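If you do see leftover volumes, a few quick checks on the cluster side can help narrow it down (the names below are placeholders; adjust them to your cluster):

```sh
kubectl get pv                                                      # confirm the PV is really gone, not stuck in Released/Terminating
kubectl describe pv <pv-name>                                       # leftover finalizers can block deletion and the array-side cleanup
kubectl get sc <storageclass-name> -o jsonpath='{.reclaimPolicy}'   # must be Delete for the array volume to be removed with the PV
```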