Raoul555 opened 7 months ago
I suspect the problem is related to the IP addresses you're using. Can you change the iSCSI IP addresses so the iSCSI ports don't all appear to be on the same 2 subnets? In other words, define 6 unique subnets, one for each point-to-point connection, so that if you try on host A to ping the iSCSI ports connected to host B, you'll get a "No route to host" error immediately. I think each host is trying to connect to all 6 iSCSI ports and will retry for a while on each port because they all appear to be on locally attached subnets.
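For illustration, a minimal sketch of such an addressing scheme, assuming six /30 subnets and hypothetical interface names (none of these values come from the reporter's setup):

```sh
# Host A, first iSCSI link (assumed interface names and addresses, for illustration only)
ip addr add 10.14.1.1/30 dev ens1f0   # array-side port would be 10.14.1.2/30
# Host A, second iSCSI link
ip addr add 10.14.2.1/30 dev ens1f1   # array-side port would be 10.14.2.2/30
# Hosts B and C each get two more unique /30 subnets (10.14.3.0/30 ... 10.14.6.0/30),
# so no host has a locally attached route to another host's iSCSI ports.
```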
Well, I've tried, and indeed, without a default route I immediately get "No route to host" when node A pings an iSCSI port of node B.
But I need to set a default route. In that case the ICMP ping messages are routed via the default route, so the ping does not fail with "No route to host".
Does the CSI driver rely only on a ping test to determine which ports it can use?
So, I don't have a solution so far...
Oops, I should've seen that coming. To fix that, you can use a special route like this (on server A, using your original B and C IPs as an example):
```sh
ip route add unreachable 10.14.11.202/32 # block B's iSCSI ports
ip route add unreachable 10.14.12.202/32
ip route add unreachable 10.14.11.203/32 # block C's iSCSI ports
ip route add unreachable 10.14.12.203/32
```
This will cause attempts by A to reach these ports to fail immediately rather than time out after some delay, without requiring you to delete your default route. Of course, you'll need to do something similar on B and C.
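If it helps, you can verify the routes with a quick ping from A; with the unreachable route installed it should fail immediately (typical Linux ping output, not captured from this setup):

```sh
$ ping -c1 10.14.11.202
ping: connect: No route to host
```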
The CSI driver doesn't care which ports are reachable per se, but the iSCSI initiator automatically discovers all six ports, and the unreachable ones slow things down.
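For reference, this is the discovery the initiator performs automatically; you can also run it by hand against one reachable portal, and the array will typically report every portal it has, including the ones this host can't reach (the portal IP here is an assumption for illustration):

```sh
# SendTargets discovery against a single portal; the response usually lists
# all of the array's iSCSI portals, reachable or not from this host.
iscsiadm -m discovery -t sendtargets -p 10.14.11.201
```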
If this doesn't help (or doesn't help enough) please attach a complete set of host logs (node and controller logs, and kernel messages).
Ah, that works. Now I'm able to provision 16 volumes in 4 minutes.
Does this volume-creation time match what's expected, or can it be improved?
On another topic: the volumes are not deleted on the disk array side when I delete the Kubernetes PV and PVC. Is that normal?
Hello, volume creation time can be influenced by many factors, but 16 volumes in 4 minutes seems like reasonably expected performance.
> On another topic: the volumes are not deleted on the disk array side when I delete the Kubernetes PV and PVC. Is that normal?
CSI-managed volumes are expected to be visible on the array side. If they persist after the PV and PVC have been deleted from the cluster, that would not be normal behavior.
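One thing worth ruling out (my suggestion, not something confirmed in this thread): the array-side volume is only removed when the PV's reclaim policy is Delete; a Retain policy leaves the backing volume in place. The PV name below is a placeholder:

```sh
# Print the reclaim policy of a PV; "Delete" means deleting the PV should
# also remove the backing volume on the array.
kubectl get pv <pv-name> -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
```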
Describe the bug
Mounting a volume hosted on a Dell PowerVault disk array in Kubernetes pods takes more than 1 minute, using the seagate-exos-x-csi driver.
To Reproduce
Create a storageClass:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
provisioner: csi-exos-x.seagate.com # Check pkg/driver.go; required for the plugin to recognize this storage class as handled by itself.
volumeBindingMode: Immediate # Prefer this value to avoid unschedulable pods (https://kubernetes.io/docs/concepts/storage/storage-classes/#volume-binding-mode)
allowVolumeExpansion: true
metadata:
  name: dell-storage # Choose the name that fits best with your StorageClass.
parameters:
  # Secrets name and namespace; they can be the same for the provisioner, controller-publish, and controller-expand sections.
  csi.storage.k8s.io/provisioner-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/provisioner-secret-namespace: seagate
  csi.storage.k8s.io/controller-publish-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/controller-publish-secret-namespace: seagate
  csi.storage.k8s.io/controller-expand-secret-name: seagate-exos-x-csi-secrets
  csi.storage.k8s.io/controller-expand-secret-namespace: seagate
  csi.storage.k8s.io/fstype: ext4 # Desired filesystem
  pool: A # Pool to use on the IQN to provision volumes
  volPrefix: tools
  storageProtocol: iscsi # The storage interface (iscsi, fc, sas) being used for storage I/O
```
Then create a pod with a persistent volume:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-test
spec:
  accessModes:
    # (the accessModes entries were truncated in the original report)
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-test
spec:
  nodeName: kube-tool-worker-01
  containers:
    # (the container spec was truncated in the original report)
```
The pod waits for its persistent volume to be mounted, but the PV takes more than 1 minute to become available.
Logs of one of the seagate-exos-x-csi-node-server pods:
Description of the created PV:
Expected behavior
The PV should be available nearly immediately.
Storage System (please complete the following information):
Environment:
Additional context
My k8s cluster is installed on 3 servers. The server-to-disk-array links are point-to-point connections, i.e. each server is physically directly connected to the storage.