nutanix / helm

Nutanix Helm Charts repository
https://nutanix.github.io/helm/
MIT License

Can't mount NutanixVolume in Container #68

Closed hornet83 closed 2 years ago

hornet83 commented 2 years ago

Hi,

We have installed the nutanix-csi and nutanix-csi-snapshot charts:

NAME                    NAMESPACE   REVISION    UPDATED                                     STATUS      CHART                       APP VERSION
nutanix-csi             ntnx-system 1           2022-08-25 13:13:35.439657844 +0200 CEST    deployed    nutanix-csi-storage-2.5.4   2.5.1      
nutanix-csi-snapshot    ntnx-system 1           2022-08-25 13:13:12.811057445 +0200 CEST    deployed    nutanix-csi-snapshot-6.0.1  6.0.1

We can create PVCs fine, PVs get created automatically as expected.

k get pvc
NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
data-wordpress-15-1661426949-mariadb-0   Bound    pvc-f9546b9d-e060-4e24-832e-c6d1cd7e3afe   8Gi        RWO            nutanix-volume   27m
data-wordpress-15-1661427441-mariadb-0   Bound    pvc-d211c320-a276-441d-b70b-092b386b74a6   8Gi        RWO            nutanix-volume   19m
wordpress-15-1661427441                  Bound    pvc-a1f77322-153f-422d-8a15-e94f230bd461   10Gi       RWO            nutanix-volume   19m
k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                            STORAGECLASS     REASON   AGE
pvc-a1f77322-153f-422d-8a15-e94f230bd461   10Gi       RWO            Delete           Bound    default/wordpress-15-1661427441                  nutanix-volume            19m
pvc-d211c320-a276-441d-b70b-092b386b74a6   8Gi        RWO            Delete           Bound    default/data-wordpress-15-1661427441-mariadb-0   nutanix-volume            19m
pvc-f9546b9d-e060-4e24-832e-c6d1cd7e3afe   8Gi        RWO            Delete           Bound    default/data-wordpress-15-1661426949-mariadb-0   nutanix-volume            27m

The problem, though, is that the containers can't mount the volume itself (open-iscsi is installed). We tried several different applications and they all have the same issue.

k get pods | grep wordpress
wordpress-15-1661427441-7574bc4579-q4jlz   0/1     ContainerCreating   0          21m
wordpress-15-1661427441-mariadb-0          0/1     ContainerCreating   0          21m
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  21m                 default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  21m                 default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled         21m                 default-scheduler  Successfully assigned default/wordpress-15-1661427441-7574bc4579-q4jlz to ntnx-csiw1
  Warning  FailedMount       20m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc0001225b8] State:ERROR}
  Warning  FailedMount       19m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc0001220d8] State:ERROR}
  Warning  FailedMount       18m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc000123338] State:ERROR}
  Warning  FailedMount       17m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc00000efa8] State:ERROR}
  Warning  FailedMount       16m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc000123260] State:ERROR}
  Warning  FailedMount       15m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc0005184f8] State:ERROR}
  Warning  FailedMount       13m                 kubelet            MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc000518798] State:ERROR}
  Warning  FailedMount       8m1s (x2 over 17m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[wordpress-data], unattached volumes=[kube-api-access-fxs6w wordpress-data]: timed out waiting for the condition
  Warning  FailedMount       76s (x7 over 19m)   kubelet            Unable to attach or mount volumes: unmounted volumes=[wordpress-data], unattached volumes=[wordpress-data kube-api-access-fxs6w]: timed out waiting for the condition
  Warning  FailedMount       60s (x5 over 12m)   kubelet            (combined from similar events): MountVolume.SetUp failed for volume "pvc-a1f77322-153f-422d-8a15-e94f230bd461" : rpc error: code = Internal desc = Operation timed out: Failed to update VG: [PUT /volume_groups/{uuid}][400] PutVolumeGroupsUUID default  &{APIVersion:3.1 Code:400 Kind: MessageList:[0xc000518df8] State:ERROR}

We are using k8s v1.23.8:

k get nodes
NAME         STATUS   ROLES               AGE   VERSION
ntnx-csim1   Ready    controlplane,etcd   57m   v1.23.8
ntnx-csiw1   Ready    worker              54m   v1.23.8
ntnx-csiw2   Ready    worker              55m   v1.23.8

Any idea why this is happening? This is a major blocker for one of our customers at the moment.

Thanks, Stefan

tuxtof commented 2 years ago

Hi @hornet83

A few questions:

hornet83 commented 2 years ago

Hi @tuxtof ,

many thanks for your quick help! Starting the open-iscsi service fixed the issue.

Is this a new requirement for that version of the CSI?

Our customer only hit this issue after upgrading from an older version on the same VMs (Ubuntu 20.04). I think the previous version was 2.3, where the snapshot part was still in the same Helm chart, and that worked fine with the same VMs.

Thanks!

tuxtof commented 2 years ago

No, iSCSI was always a prerequisite when using block persistent volumes.
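For anyone hitting the same FailedMount events: having the open-iscsi package installed is not enough, the initiator service also has to be running on every worker node. Below is a minimal sketch of a per-node check. It assumes a systemd-based node where the package provides a unit named `iscsid` (as on Ubuntu 20.04); the unit name may differ on other distributions, and the helper function `iscsid_advice` is hypothetical, not part of the CSI driver or the Helm chart.

```shell
#!/bin/sh
# Sketch only: check the iSCSI initiator daemon on a worker node.
# Assumption: systemd-based node with a unit named "iscsid".

# Hypothetical helper: map the output of `systemctl is-active iscsid`
# to a suggested next step.
iscsid_advice() {
    case "$1" in
        active) echo "ok: iscsid is running" ;;
        *)      echo "fix: sudo systemctl enable --now iscsid" ;;
    esac
}

# Only query systemd when it is available, so the script is harmless elsewhere.
if command -v systemctl >/dev/null 2>&1; then
    state=$(systemctl is-active iscsid 2>/dev/null || true)
    iscsid_advice "$state"
fi
```

Run it on each worker node (ntnx-csiw1 and ntnx-csiw2 in this thread). `systemctl enable --now` both starts the service and keeps it enabled across reboots, which matters when the same VMs are reused after an upgrade.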