dell / csm

Dell Container Storage Modules (CSM)
Apache License 2.0

[BUG]: Volume detachment error on pod restart #541

Closed thecloudgarage closed 1 year ago

thecloudgarage commented 2 years ago

Bug Description

I copied the values.yaml file to my-powerstore-settings.yaml.

SCENARIO-1: FAILED SCENARIO

Changed the volume name prefix setting.
Original: volumeNamePrefix: csivol
Changed to: volumeNamePrefix: eksa1-vol

Result: The PVC is created successfully and can be seen in the PowerStore console. The MySQL pod is also created successfully. However, if I manually delete the MySQL pod, the recreated MySQL pod is stuck in ContainerCreating, and describe pod shows a multi-attach volume error f

I repeated this almost 10 times, manually deleting the PVC/PV and everything else in between, but the failure recurs exactly as described above.

Logs in the attacher container of the PowerStore controller clearly indicate that the volume is still bound to the previous MySQL pod ID and has not been detached (attach/detach error).

SCENARIO-2: WORKS SEAMLESSLY

I had read about an issue with prefix naming on unity-xt, so I decided to try a different prefix.

The only change was the volume prefix in my-powerstore-settings.yaml, which I changed from eksa1 to c4-eksa1.

With that change, all of the steps above (creating the PVC, creating the MySQL pod, deleting the MySQL pod) worked, and the PV/PVC was reattached successfully.

I wonder what is going on with the volume name prefix.

thanks,

Ambar.

Logs

I1111 08:30:14.053152 1 csi_handler.go:231] Error processing "csi-c0a4fdcf4b94db84041233d64f8a65204de5c1845ee126e2bb353979006a94a6": failed to detach: rpc error: code = NotFound desc = host with k8s node ID 'eksa-node-3abbdb2dc3a243888043c617e1a85560-172.24.167.134' not found
I1111 08:30:14.063103 1 connection.go:186] GRPC response: {}
I1111 08:30:14.063130 1 connection.go:187] GRPC error: rpc error: code = NotFound desc = host with k8s node ID 'eksa-node-3abbdb2dc3a243888043c617e1a85560-172.24.167.134' not found
I1111 08:30:14.063142 1 csi_handler.go:604] Saving detach error to "csi-b852d5ca462e8f1c93342093b30f39bce79cca7e0727fdfc2c8ea142d56e8a9f"
I1111 08:30:14.069054 1 controller.go:165] Ignoring VolumeAttachment "csi-b852d5ca462e8f1c93342093b30f39bce79cca7e0727fdfc2c8ea142d56e8a9f" change
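
A minimal sketch for pulling the failing node ID out of attacher output like the above, even when several klog entries end up joined on one line. The helper names (`split_entries`, `missing_node_ids`) are hypothetical and not part of any CSM tooling; the header pattern assumes the standard klog prefix (severity and date, time with microseconds, thread ID, `file:line]`) seen in these entries.

```python
import re

# Two entries from the attacher log in this issue, joined on one line.
RAW = (
    'I1111 08:30:14.053152 1 csi_handler.go:231] Error processing '
    '"csi-c0a4fdcf4b94db84041233d64f8a65204de5c1845ee126e2bb353979006a94a6": '
    "failed to detach: rpc error: code = NotFound desc = host with k8s node ID "
    "'eksa-node-3abbdb2dc3a243888043c617e1a85560-172.24.167.134' not found "
    "I1111 08:30:14.063103 1 connection.go:186] GRPC response: {}"
)

# Zero-width lookahead for a klog header, so splitting keeps the header
# attached to the entry that follows it.
KLOG_HEADER = re.compile(
    r"(?=[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d{6} +\d+ [\w.]+:\d+\])"
)

# The NotFound message returned for the detach call.
NODE_NOT_FOUND = re.compile(r"host with k8s node ID '([^']+)' not found")


def split_entries(raw):
    """Split a run of concatenated klog entries into one string per entry."""
    return [part.strip() for part in KLOG_HEADER.split(raw) if part.strip()]


def missing_node_ids(entries):
    """Collect every node ID reported as 'not found' in the given entries."""
    found = set()
    for entry in entries:
        found.update(NODE_NOT_FOUND.findall(entry))
    return found


if __name__ == "__main__":
    entries = split_entries(RAW)
    for entry in entries:
        print(entry)
    print(missing_node_ids(entries))
```

Running the same functions over the full attacher log would show whether every failed detach points at the same host name.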

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

Change the default volume name prefix in settings to eksa1 instead of csivol

Expected Behavior

Persistence should not be failing on pod recreation

CSM Driver(s)

CSI driver for PowerStore v2.2.0

Installation Type

No response

Container Storage Modules Enabled

No response

Container Orchestrator

k8s 1.22

Operating System

ubuntu

csmbot commented 2 years ago

@thecloudgarage: Thank you for submitting this issue!

The issue is currently awaiting triage. Please make sure you have given us as much context as possible.

If the maintainers determine this is a relevant issue, they will remove the needs-triage label and assign an appropriate priority label.


We want your feedback! If you have any questions or suggestions regarding our contributing process/workflow, please reach out to us at karavi@dell.com.

spriya-m commented 2 years ago

The issue is being triaged. We will update the issue soon.

AkshaySainiDell commented 2 years ago

@thecloudgarage, we installed csi-powerstore v2.2.0 on a k8s 1.22 cluster but could not reproduce the issue.

  1. We modified volumeNamePrefix to "eksa1-vol" before driver installation
  2. Created a pvc -> waited for pv to be created and pvc to be bound
  3. Created a deployment with above pvc -> pod went into running state
  4. Manually deleted the pod -> a new pod was created and successfully went into running state

For further investigation, could you please share the logs of the driver, attacher, and provisioner containers?

We noticed that the node name in the shared logs begins with "eksa-node". Could you please confirm whether nodeNamePrefix was also changed during driver installation?

I1111 08:30:14.053152 1 csi_handler.go:231] Error processing "csi-c0a4fdcf4b94db84041233d64f8a65204de5c1845ee126e2bb353979006a94a6": failed to detach: rpc error: code = NotFound desc = host with k8s node ID 'eksa-node-3abbdb2dc3a243888043c617e1a85560-172.24.167.134' not found
I1111 08:30:14.063103 1 connection.go:186] GRPC response: {}
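
To make the nodeNamePrefix question concrete, here is a rough sketch that splits the node ID from the log into its parts. It assumes the ID has the shape `<prefix>-<32 hex characters>-<IPv4 address>`, which is only inferred from the single ID in this log; `parse_node_id` is a hypothetical helper, and `csi-node` is assumed to be the chart's default nodeNamePrefix.

```python
import ipaddress
import re

# Assumed layout, inferred from the one ID in the log: prefix, 32 hex chars, IPv4.
NODE_ID = re.compile(
    r"^(?P<prefix>.+)-(?P<uid>[0-9a-f]{32})-(?P<ip>\d{1,3}(?:\.\d{1,3}){3})$"
)

ASSUMED_DEFAULT_PREFIX = "csi-node"  # assumption, not confirmed in this thread


def parse_node_id(node_id):
    """Return (prefix, uid, ip) or raise ValueError if the shape differs."""
    match = NODE_ID.match(node_id)
    if match is None:
        raise ValueError(f"unexpected node ID format: {node_id!r}")
    # Raises ValueError (AddressValueError) if an octet is out of range.
    ipaddress.IPv4Address(match["ip"])
    return match["prefix"], match["uid"], match["ip"]


if __name__ == "__main__":
    prefix, uid, ip = parse_node_id(
        "eksa-node-3abbdb2dc3a243888043c617e1a85560-172.24.167.134"
    )
    print("prefix:", prefix)
    print("uid:", uid)
    print("ip:", ip)
    print("prefix changed from assumed default:", prefix != ASSUMED_DEFAULT_PREFIX)
```

If the prefix printed here differs from the one in the installed settings file, that would suggest the host was registered on the array under a different name than the driver is now looking up.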

spriya-m commented 1 year ago

@thecloudgarage, could you please provide the information requested in the previous comment?

AkshaySainiDell commented 1 year ago

@thecloudgarage, we could not reproduce the issue with the steps provided. Could you please provide the information requested above?

AkshaySainiDell commented 1 year ago

Connected with @thecloudgarage; he needs some time to reproduce the issue in his setup and gather the relevant information. We are waiting for him to share it.

AkshaySainiDell commented 1 year ago

Closing this issue since it has not been updated in some time. Please feel free to reopen the issue if need be. Thanks!