Closed druesendieb closed 1 week ago
@druesendieb: Thank you for submitting this issue!
The issue is currently awaiting triage. Please make sure you have given us as much context as possible.
If the maintainers determine this is a relevant issue, they will remove the needs-triage label and respond appropriately.
We want your feedback! If you have any questions or suggestions regarding our contributing process/workflow, please reach out to us at container.storage.modules@dell.com.
Hi @druesendieb,
To start analyzing the issue step by step, I have a couple of questions:
- Given the volumeBindingMode: Immediate, was the volume successfully created on the array?
- Can you check the export on the Pstore UI? Does the export include the IP address of the worker node where the pod is being scheduled?
Thanks.
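From the Kubernetes side, a quick sanity check could look like the following (a sketch; "volume" stands in for the affected PVC name):

```sh
# Sketch: confirm the PVC is Bound and the PV was provisioned by csi-powerstore.
# "volume" is a placeholder for the affected NFS PVC.
kubectl get pvc volume -o wide
kubectl get pv "$(kubectl get pvc volume -o jsonpath='{.spec.volumeName}')" \
  -o jsonpath='{.spec.csi.driver}{"\n"}{.spec.csi.volumeHandle}{"\n"}'
```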
Hi @adarsh-dell,
https://github.com/kubernetes-csi/external-health-monitor
Thanks
Please provide the complete driver logs and let us know the exact steps that you are following for copying the data so it will be easy for us to reproduce the issue in our lab without any conflicts in the steps.
Already on the move, will provide more details next week.
The procedure is basically a job that mounts 2 PVCs and rsyncs the data from old to new; see the repository of pv-migrate.
High level (a consolidated sketch follows the list):
- have a PVC named volume with data on the old storageclass
- create a new PVC with the powerstore sc, same size, named volume-temp
- scale down consumers of volume
- run k pv-migrate --source=volume --dest=volume-temp to copy data to a temp PVC
- delete the volume PVC
- create a new volume PVC with the powerstore storage class
- run k pv-migrate --source=volume-temp --dest=volume -d to copy data from the temp PVC to the new powerstore PVC
- scale consumers up again
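For reference, the whole sequence looks roughly like this (a sketch; the deployment name is a placeholder, the pv-migrate flags are the ones from the steps above, and the new PVC manifest is not shown):

```sh
# Stop writers to the old PVC before copying (placeholder deployment name)
kubectl scale deployment/app --replicas=0

# Old PVC -> temporary PVC on the powerstore storage class
k pv-migrate --source=volume --dest=volume-temp

# Recreate "volume" on the powerstore storage class, then copy the data back
kubectl delete pvc volume
kubectl apply -f volume-powerstore-pvc.yaml   # new PVC manifest (not shown here)
k pv-migrate --source=volume-temp --dest=volume -d

# Bring the consumers back up
kubectl scale deployment/app --replicas=1
```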
Thanks for the detailed information about the steps to reproduce the issue. Please check the NFS export whenever you get time: as per the code I shared earlier, the CSI driver tries to get the list of export IPs from the NFS export, and it seems the worker node's IP is not present on that export, which is why the driver reports these volumes as abnormal.
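For a quick cross-check from the affected worker node, something like the following could help (a sketch that assumes the array's NAS server answers mountd queries; 1.2.3.4 stands in for its IP):

```sh
# Is the NFS share for the affected PVC actually mounted on this node?
mount | grep csivol

# Export list as seen by this node; the node's IP (or a covering network)
# should appear in the access list next to the csivol export
showmount -e 1.2.3.4
```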
Thanks, Adarsh
Hi @adarsh-dell, I got access to the UI now, let's continue:
Given the volumeBindingMode: Immediate, was the volume successfully created on the array?
On the Pstore UI I can see the file system for the NFS PVC. I see no alerts since creation, so to me this looks fine. Is there anything I can check in the UI to see if this is not the case?
Can you check the export on the Pstore UI? Does the export include the IP address of the worker node where the pod is being scheduled?
Storage - File Systems - NFS Exports tab in the Pstore UI shows me the NFS export titled with the PVC name, e.g. csivol-$NAME-cf0e8041e2. The NFS Export Path (IPv4) is prefixed with the IP of the Pstore system followed by the NFS export name, e.g. 1.2.3.4:/csivol-$NAME-cf0e8041e2.
You've linked https://github.com/kubernetes-csi/external-health-monitor, where it's stated for NodeVolumeStats that a feature gate may be necessary: "This feature in Kubelet is controlled by an Alpha feature gate CSIVolumeHealth."
I will try to activate this.
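A sketch of how the gate could be checked and enabled; it has to be set on the kubelet of each node, and with RKE1 kubelet arguments are usually passed via services.kubelet.extra_args in cluster.yml (an assumption about this particular setup):

```sh
# See whether the kubelet on a node already runs with any feature gates set
pgrep -af kubelet | grep -o 'feature-gates=[^ ]*'

# With RKE1, the gate would go into cluster.yml (services.kubelet.extra_args),
# e.g. feature-gates: "CSIVolumeHealth=true", followed by `rke up` to roll it out.
```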
Additionally:
Hi @druesendieb,
As requested earlier, could you please share the driver logs (controller and node pods) with us? You mentioned that this issue occurs with all NFS-backed volumes, so I am interested to see whether the NFS export includes the worker nodes' IP addresses or not.
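Something along these lines should capture them (a sketch; the namespace and label selectors are assumptions based on a default Helm install and may need adjusting):

```sh
NS=csi-powerstore   # namespace of the Helm install (assumption)

# Driver container logs from the controller pod(s)
kubectl -n "$NS" logs -l app=powerstore-controller -c driver --tail=-1 > controller-driver.log

# Driver container logs from the node pod on the affected worker
kubectl -n "$NS" logs -l app=powerstore-node -c driver --tail=-1 > node-driver.log

# Narrow down to the failing RPC
grep NodeGetVolumeStats node-driver.log
```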
Thanks, Adarsh
/sync
link: 26117
Any update regarding sharing the logs?
@falfaroc, the ticket is being closed but feel free to re-open it with logs if the issue persists. Thanks!
Bug Description
We're currently migrating from a Unity to a PowerStore storage system, so we have a cluster with both csi-unity and csi-powerstore drivers installed. With both we use NFS and iSCSI storage classes.
After implementing the csi-powerstore driver and migrating the first PVCs, we encountered an issue with the published NFS PVC metrics: instead of fetching them, we see errors in the driver container of the node daemonset.
Metrics from powerstore-iscsi and unity-nfs work as expected.
Logs
{"level":"info","msg":"/csi.v1.Node/NodeGetVolumeStats: REQ 0018: VolumeId=66866727-cf7a-4dd8-b15f-16ad14c055a8/PS4f022082b83d/nfs, VolumePath=/var/lib/kubelet/pods/e73489b6-17cf-4e78-bcde-59430aa8baea/volumes/kubernetes.io~csi/csivol-$NAME-cf0e8041e2/mount, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0","time":"2024-07-05T11:11:19.602727737Z"}
{"level":"info","msg":"/csi.v1.Node/NodeGetVolumeStats: REP 0018: VolumeCondition=abnormal:true message:\"host csi-node-a122dec52e994b51bb2c21ee0113800e-$IP is not attached to NFS export for filesystem 66866727-cf7a-4dd8-b15f-16ad14c055a8\" , XXX_NoUnkeyedLiteral={}, XXX_sizecache=0","time":"2024-07-05T11:11:19.615126687Z"}
Screenshots
No response
Additional Environment Information
k8s 1.24
Steps to Reproduce
Configure csi-powerstore with Volume Health Monitoring enabled https://dell.github.io/csm-docs/v3/csidriver/installation/helm/powerstore/#volume-health-monitoring
Use NFS storageclass:
Create 1 PVC with the NFS storage class
Create a Pod consuming this PVC (a consolidated sketch follows these steps)
See errors in csi-powerstore-node driver container when fetching metrics
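A minimal reproduction sketch under stated assumptions: the StorageClass parameters follow the csi-powerstore sample (provisioner csi-powerstore.dellemc.com, fstype nfs), while the NAS server name, size and image are placeholders:

```sh
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerstore-nfs
provisioner: csi-powerstore.dellemc.com
volumeBindingMode: Immediate
parameters:
  csi.storage.k8s.io/fstype: "nfs"
  nasName: "nas-server"            # NAS server on the array (placeholder)
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-health-test
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: powerstore-nfs
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-health-test
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-health-test
EOF
```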
Expected Behavior
No errors when running NodeGetVolumeStats. Kubernetes should present volume metrics from NFS volumes.
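One way to confirm whether the kubelet actually surfaces them is the standard kubelet_volume_stats_* series (a sketch; node and PVC names are placeholders):

```sh
# Ask the kubelet of the node running the pod for its volume stats metrics
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" \
  | grep 'kubelet_volume_stats.*persistentvolumeclaim="volume"'
```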
CSM Driver(s)
csi-powerstore: 2.10.0
Installation Type
Helm
Container Storage Modules Enabled
No response
Container Orchestrator
RKE1
Operating System
Ubuntu 18.04