
Dell Container Storage Modules (CSM)
Apache License 2.0

[FEATURE]: PowerScale CSI - Allow manual update/change of export ID #1231

Open danthem opened 6 months ago

danthem commented 6 months ago

Describe the solution you'd like

I have two clusters where I have set up replication of my PVCs using SyncIQ (not using the replication module), and now I need to fail over to my secondary cluster. After updating my DNS records to point at the secondary cluster, I'm unable to start containers because the CSI driver looks for the exact same export ID on the target cluster rather than figuring it out from the path, or creating a new export and updating the setting.

The replication module was not used when setting up the environment, and we want to manage replication from a storage admin perspective using standardized SyncIQ policies anyway. Failover is handled by Superna Eyeglass, which also creates identical NFS exports on the target (but the export ID does not necessarily match). Data is successfully synced over and made writable on the target. PVCs where the NFS export ID is identical on the primary and secondary cluster work fine, but any PVC where it does not match fails and its containers cannot start.

The export ID seems to be stored in the PV's 'Volume Handle' and 'VolumeAttributes', but the value can't be edited:

# * spec.persistentvolumesource: Forbidden: spec.persistentvolumesource is immutable after creation
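To make it concrete where the export ID lives, here is a minimal sketch that splits a volume handle into its parts. It assumes the handle has the form `<volume name>=_=_=<export ID>=_=_=<access zone>[=_=_=<cluster name>]`; that layout, the `=_=_=` separator, and the sample handle are assumptions for illustration and should be checked against the handle on your own PV (`kubectl get pv <name> -o yaml`).

```python
from typing import NamedTuple

# Assumed delimiter between the fields of the volume handle.
SEPARATOR = "=_=_="


class VolumeHandle(NamedTuple):
    volume_name: str
    export_id: int
    access_zone: str
    cluster_name: str  # empty string when the handle has no cluster field


def parse_volume_handle(handle: str) -> VolumeHandle:
    """Split a handle into its fields, assuming the layout described above."""
    parts = handle.split(SEPARATOR)
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected volume handle: {handle!r}")
    cluster = parts[3] if len(parts) == 4 else ""
    return VolumeHandle(parts[0], int(parts[1]), parts[2], cluster)


# Hypothetical handle; the export ID is the second field.
sample = parse_volume_handle("k8s-0123abcd=_=_=42=_=_=System=_=_=cluster-a")
print(sample.export_id)
```

Because the export ID is embedded in the handle itself, a target cluster that assigns a different ID to the same path no longer matches what the PV records.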

I would like a way to handle this kind of scenario. All the data is there, all the NFS exports are there, and the CSI driver can communicate with the cluster, but because the export ID (a sequential value that is not modifiable on PowerScale) does not match, I cannot run my containers. It would be great if we were able to update the export ID expected by the CSI driver, either manually or by forcing some kind of NFS export refresh.
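The "figure it out from the path" idea can be sketched as follows. This is not how the driver behaves today; it only illustrates the lookup being requested. It assumes the export listing from the target cluster is already available as a list of records with an `id` and a list of `paths`, and that the handle uses the `=_=_=` separator with the export ID as its second field. The function names and sample data are hypothetical.

```python
from typing import Iterable, Mapping, Optional

# Assumed delimiter between the fields of the volume handle.
SEPARATOR = "=_=_="


def find_export_id(exports: Iterable[Mapping], path: str) -> Optional[int]:
    """Return the ID of the first export that serves `path`, or None."""
    wanted = path.rstrip("/")
    for export in exports:
        if any(p.rstrip("/") == wanted for p in export.get("paths", [])):
            return int(export["id"])
    return None


def with_export_id(handle: str, new_id: int) -> str:
    """Return a copy of `handle` whose export ID field is replaced by `new_id`."""
    parts = handle.split(SEPARATOR)
    if len(parts) < 3:
        raise ValueError(f"unexpected volume handle: {handle!r}")
    parts[1] = str(new_id)
    return SEPARATOR.join(parts)


# Hypothetical export listing from the secondary cluster.
target_exports = [
    {"id": 3, "paths": ["/ifs/data/csi/other-volume"]},
    {"id": 57, "paths": ["/ifs/data/csi/k8s-0123abcd"]},
]

new_id = find_export_id(target_exports, "/ifs/data/csi/k8s-0123abcd")
if new_id is not None:
    print(with_export_id("k8s-0123abcd=_=_=42=_=_=System=_=_=cluster-a", new_id))
```

Note that computing the new handle is the easy part; the PV field that stores it is immutable, so the result cannot simply be written back to the existing PV.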

Additional context

Note that we're only failing over the PowerScale clusters, not the K8s cluster.

coulof commented 6 months ago

This field is controlled by Kubernetes specification and is immutable. Unfortunately, this is not something we can fix.

The options we have are:

  1. Change the way we craft the volume handle so it works better with SyncIQ. However, migrating existing PVs will be a challenge.

  2. Fix the CSM replication implementation so that, when replication is in use, the volume handle knows how to deal with two exports instead of creating two PVs.