Open kmizumar opened 1 month ago
I guess one of the challenges is that K8s itself does not have any way to enforce read-only access. Even the ReadOnlyMany (ROX) access mode of a PVC does not guarantee that.
Even if the access modes are specified as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, they don't set any constraints on the volume. For example, even if a PersistentVolume is created as ReadOnlyMany, it is no guarantee that it will be read-only. https://kubernetes.io/docs/concepts/storage/persistent-volumes/
I can't remember whether cross-namespace sharing allows you to specify a different storage class on the second PVC. If it does, you could create a storage class for that PVC that uses the mountOptions parameter to specify a read-only (ro) NFS mount. That would enforce read-only access at the mount level.
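As a rough sketch of what such a storage class could look like (not verified against cross-namespace sharing, which is the open question here): `mountOptions` is a standard StorageClass field and `csi.trident.netapp.io` is the Trident CSI provisioner, but the class name and the `backendType` value are placeholders I made up for illustration.

```yaml
# Hypothetical storage class that mounts NFS volumes read-only.
# Assumption: the second (destination) PVC is allowed to reference
# a different storage class than the source PVC.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-readonly            # placeholder name
provisioner: csi.trident.netapp.io
parameters:
  backendType: "ontap-nas"      # assumption: an NFS-based Trident backend
mountOptions:
  - ro                          # passed to the NFS mount on the node
```

Note that a class like this would only be useful for the consuming side; the PVC that populates the data would still need a writable class.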
More generally, snapshots let you preserve a specific state and return to it with little or no overhead. You can use a single snapshot to protect the desired state and revert to it within seconds if needed (reverting to a snapshot in place currently has to be done on the storage system itself; stay tuned for options to control this from K8s). You can also use snapshots as a versioning mechanism if your data set undergoes (desired) changes over time and you want to keep multiple versions of it; you can then clone from a snapshot to get a new PVC with the point-in-time data of that snapshot.
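The snapshot-then-clone flow could look roughly like the following. This is a minimal sketch that assumes the CSI snapshot CRDs and snapshot controller are installed and that a VolumeSnapshotClass already exists; every name, the namespace, the storage class, and the size are hypothetical placeholders.

```yaml
# Take a point-in-time snapshot of the PVC that holds the master data.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: model-master-v1                  # placeholder snapshot name
  namespace: ml-models                   # placeholder namespace
spec:
  volumeSnapshotClassName: csi-snapclass # assumption: an existing VolumeSnapshotClass
  source:
    persistentVolumeClaimName: model-master   # placeholder source PVC
---
# Create a new PVC from that snapshot (a clone with the snapshot's data).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-master-v1-clone            # placeholder clone name
  namespace: ml-models
spec:
  storageClassName: ontap-nas            # placeholder storage class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi                     # must be at least the snapshot's restore size
  dataSource:
    name: model-master-v1
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```

The snapshot and the PVC created from it have to live in the same namespace, so this helps with recovery and versioning rather than with restricting writes from another namespace.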
Describe the solution you'd like
The documentation clearly states that the current implementation of Astra Trident cannot prevent destination namespaces from writing to shared volumes. However, it would be appreciated if the side exposing the PVC could restrict access from the destination namespace to read-only, to prevent unintended data modifications.
Describe alternatives you've considered
Even if we ask the users of the shared volume not to write to it and make sure they are aware of this, humans are prone to mistakes. It is therefore desirable for the system itself to guarantee this.
Additional context
In machine learning workloads, it is common for multiple tasks to reference a single large model as a master copy. If this master is unintentionally corrupted, the computational cost of reconstructing it is enormous. A feature that allows it to be shared as read-only would therefore be helpful.