jellehelsen opened 2 years ago
What version of Kubernetes is this? I ran into this on Kubernetes 1.23, which the HPE CSI Driver doesn't support yet.
Kubernetes v1.21.4-gke.201
Ok thanks for the info. We've reproduced this in our environment and currently investigating what is going on.
We've tracked this down to "Cross-namespace owner references is disallowed by design." You should see this event being emitted in the "hpe-nfs" Namespace on your cluster:

```
5m17s  Warning  OwnerRefInvalidNamespace  persistentvolumeclaim/hpe-nfs-af414f04-ae58-4dcc-a06a-c7476bf0a6ee  ownerRef [zalando.org/v1/EphemeralVolumeClaim, namespace: hpe-nfs, name: my-claim-2, uid: fbc152e2-a93e-45eb-9d26-7382586bec28] does not exist in namespace "hpe-nfs"
```

NFS servers, backing PersistentVolumeClaims, and all the associated configuration are set up in the "hpe-nfs" Namespace by default.
That said, we have a workaround for this. In the StorageClass, you can set nfsNamespace to the Namespace where the EphemeralVolumeClaims are created. In your example above:
```yaml
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    environment: int
  name: hpe-standard
parameters:
  csi.storage.k8s.io/controller-expand-secret-name: nimble-storage
  csi.storage.k8s.io/controller-expand-secret-namespace: nimble-storage
  csi.storage.k8s.io/controller-publish-secret-name: nimble-storage
  csi.storage.k8s.io/controller-publish-secret-namespace: nimble-storage
  csi.storage.k8s.io/fstype: xfs
  csi.storage.k8s.io/node-publish-secret-name: nimble-storage
  csi.storage.k8s.io/node-publish-secret-namespace: nimble-storage
  csi.storage.k8s.io/node-stage-secret-name: nimble-storage
  csi.storage.k8s.io/node-stage-secret-namespace: nimble-storage
  csi.storage.k8s.io/provisioner-secret-name: nimble-storage
  csi.storage.k8s.io/provisioner-secret-namespace: nimble-storage
  description: Volume created by the HPE CSI Driver for Kubernetes
  destroyOnDelete: "true"
  nfsResources: "true"
  nfsNamespace: default
provisioner: csi.hpe.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
However, this might not be practical for several reasons, so we also have the ability to let users confined to namespaced objects override certain parameters. So, this StorageClass:
```yaml
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    environment: int
  name: hpe-standard
parameters:
  csi.storage.k8s.io/controller-expand-secret-name: nimble-storage
  csi.storage.k8s.io/controller-expand-secret-namespace: nimble-storage
  csi.storage.k8s.io/controller-publish-secret-name: nimble-storage
  csi.storage.k8s.io/controller-publish-secret-namespace: nimble-storage
  csi.storage.k8s.io/fstype: xfs
  csi.storage.k8s.io/node-publish-secret-name: nimble-storage
  csi.storage.k8s.io/node-publish-secret-namespace: nimble-storage
  csi.storage.k8s.io/node-stage-secret-name: nimble-storage
  csi.storage.k8s.io/node-stage-secret-namespace: nimble-storage
  csi.storage.k8s.io/provisioner-secret-name: nimble-storage
  csi.storage.k8s.io/provisioner-secret-namespace: nimble-storage
  description: Volume created by the HPE CSI Driver for Kubernetes
  destroyOnDelete: "true"
  nfsResources: "true"
  allowOverrides: nfsNamespace
provisioner: csi.hpe.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
would allow users to create PersistentVolumeClaims with the following annotation:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  annotations:
    csi.hpe.com/nfsNamespace: this-is-my-namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```
So, the most practical way to go about this would be to change the "EVC" Operator to add the annotation carrying the calling Namespace to the underlying "PVC".
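As a sketch of what that operator change could look like, assuming the operator builds the PVC as a plain object before submitting it (the types and function name below are illustrative stand-ins for their client-go counterparts, not the driver's or operator's actual API):

```go
package main

import "fmt"

// ObjectMeta and PersistentVolumeClaim are pared-down, dependency-free
// stand-ins for metav1.ObjectMeta and corev1.PersistentVolumeClaim,
// used here purely for illustration.
type ObjectMeta struct {
	Name        string
	Namespace   string
	Annotations map[string]string
}

type PersistentVolumeClaim struct {
	Metadata ObjectMeta
}

// annotateNFSNamespace adds the csi.hpe.com/nfsNamespace annotation so
// that the HPE CSI Driver provisions the backing NFS resources in the
// claim's own Namespace rather than the default "hpe-nfs".
func annotateNFSNamespace(pvc *PersistentVolumeClaim) {
	if pvc.Metadata.Annotations == nil {
		pvc.Metadata.Annotations = map[string]string{}
	}
	pvc.Metadata.Annotations["csi.hpe.com/nfsNamespace"] = pvc.Metadata.Namespace
}

func main() {
	pvc := &PersistentVolumeClaim{
		Metadata: ObjectMeta{Name: "my-pvc", Namespace: "this-is-my-namespace"},
	}
	annotateNFSNamespace(pvc)
	fmt.Println(pvc.Metadata.Annotations["csi.hpe.com/nfsNamespace"]) // prints this-is-my-namespace
}
```

Note this only works when the StorageClass lists nfsNamespace in allowOverrides, as in the second manifest above.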
You can learn more about allowOverrides and allowMutations supported by the HPE CSI Driver on SCOD.
Thanks for that quick workaround @datamattsson !
Since this only happens when there is an owner reference set on the original PVC, I can only conclude that the NFS PVC gets the owner reference copied from the original PVC. IMHO, the owner reference of the NFS PVC should either point to the original PVC and always be present, or never be present at all. And if an owner reference is used, the NFS resources should be created in the original namespace by default.
The operator in the case above is obviously a stub I used to reproduce the issue, since I can't share the operator with which I originally encountered the issue. In this case we might be able to get the developer of the operator to customize the templates used by the operator, but this will add extra maintenance. This may not be possible with every operator we use.
Thanks for verifying. There are multiple ways this can play out. The origin PVC's ownerReference should not be copied to the NFS PVC in the first place; the fact that we're using a different PVC to serve the origin PVC is an implementation detail that shouldn't be anyone's concern. I'm not too familiar with the implementation details, but I'm assuming we're simply copying the origin PVC as-is and injecting a few parameters; it seems we need to truncate ownerReferences in that transformation.
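To make the suggested fix concrete, here is a minimal sketch of that transformation, assuming the driver clones the origin PVC object and rewrites a few fields (the types, the function name, and the "hpe-nfs-" prefix are illustrative assumptions, not the driver's actual code):

```go
package main

import "fmt"

// OwnerReference, ObjectMeta, and PersistentVolumeClaim are pared-down,
// dependency-free stand-ins for their metav1/corev1 counterparts,
// used here purely for illustration.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
}

type ObjectMeta struct {
	Name            string
	Namespace       string
	OwnerReferences []OwnerReference
}

type PersistentVolumeClaim struct {
	Metadata ObjectMeta
}

// cloneForNFS copies the origin PVC to serve as the backing NFS PVC and
// truncates OwnerReferences, so a cross-namespace (and therefore invalid)
// ownerRef is never carried over into the NFS Namespace.
func cloneForNFS(origin PersistentVolumeClaim, nfsNamespace string) PersistentVolumeClaim {
	nfs := origin
	nfs.Metadata.Name = "hpe-nfs-" + origin.Metadata.Name
	nfs.Metadata.Namespace = nfsNamespace
	nfs.Metadata.OwnerReferences = nil // drop the origin's ownerRef
	return nfs
}

func main() {
	origin := PersistentVolumeClaim{
		Metadata: ObjectMeta{
			Name:      "my-claim-2",
			Namespace: "default",
			OwnerReferences: []OwnerReference{
				{APIVersion: "zalando.org/v1", Kind: "EphemeralVolumeClaim", Name: "my-claim-2"},
			},
		},
	}
	nfs := cloneForNFS(origin, "hpe-nfs")
	fmt.Println(len(nfs.Metadata.OwnerReferences)) // prints 0
}
```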
There's also the behavior of deploying the NFS resources automatically in the origin Namespace when an ownerReference is encountered. This might be implemented as a behavior a Kubernetes admin would want to control per StorageClass, so we have to wait and see what Eng/PM deem the best way forward.
When a PVC is created by an operator (and thus has a metadata.ownerReferences field) with a StorageClass that has nfsResources set to "true", provisioning of the NFS volume fails. This leaves the original PVC stuck in the Pending state.
This only happens when nfsResources is set to "true".
StorageClass:
PersistentVolumeClaim:
csi-provisioner.log csi-driver.log