andyzhangx closed this issue 3 years ago
@msau42 This might be the thing we talked about offline.
@andyzhangx, we are facing this issue https://github.com/kubernetes/kubernetes/issues/97031 when running subpath tests on Azure File. This feature might extend the problem from subpath volumes to volumes in general, i.e. if a user mounts a specific folder and that folder somehow gets deleted in the Azure console, the pod could get stuck in Terminating... We should investigate whether that's going to be a problem when testing this.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Actually this is already supported in both PV/PVC and inline volume configs in the latest version:
csi:
  driver: file.csi.azure.com
  readOnly: false
  volumeHandle: unique-volumeid  # make sure it's a unique id in the cluster
  volumeAttributes:
    shareName: sharename/dirname
  nodeStageSecretRef:
    name: secret
    namespace: default
And for an inline volume:
volumes:
  - name: persistent-storage
    csi:
      driver: file.csi.azure.com
      volumeAttributes:
        shareName: sharename/dirname
        secretName: secret
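For context, a minimal sketch of how that csi block sits inside a full static PersistentVolume; the PV name, capacity, share and secret names here are placeholders, not taken from this thread:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-azurefile-dir  # placeholder name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: file.csi.azure.com
    readOnly: false
    volumeHandle: unique-volumeid  # must be unique in the cluster
    volumeAttributes:
      shareName: sharename/dirname  # file share plus subdirectory
    nodeStageSecretRef:
      name: secret  # secret holding azurestorageaccountname/azurestorageaccountkey
      namespace: default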
Is it possible to also use an existing NFS file share with CSI? And which values are valid for the volumeHandle parameter? The ID of the NFS share, for example? I use the following on my AKS v1.22.4 configuration:
pvc-azurefile-nfs.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-azurefile
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Ti
And for the PV: pv-azurefile-nfs.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-azurefile
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain  # if set to "Delete", the file share would be removed on PVC deletion
  csi:
    driver: file.csi.azure.com
    readOnly: false
    volumeHandle: /subscriptions/f***********/resourceGroups/***********/providers/Microsoft.Storage/storageAccounts/***********/fileServices/default/fileshares/nfs
    volumeAttributes:
      resourceGroup: ***********
      storageAccount: ***********
      shareName: ***********
      protocol: ***********
The resources are also in a good state:
$ ./kubectl get pvc --kubeconfig kubeconfig
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pv-azurefile Bound pv-azurefile 1Ti RWX 18s
$ ./kubectl get pv --kubeconfig kubeconfig
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-azurefile 1Ti RWX Retain Bound default/pv-azurefile 17s
And if I create a pod I can see the share and its contents, but every pod shows the following warning:
Warning FailedMount 10m (x5 over 10m) kubelet MountVolume.MountDevice failed for volume "pv-azurefile" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name file.csi.azure.com not found in the list of registered CSI drivers
How can I avoid this warning?
@vot4anto you have not installed the Azure File CSI driver correctly; follow this guide to collect logs: https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/docs/csi-debug.md#case2-volume-mountunmount-failed
And are you using taints that prevent the Azure File driver from running on that node?
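For reference, one way to check whether the driver is registered on the node that runs the pod (a sketch; the app=csi-azurefile-node label is an assumption and may differ depending on how the driver was deployed):
kubectl get csinode <node-name> -o jsonpath='{.spec.drivers[*].name}'
kubectl get po -n kube-system -l app=csi-azurefile-node -o wide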
I didn't install the Azure File CSI driver because it is already installed with v1.22.4:
./kubectl --kubeconfig kubeconfig version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-18T19:30:35Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
And I checked the storage classes:
./kubectl --kubeconfig kubeconfig get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azurefile file.csi.azure.com Delete Immediate true 12m
azurefile-csi file.csi.azure.com Delete Immediate true 12m
azurefile-csi-premium file.csi.azure.com Delete Immediate true 12m
azurefile-premium file.csi.azure.com Delete Immediate true 12m
default (default) disk.csi.azure.com Delete WaitForFirstConsumer true 12m
managed disk.csi.azure.com Delete WaitForFirstConsumer true 12m
managed-csi disk.csi.azure.com Delete WaitForFirstConsumer true 12m
managed-csi-premium disk.csi.azure.com Delete WaitForFirstConsumer true 12m
managed-premium disk.csi.azure.com Delete WaitForFirstConsumer true 12m
$ ./kubectl --kubeconfig kubeconfig get po -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
azure-ip-masq-agent-bc6mn 1/1 Running 0 12m 10.0.0.5 aks-system-12353726-vmss000000 <none> <none>
cloud-node-manager-d6p5h 1/1 Running 0 12m 10.0.0.5 aks-system-12353726-vmss000000 <none> <none>
coredns-845757d86-hvm6n 1/1 Running 0 15m 10.0.0.10 aks-system-12353726-vmss000000 <none> <none>
coredns-845757d86-w7fg7 1/1 Running 0 11m 10.0.0.26 aks-system-12353726-vmss000000 <none> <none>
coredns-autoscaler-7d56cd888-9kjkz 1/1 Running 0 15m 10.0.0.27 aks-system-12353726-vmss000000 <none> <none>
csi-azuredisk-node-rcpq2 3/3 Running 0 12m 10.0.0.5 aks-system-12353726-vmss000000 <none> <none>
csi-azurefile-node-rgx86 3/3 Running 0 12m 10.0.0.5 aks-system-12353726-vmss000000 <none> <none>
kube-proxy-9mj2k 1/1 Running 0 12m 10.0.0.5 aks-system-12353726-vmss000000 <none> <none>
metrics-server-749c96b7cc-m2l7h 1/1 Running 0 15m 10.0.0.16 aks-system-12353726-vmss000000 <none> <none>
tunnelfront-9dcc8c99d-2nqhd 1/1 Running 0 15m 10.0.0.39 aks-system-12353726-vmss000000 <none> <none>
I can't see the same resources here as in the link you sent me; maybe that's for a newer version of AKS?
BTW, I created the sc for nfs:
storageclass-azurefile-nfs.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-csi-nfs
provisioner: file.csi.azure.com
parameters:
  protocol: nfs
  skuName: Premium_LRS
  storageAccount: ******
reclaimPolicy: Retain
volumeBindingMode: Immediate
allowVolumeExpansion: true
./kubectl --kubeconfig kubeconfig get sc -o wide
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azurefile file.csi.azure.com Delete Immediate true 26m
azurefile-csi file.csi.azure.com Delete Immediate true 26m
azurefile-csi-nfs file.csi.azure.com Retain Immediate true 3s
azurefile-csi-premium file.csi.azure.com Delete Immediate true 26m
azurefile-premium file.csi.azure.com Delete Immediate true 26m
default (default) disk.csi.azure.com Delete WaitForFirstConsumer true 26m
managed disk.csi.azure.com Delete WaitForFirstConsumer true 26m
managed-csi disk.csi.azure.com Delete WaitForFirstConsumer true 26m
managed-csi-premium disk.csi.azure.com Delete WaitForFirstConsumer true 26m
managed-premium disk.csi.azure.com Delete WaitForFirstConsumer true 26m
And then I created the PV and PVC with the size of the share:
cat pv-azurefile-nfs.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-azurefile
spec:
  storageClassName: azurefile-csi-nfs
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain  # if set to "Delete", the file share would be removed on PVC deletion
  csi:
    driver: file.csi.azure.com
    readOnly: false
    # make sure this volumeid is unique in the cluster
    # `#` is not allowed in a self-defined volumeHandle
    # taken from terraform.tfstate "type": "azurerm_storage_share"
    volumeHandle: /subscriptions/**********************/
    volumeAttributes:
      resourceGroup: *********
      storageAccount: *********
      shareName: *******
      protocol: nfs
./kubectl --kubeconfig kubeconfig get pv -o wide
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE VOLUMEMODE
pv-azurefile 1Ti RWX Retain Bound default/pv-azurefile azurefile-csi-nfs 17m Filesystem
$ cat pvc-azurefile-nfs.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-azurefile
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile-csi-nfs
  resources:
    requests:
      storage: 1Ti
./kubectl --kubeconfig kubeconfig get pvc -o wide
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE VOLUMEMODE
pv-azurefile Bound pv-azurefile 1Ti RWX azurefile-csi-nfs 4s Filesystem
But when I create the pod I can see:
Normal TriggeredScaleUp 4m20s cluster-autoscaler pod triggered scale-up: [{aks-rm32-12871470-vmss 0->1 (max: 10)}]
Warning FailedMount 100s (x5 over 107s) kubelet MountVolume.MountDevice failed for volume "pv-azurefile" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name file.csi.azure.com not found in the list of registered CSI drivers
Normal Pulling 90s kubelet Pulling image "openquake/engine:nightly"
Normal Pulled 53s kubelet Successfully pulled image "openquake/engine:nightly" in 36.837485563s
Normal Created 22s kubelet Created container rome32
Normal Started 22s kubelet Started container rome32
But if I open a shell in the pod I can see the volume and can also write to it:
./kubectl --kubeconfig kubeconfig exec -it rome32-bd4f54ddd-fb9g4 -- bash
openquake@rome32-bd4f54ddd-fb9g4:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 129900528 27553624 102330520 22% /
tmpfs 65536 0 65536 0% /dev
tmpfs 66008460 0 66008460 0% /sys/fs/cgroup
********.file.core.windows.net:/********* 1073741824 0 1073741824 0% /opt/openquake
/dev/sda1 129900528 27553624 102330520 22% /etc/hosts
shm 65536 0 65536 0% /dev/shm
tmpfs 121476892 12 121476880 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 66008460 0 66008460 0% /proc/acpi
tmpfs 66008460 0 66008460 0% /proc/scsi
tmpfs 66008460 0 66008460 0% /sys/firmware
$ ls -lrt
total 0
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake$ sudo touch pippo
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake$ sudo rm pippo
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake$ sudo mkdir azurefile
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake$ cd azurefile/
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake/azurefile$ sudo touch newfile
openquake@rome32-bd4f54ddd-fb9g4:/opt/openquake/azurefile$ ls -lrt
total 0
-rw-r--r-- 1 root root 0 Jan 25 10:06 newfile
I can also see the folder and files from a VM that mounts the same NFS share.
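For reference, a VM typically mounts an Azure Files NFS share with something like the following (a sketch; the storage account, share name, and mount point are placeholders, not taken from this thread):
sudo mount -t nfs -o vers=4,minorversion=1,sec=sys <storageaccount>.file.core.windows.net:/<storageaccount>/<sharename> /mnt/<sharename>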
Is your feature request related to a problem?/Why is this needed
Describe the solution you'd like in detail
If I add the file share folder to spec.csi.volumeHandle so that it becomes volumeHandle: rgname#storageaccountname#filesharename#somefolder, I get an error like this in the pod:
subPath does work, but having this configured in the PV is more convenient. The in-tree plugin for azurefile provides this if you specify shareName: share/subfolder as well.
Describe alternatives you've considered
Additional context
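For comparison, a minimal sketch of the in-tree azureFile volume mentioned above (the azure-secret and share/subfolder names are placeholders, not taken from this issue):
volumes:
  - name: azurefile-volume
    azureFile:
      secretName: azure-secret  # secret holding the storage account name and key
      shareName: share/subfolder  # file share plus subdirectory, as supported by the in-tree plugin
      readOnly: false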