Azure / kubernetes-volume-drivers

Kubernetes volume drivers for Azure
MIT License

df -h shows no mount for blobfuse container after kubernetes cluster upgrade to 1.19.7 in AKS #101

Open amitm20692 opened 3 years ago

amitm20692 commented 3 years ago

I recently upgraded my Kubernetes cluster to 1.19.7 and upgraded blobfuse to version 1.17 using kubectl apply -f https://raw.githubusercontent.com/Azure/kubernetes-volume-drivers/master/flexvolume/blobfuse/deployment/blobfuse-flexvol-installer-1.9.yaml

When I tried to check the pods where I am mounting it using the df -h command, I could not see the mounted directory. driver-logs.txt

andyzhangx commented 3 years ago

The last log entry shows success:

Wed Feb 17 18:50:18 UTC 2021 EXEC: mkdir -p /var/lib/kubelet/pods/20acf4dd-7d6b-440e-b822-3e192f0d375f/volumes/azure~blobfuse/my-volume
Wed Feb 17 18:50:18 UTC 2021 INF: AZURE_STORAGE_ACCESS_KEY is set 
Wed Feb 17 18:50:18 UTC 2021 INF: export storage account - export AZURE_STORAGE_ACCOUNT=mystorageaccount 
Wed Feb 17 18:50:18 UTC 2021 EXEC: blobfuse /var/lib/kubelet/pods/20acf4dd-7d6b-440e-b822-3e192f0d375f/volumes/azure~blobfuse/my-volume --container-name=mycontainer --tmp-path=/tmp/fileautomation/ -o allow_other  
Wed Feb 17 18:50:18 UTC 2021 INFO: {"status": "Success"}

Could you SSH to the agent node and then run mount | grep blobfuse to check again? Thanks.
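The node-side check suggested above can be sketched as a small helper that looks for an exact mount point in the mount table. This is only an illustration: the function name `is_mounted` and the placeholder pod path are assumptions, not part of the driver.

```shell
#!/usr/bin/env bash
# Report whether a path appears as a mount point in a mounts table.
# Usage: is_mounted <path> [mounts_file]   (mounts_file defaults to /proc/mounts)
is_mounted() {
  local target="$1" mounts_file="${2:-/proc/mounts}"
  # Field 2 of each line in /proc/mounts is the mount point.
  awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

# Placeholder path for illustration; substitute the pod volume path from the driver log.
if is_mounted "/var/lib/kubelet/pods/<pod-uid>/volumes/azure~blobfuse/my-volume"; then
  echo "mounted"
else
  echo "not mounted"
fi
```

The match is exact, so a trailing slash on the path would make the check fail even when the volume is mounted.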

amitm20692 commented 3 years ago

> could you ssh to the agent node, and then run mount | grep blobfuse to check again? thanks.

No output. This is the surprising part: the driver log shows everything succeeded, but the mount still doesn't work. I have had this issue on two different clusters. So far I have deleted the blobfuse daemonset, reinstalled it, and then scaled the pods that require the mount down and up. After a few attempts it worked on the previous cluster, but I can't use this strategy in production. syslog-blobfuse.txt

andyzhangx commented 3 years ago

All blobfuse mounts are using the same tmp-path, so there could be a conflict: --tmp-path=/tmp/fileautomation/

Could you remove the tmp-path parameter and try again? It will then use a timestamp in the tmp-path.
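If an explicit cache location is still wanted, one way to avoid the shared-path conflict is to give every mount its own directory. The sketch below is an assumption-laden illustration, not the driver's behavior: `CACHE_ROOT` and `new_cache_dir` are names invented here, and the mount command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Create a distinct cache directory per mount so two mounts never share a tmp-path.
# CACHE_ROOT is a hypothetical variable introduced for this sketch.
CACHE_ROOT="${CACHE_ROOT:-/tmp}"

new_cache_dir() {
  # mktemp -d creates a uniquely named directory and prints its path.
  mktemp -d "${CACHE_ROOT}/blobfuse-cache.XXXXXX"
}

cache_dir="$(new_cache_dir)"
# Print the mount command instead of running it, so the sketch is safe to try.
echo "blobfuse /path/to/mountpoint --container-name=mycontainer --tmp-path=${cache_dir} -o allow_other"
```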

amitm20692 commented 3 years ago

> All blobfuse mounts are using the same tmp-path, so there could be a conflict: --tmp-path=/tmp/fileautomation/
>
> Could you remove the tmp-path parameter and try again? It will then use a timestamp in the tmp-path.

If I don't use --tmp-path, the driver uses /mnt/blobfuse as the default.

andyzhangx commented 3 years ago

I have updated the doc. By default it should be /mnt/blobfuse{random-num}

amitm20692 commented 3 years ago

> I have updated the doc, by default it should be /mnt/blobfuse{random-num}

It's /mnt/blobfuse only. What output or logs would you need to verify it?

andyzhangx commented 3 years ago

Is it possible to use https://github.com/kubernetes-sigs/blob-csi-driver instead?

amitm20692 commented 3 years ago

> is it possible to use https://github.com/kubernetes-sigs/blob-csi-driver instead?

We want to, but at this point, so close to a major release, we can't switch without proper testing.

guoweis-work commented 3 years ago

I'm seeing the same thing here. I ran /usr/bin/blobfuse /tmp/abc --container-name=0000test --tmp-path=/tmp/blobfuse -o allow_other -o ro --file-cache-timeout-in-seconds=120 --use-https=true and it says everything succeeded, but I don't see it in the mount list.

andyzhangx commented 3 years ago

> I'm seeing the same thing here. I did /usr/bin/blobfuse /tmp/abc --container-name=0000test --tmp-path=/tmp/blobfuse -o allow_other -o ro --file-cache-timeout-in-seconds=120 --use-https=true and it says everything succeeded. But I don't see it in the mount list

@guoweis-outreach then it's a blobfuse mount issue. Try adding --log-level=LOG_DEBUG to get more info, and collect the /var/log/syslog log.
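A minimal sketch of pulling only the blobfuse entries out of syslog after a debug run. The helper name `blobfuse_log_lines` is an assumption for illustration, and the pattern assumes the usual `blobfuse[<pid>]:` syslog tag.

```shell
#!/usr/bin/env bash
# Print only the syslog lines emitted by blobfuse processes.
# Usage: blobfuse_log_lines [syslog_file]   (defaults to /var/log/syslog)
blobfuse_log_lines() {
  local syslog_file="${1:-/var/log/syslog}"
  # Syslog tags the process as "blobfuse[<pid>]:", so match that literal prefix.
  grep -E 'blobfuse\[[0-9]+\]:' "$syslog_file"
}

# Show the most recent entries if the default log is readable on this node.
if [ -r /var/log/syslog ]; then
  blobfuse_log_lines | tail -n 20
fi
```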

guoweis-work commented 3 years ago

My blobfuse version is blobfuse 1.0.3

/usr/bin/blobfuse /tmp/abc --container-name=0000test --tmp-path=/tmp/blobfuse1 -o allow_other -o ro --file-cache-timeout-in-seconds=120 --use-https=true --log-level=LOG_DEBUG

gives no output at all. (I can run this command multiple times without unmounting first)

Dumping /var/log/syslog, I only see these 4 lines

Apr 30 15:56:40 aks-linux-39717610-vmss000001 blobfuse[6960]: Function ensure_files_directory_exists_in_cache, in file /home/amnguye/Desktop/azure-storage-fuse/blobfuse/utilities.cpp, line 244: Making cache directory /tmp.
Apr 30 15:56:40 aks-linux-39717610-vmss000001 blobfuse[6960]: Function ensure_files_directory_exists_in_cache, in file /home/amnguye/Desktop/azure-storage-fuse/blobfuse/utilities.cpp, line 244: Making cache directory /tmp/blobfuse1.
Apr 30 15:56:40 aks-linux-39717610-vmss000001 blobfuse[6960]: Function ensure_files_directory_exists_in_cache, in file /home/amnguye/Desktop/azure-storage-fuse/blobfuse/utilities.cpp, line 244: Making cache directory /tmp/blobfuse1/root.
Apr 30 15:56:40 aks-linux-39717610-vmss000001 blobfuse[6963]: Function azs_destroy, in file /home/amnguye/Desktop/azure-storage-fuse/blobfuse/utilities.cpp, line 523: azs_destroy called.

My flex driver

begin to install blobfuse FlexVolume driver 1.0.18, target dir:/etc/kubernetes/volumeplugins ...

My k8s version

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"54684493f8139456e5d2f963b23cb5003c4d8055", GitTreeState:"clean", BuildDate:"2021-03-22T23:02:59Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

guoweis-work commented 3 years ago

Adding the -d option provided more output:

root@aks-linux-39717610-vmss000001:/home/azureuser# /usr/bin/blobfuse  /tmp/abc -d --container-name=0000test --tmp-path=/tmp/blobfuse2 -o allow_other -o ro --file-cache-timeout-in-seconds=120 --use-https=true --log-level=LOG_DEBUG
FUSE library version: 2.9.7
nullpath_ok: 0
nopath: 0
utime_omit_ok: 0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.31
flags=0x03fffffb
max_readahead=0x00020000
   INIT: 7.19
   flags=0x00000031
   max_readahead=0x00400000
   max_write=0x00400000
   max_background=128
   congestion_threshold=96
   unique: 2, success, outsize: 40
fuse: reading device: Invalid argument

andyzhangx commented 3 years ago

could be related to this issue: blobfuse mount with invalid credentials still returns succeed

Anyway, I think your blobfuse credentials are wrong.
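Because the mount command can report success while nothing is mounted, a caller can verify the result instead of trusting the exit status. The wrapper below is a sketch under assumptions: `mount_and_verify` is a name invented here, and it relies on `mountpoint` from util-linux being available on the node.

```shell
#!/usr/bin/env bash
# Run a mount command, then confirm the path really is a mount point.
# Usage: mount_and_verify <mount_point> <command> [args...]
mount_and_verify() {
  local mount_point="$1"
  shift
  # Fail early if the mount command itself reports an error.
  "$@" || { echo "mount command failed" >&2; return 1; }
  # mountpoint(1) exits 0 only if the path is a mount point.
  if ! mountpoint -q "$mount_point"; then
    echo "command reported success but $mount_point is not mounted" >&2
    return 2
  fi
}
```

A call might look like `mount_and_verify /tmp/abc /usr/bin/blobfuse /tmp/abc --container-name=0000test --tmp-path=/tmp/blobfuse -o allow_other`. A return code of 2 then points at the silent-failure case discussed in this thread.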