Closed csb1582 closed 5 years ago
Hi,
I did some quick research, and the error message suggests there is a lingering SMB connection on the Windows nodes. Since the same connections are being attempted again, they fail with that error.
Would it be possible to get the output of "net use" from the windows nodes?
If that is the case, clearing those stale mappings should allow the shares to be mounted again.
I will look into it in the meantime. Thanks for reporting!
net use shows no connections
```
PS C:\Windows\system32> net use
New connections will not be remembered.

There are no entries in the list.
```
What doesn't make sense is that I can manually mount the share with net use using the Kubernetes secret credentials; when kubelet tries, it fails.
I tried running kubelet with --enable-controller-attach-detach=false, but it had no effect.
what version of the plugins are you running?
latest release. tested on latest master as well
Can you double-check that you already have this PR? https://github.com/microsoft/K8s-Storage-Plugins/commit/cedefc61141dc5941afb7f7a344d9e8059385458#diff-5ea950c7d8402ddc4290315af36bbb08
yes, the PR has been applied to smb.ps1
Update: I ran Get-SmbGlobalMapping:

```
PS C:\Windows\system32> Get-SmbGlobalMapping

Status Local Path Remote Path
------ ---------- -----------
OK                \\1.2.3.4\kubevols\test
OK                \\FQDN_OF_SERVER\kubevols\test
OK                \\NETBIOS_OF_SERVER\kubevols
```
Then I ran:

```
Get-SmbGlobalMapping | Remove-SmbGlobalMapping
```

and the volume mapped correctly.
Current output of Get-SmbGlobalMapping:

```
Status Local Path Remote Path
------ ---------- -----------
OK                \\NETBIOS_OF_SERVER\kubevols\test
OK                \\1.2.3.4\kubevols\test
```
Notice that the third mapping, which pointed at the top level of the share, is gone. Could that have been the cause?
I am glad you managed to recover your volumes.
I am assuming that was the reason. However, I will follow up and see if there is anything else we can add to prevent this from happening in the future.
I think I figured out why this happened. It seems the error occurs when volumes are configured like this:
```
volume 1 - \\server\share        <- BAD
volume 2 - \\server\share\dir1
volume 3 - \\server\share\dir2
```
Configured this way, it works:
```
volume 1 - \\server\share\dir1
volume 2 - \\server\share\dir2
volume 3 - \\server\share\dir3
```
This is probably more of an issue with how SMB authentication works: Windows refuses a second connection to the same server or share under a different user name, which is exactly the error kubelet was hitting.
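To make the bad pattern concrete, here is a small illustrative check (Python, not part of the plugin or kubelet; `find_overlapping_mappings` is a hypothetical helper) that flags configurations where one mapped UNC path is nested under another:

```python
def find_overlapping_mappings(remote_paths):
    """Return (parent, child) pairs where one UNC mapping is nested under
    another, e.g. \\\\server\\share and \\\\server\\share\\dir1 -- the
    pattern that triggered the 'multiple connections' error in this thread."""
    # Normalize: drop trailing backslashes, compare case-insensitively
    # (UNC paths are case-insensitive on Windows).
    norm = {p: p.rstrip("\\").lower() for p in remote_paths}
    return [
        (p, q)
        for p in remote_paths
        for q in remote_paths
        # Appending '\\' avoids false positives like \\server\share vs \\server\shared
        if p != q and norm[q].startswith(norm[p] + "\\")
    ]

# The BAD layout from above: the share root is mapped alongside subdirectories.
bad = [r"\\server\share", r"\\server\share\dir1", r"\\server\share\dir2"]
# The GOOD layout: only sibling subdirectories are mapped.
good = [r"\\server\share\dir1", r"\\server\share\dir2", r"\\server\share\dir3"]

print(find_overlapping_mappings(bad))   # two overlapping pairs
print(find_overlapping_mappings(good))  # []
```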
In my case, a volume was originally configured at the top level of the share and then deleted. Somehow the SMB mapping stuck around, and then all volumes failed to mount. After running Get-SmbGlobalMapping | Remove-SmbGlobalMapping, everything worked again.
I'm not sure why the original mapping stuck; it may be an unrelated issue. It might be a good idea to remove the SMB mappings in a shutdown script, but that's outside the scope of this ticket.
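The shutdown-script idea could look something like this sketch (purely illustrative, not an official fix: Python shelling out to the same PowerShell pipeline used above; `clear_smb_global_mappings` is a hypothetical helper, and it must run elevated on the Windows node):

```python
import subprocess

# PowerShell pipeline from this thread: drop every SMB global mapping so
# kubelet can re-create them with the volume's credentials on the next mount.
# -Force suppresses the confirmation prompt.
CLEANUP = "Get-SmbGlobalMapping | Remove-SmbGlobalMapping -Force"

def cleanup_args():
    # Arguments built as a list, so no extra shell quoting is needed.
    return ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", CLEANUP]

def clear_smb_global_mappings():
    # Windows-only; raises CalledProcessError if PowerShell reports failure.
    subprocess.run(cleanup_args(), check=True)
```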
Thanks for the help and the quick response. Hope this helps someone.
Server 2019, Kubernetes 1.14.3. After applying Windows updates and rebooting the Windows nodes, the SMB volumes will not remount.
kubectl describe pod shows:

```
MountVolume.SetUp failed for volume "xyz" : mount command failed, status: Failure, reason: Caught exception Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again. with stack
```
kubelet log:

```
E0617 16:59:58.368344 4408 driver-call.go:274] mount command failed, status: Failure, reason: Caught exception Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again. with stack
E0617 16:59:58.398345 4408 nestedpendingoperations.go:267] Operation for "\"flexvolume-microsoft.com/smb.cmd/784c0a82-9142-11e9-8e91-0050569e2770-data\" (\"784c0a82-9142-11e9-8e91-0050569e2770\")" failed. No retries permitted until 2019-06-17 16:59:58.8983454 -0400 EDT m=+183.786315501 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"data\" (UniqueName: \"flexvolume-microsoft.com/smb.cmd/784c0a82-9142-11e9-8e91-0050569e2770-data\") pod \"xyz-1560805020-bbvzg\" (UID: \"784c0a82-9142-11e9-8e91-0050569e2770\") : mount command failed, status: Failure, reason: Caught exception Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again. with stack "
```
The SMB server is an EMC SAN array configured as an SMB NAS server, joined to the AD domain. All was working well prior to the host reboot. No errors seen on the EMC end.
Troubleshooting steps taken:

- Reverted Windows updates
- Reverted Kubernetes version: 1.14.2 -> 1.14.1 -> back to 1.14.3
- Deleted all files/folders under \var\lib\kubelet\pods\pods
- Changed user
- Deleted / recreated the deployment
- Tested mounting the share as a drive with the SMB secret creds: this worked from Windows Explorer, but the same share with the same creds fails through the plugin
- Tested using latest master and release versions