Closed Novak-Peter closed 5 years ago
Another suspicion: we use the tasks in parallel copy mode. Is it possible that, in one thread, the Remove-PSDrive -Name WFCPSDrive call in the folder-checking code somehow conflicts with another thread in the task, since all drives are attached with the same name (WFCPSDrive)?
Update: it is failing with the same error with Copy Parallel: false, having multiple target servers, on the very first server...
@NPeete Thanks for reporting the issue. We are looking into the issue, will update you soon.
+1 I am experiencing the same problem. There is no apparent reason for the failure, but I had the same suspicion as NPeete. I can confirm that setting parallel copy to off does not fix it. After a redeploy or two, the release succeeds.
I am also having the same issue by using the copy files from Task.
I am also seeing the same error:
Copy started for - IP.IP.IP.IP Failed to Create PSDrive with Destination: '\\IP.IP.IP.IP\C$\UAT_Drop\api', ErrorMessage: 'The network path was not found' The network path was not found
@NPeete What is the powershell version on the agent box?
According to the Release Agents / Capabilities tab, the Powershell version is 5.1.14409.1012
My team has the same issue. I tried adding a PowerShell step before the file copy to just list out the results of 'Get-PSDrive' but it doesn't show anything unusual:
2018-07-17T23:13:08.7037788Z Name Used (GB) Free (GB) Provider Root CurrentLocation
2018-07-17T23:13:08.7045006Z ---- --------- --------- -------- ---- ---------------
2018-07-17T23:13:08.7117423Z A FileSystem A:\
2018-07-17T23:13:08.7134095Z Alias Alias
2018-07-17T23:13:08.7531918Z C 592.85 430.15 FileSystem C:\ BuildAgent_work\r103\a
2018-07-17T23:13:08.7541380Z Cert Certificate \
2018-07-17T23:13:08.7831237Z D 1.31 14.69 FileSystem D:\
2018-07-17T23:13:08.7841311Z Env Environment
2018-07-17T23:13:08.7851526Z Function Function
2018-07-17T23:13:08.7861713Z HKCU Registry HKEY_CURRENT_USER
2018-07-17T23:13:08.7871375Z HKLM Registry HKEY_LOCAL_MACHINE
2018-07-17T23:13:08.7881077Z Variable Variable
2018-07-17T23:13:08.7890868Z WSMan WSMan
And then the File Copy step fails with: Failed to Create PSDrive with Destination: '\\[MachineName]\f$\deploy\10.0.18194.007\Deployment\Tools', ErrorMessage: 'Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again'
We have recognized the issue. Will be deploying a fix soon.
Hi @NPeete , can you try to run the 'net use' on the agent machine and see if you can see any shares connected to your target machines ? Can you try removing the share and then use the task. Also, can you share the $PSVersionTable info of the agent machine and the OS info and can you tell if any one of the following situations is true for your case:
You have 2 instances of the task (either as part of the same build/release flow or different build/release ) running on the same agent machine and trying to copy files to the same target but with different credentials.
Did you recently change the username being used for copying ?
Unfortunately we don't have access to release agents (hence I can't show you the full $PSVersionTable info), but when I originally reported the issue we were validating with ops both the net use and Get-PSDrive commands, and none of them were showing extra records.
We use the same agents for both QA and PROD deployments with different credentials - QA and PROD target machines are always different. We always used the same credentials for QA (hardcoded TFS deploy user), and different ones for PROD (ops users, their credentials entered in release variables per release) - so it may be possible only in PROD cases that 2 instances of the task were running on the same agent with different credentials targeting the same machines BUT: 1) We were double checking if anything else was running on that agent and there wasn't 2) There are some different agents which are used only for QA deployments (with always the same credentials) and the issue occurred there as well.
@NPeete , are you running the 'net use' command in a session running with the same identity with which the agent is running ? can you try creating a new release and running a powershell task on the failing agent and execute net use. You can also use the task to extract psversiontable info and systeminfo. Also, will it be possible for you to reboot the agent machine and then try executing the task ? You can share the info at RM_Customer_Queries at microsoft.com
I was able to fix my issue by just adding a 'net use /y /delete *' step before the file copy tasks. Turns out there were a ton of disconnected net use mappings from previous release attempts.
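For anyone who wants a narrower cleanup than 'net use /y /delete *', here is a minimal sketch of a PowerShell step that removes only the mappings pointing at one target server. The function name Get-MappedUncPaths and the server name are hypothetical, and the parsing assumes the remote path in the net use listing contains no spaces. This is not what the task itself does.

```powershell
# Hypothetical helper: extract UNC paths for one server from `net use` output lines.
function Get-MappedUncPaths {
    param(
        [string[]] $NetUseOutput,  # lines as printed by `net use`
        [string]   $Server         # target machine name or IP, as used in the mapping
    )
    # Matches \\<server>\<rest-of-path>; assumes the path has no spaces.
    $pattern = '\\\\' + [regex]::Escape($Server) + '\\\S+'
    foreach ($line in $NetUseOutput) {
        if ($line -match $pattern) { $Matches[0] }
    }
}

# Only attempt the cleanup on a Windows agent, where `net use` exists.
if ($env:OS -eq 'Windows_NT') {
    $server = 'TargetMachineFQDN'  # assumption: replace with your target machine
    foreach ($path in Get-MappedUncPaths -NetUseOutput (net use) -Server $server) {
        Write-Host "Removing stale mapping $path"
        net use $path /delete /y
    }
}
```

Note that this would also drop a mapping that another release on the same agent is still using for that server, so it is safest on agents that run one job at a time.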
Great find! Shouldn't this be done by the task for the specific connection being created?
@SumiranAgg Could you please let me know when the fix will be available in VSTS? It is a lot of manual work for us to add a net use command task for disposing of disconnected mappings on all release definitions before each file copy task!
Added PR #7757 for the fix.
Hi @SumiranAgg , Thanks! How could we make our existing "windows machine file copy task" in VSTS use this new robocopy script?
@FlorenceDaniel The next deployment will happen sometime next week, after which all accounts will have the updated task. If you want the task before we deploy to your account, you'll have to use tfs-cli to manually upload the task to your account. If you do plan to manually upload and need guidance with tfs-cli, you can mail us at RM_Customer_Queries [at] microsoft [dot] com.
@rajatagrawal-dev Thanks a lot for your response. Would you be able to update this thread once the deployment happens next week? Also, right now we are using version 1.* of the "windows machine file copy task". Should we update our release definitions to use version 2.* of the "windows machine file copy task" in order to have the updated task? Please let me know.
@FlorenceDaniel I'll update the thread once the task is deployed to all accounts. Yes, the fix is only in version 2 of the task. Please select version 2 to get the task with latest features and bug fixes.
Please check if you have Windows File Copy v2.1.2. The fix should be deployed.
We are able to select "2.*" as Version. Will this change on release definition tasks pull your latest update that fixes this issue?
When you queue a release, the latest available patch of the version that you have selected (2.*) will be taken. So please select 2.* in your release definition, and in the logs you will find the exact task version that is available for major version 2.
I can see we are using version 2.1.2 of this task, but we still get this error when using Copy Files in Parallel. I can't see any sessions open from the agent.
Hi @Mibe8, were you able to figure out the issue? Your response might save us a lot of time. We don't want to update the version for all copy file tasks in our release definitions and then end up with the same issue again. Please let me know.
Hi @rajatagrawal-dev , Any response to Mibe8's comment above will be appreciated.
Thanks! Florence
@Mibe8 Can you run the command suggested by @maknud (net use /y /delete *) once through a powershell task on your agent box? This should remove any stale mappings which might be interfering. Then try the file copy task which is now patched to remove the psdrive mappings after each execution.
@FlorenceDaniel Can you also try to do the same? You can try with just one of your release definitions or create a new one to mimic your scenario.
@FlorenceDaniel, no I didn't figure it out yet.
@rajatagrawal-dev, I tried the net use command. It returns: There are no entries in the list.
I'm able to reproduce it quite easily by creating a build and a new release definition. All I do is copy a zip file to a machine.
The first time it works. At the end of the log I see:
If I start a new release, I get:
Some extra info: Script stack trace:
@Mibe8 That's unexpected. I tried the same scenario that you specified but it worked for me. Can you try restarting the agent machine? Also, please share your account name, project name and the release definition name. I'll check a few things on our end as well. Please send it to raagra@microsoft.com if you don't want to post these details here.
@FlorenceDaniel Please let me know if you are facing similar issue still.
Hi,
We are not seeing the issue after upgrading to version 2.* of Windows Machine File Copy Task.
Thanks for the help!
Regards, Florence
Hi @rajatagrawal-dev - I am facing the same issue and am using version 2.* of Windows Machine File Copy. I can share the account name, project name and release definition if this helps. Please let me know if you were able to resolve this.
Thanks, Anurag
We are having the same problem on our account.
I've been troubleshooting the issue and have some information that might help. In our release we have three WindowsMachineFileCopy tasks in a row. The first two succeed and the third fails nine times out of ten. We are running our own build server with the latest version of the agent, 2.136.1. We are running Windows Machine File Copy 2.3.1.
When I log onto our build server and run a command prompt as NETWORK SERVICE (our agent is using this account) when the file copy task starts I see the share get created when I type net use. I can see the share. When the file copy task stops and I type net use I can see that it has been removed. When the next file copy task occurs and I type net use I can see it again. So it looks like it is adding and removing the share properly.
For us at least it is looking like a timing issue. Maybe the next copy task tries to create the share before the previous one has fully removed it?
My workaround for the moment is to shift the third copy task until a little bit later in the pipeline. This seems to have resolved the issue for me.
Hope this info helps.
Further update. I put some sleeps into my process with mixed success. It seems like this is an intermittent issue for me at least. Sometimes it works, sometimes it doesn't but the failure rate is fairly high.
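If the timing theory above holds, a retry loop may be a more direct probe than fixed sleeps. The following is only a sketch under that assumption: Invoke-WithRetry is a hypothetical helper, not part of the task, and it only retries errors that are terminating (hence -ErrorAction Stop on cmdlets inside the action).

```powershell
# Hypothetical helper: run a script block, retrying on terminating errors.
function Invoke-WithRetry {
    param(
        [scriptblock] $Action,
        [int] $MaxAttempts  = 3,
        [int] $DelaySeconds = 5
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return & $Action
        } catch {
            # Out of attempts: rethrow the last error so the step still fails visibly.
            if ($attempt -eq $MaxAttempts) { throw }
            Write-Host "Attempt $attempt failed: $($_.Exception.Message). Retrying in $DelaySeconds s."
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}

# Example use (commented out; path and credential are placeholders):
# Invoke-WithRetry -MaxAttempts 5 -DelaySeconds 10 -Action {
#     New-PSDrive -Name WFCPSDrive -PSProvider FileSystem -Scope Global `
#         -Root '\\TargetMachineFQDN\D$\TargetFolder' -Credential $cred -ErrorAction Stop
# }
```

If the mapping succeeds on a later attempt, that supports the idea that the previous connection had not been fully released; if every attempt fails, a stale mapping or a credential mismatch is the more likely cause.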
Hi @rajatagrawal-dev ,
We faced the issue again last week even after upgrading to version 2.* of Windows Machine File Copy Task.
Please let us know why this issue is observed. Thanks! Florence
Hi,
We are still having this issue intermittently, even after upgrading to use 2.* of the Windows Machine File Copy Task. We had the issue previously with 1.* and did notice that changing it to 2.* allowed us to complete the releases (without rebooting the agents). At the time, it was picking up version 2.1.3 of the task.
However now it's picking up 2.1.4, and we are having the issue again. Has something regressed in the code? Is there another way we can get around this issue in our release definitions? I've seen posts above where people have added a preceding task for 'net use /y /delete *', is this a safe thing to do on the release agents as part of a release?
Thanks
Craig
We were also running into this issue. We have a build server that also functions as a host server for our lower environments, so the release for a few of our pipelines will be done on that server to that same server. We were getting the above failure (Multiple connections ...) when the Windows Machine File Copy task was not using the same user that the VSTS agent was running as. After updating the task to use the same user as the agent, the step succeeded.
Hi guys, thanks for reporting this. We are looking into this issue. We added Remove-PSDrive at the end of the task in order to delete the connection that we make, but I see that it does not work in some situations. We are trying to see if, in addition to Remove-PSDrive, we can delete shared paths using net use as part of the task itself.
Hi, are there any updates to this? We have this same problem and it is super annoying...
I resolved my issue by inserting a "net use" command line script before my copy task, to enumerate any connections in the log. I found that there was a connection to \
@mcollins2002 IPC$ is a built-in administrative share that cannot be modified. When you execute the command, does it actually delete this share? Or does it just reset the path associated with this location? Also, can you tell what the value of the connection to this share was? Was its value the target path that you provide as part of the task input? And can you check whether you are able to run the task successfully after executing this command only once? In other words, I want to know if in your case it is absolutely necessary to run this command every time in order to run the task successfully, or whether running it only once would suffice.
Hi guys, I have raised a PR to use net use /delete path to delete the shared path created as part of the New-PSDrive command, but I am unable to repro this issue. Can someone who is facing this issue pick up these changes and see if they resolve the problem? Here is the PR: https://github.com/Microsoft/vsts-tasks/pull/8575 . (You only need to take the changes in the .ps1 file.)
@arjgupta The cmd doesn't delete the share, it just deletes the agent's mapping to the share. It appears that after the cmd is run once I can remove that cmd from the release and it executes as it should. Perhaps something didn't get cleaned up properly during a failed release attempt.
For anyone with this issue: If you have multiple agents running, restart them (the service I mean).
We had an issue where one of the services was not letting go, once we restarted the build agents, the builds started working again.
This can happen because you may be using different agents with different service accounts. Try using the IP address; this should resolve it. Another way is to use the same service account for all the agents, although this may not be possible in all scenarios. Using the IP address to access the share should help resolve the error below. An old KB article from MS also talks about this: https://support.microsoft.com/en-in/help/938120/error-message-when-you-use-user-credentials-to-connect-to-a-network-sh
"Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again"
Restarting the Windows service that the agent runs as solved the issue for us.
Great it worked, yes this will also clear up all the connections.
Restarting the Windows service was enough to solve my problem.
This issue just started on a single Test VM: system error 1219 when copying PowerShell scripts. I added a "run" task just prior to the file copy task using the aforementioned net use /delete and that solved the issue. We do have this in a task group, so all other VMs will benefit. Thank you to whoever suggested adding net use /y /delete.
I am facing the same error, but I am not sure whether other people have the same underlying problem as I do.
Filecopy tasks perform the following important tasks.
In my case I am copying multiple files in the first file transfer task (quite a lot of DLLs)
Now the interesting part is that the Windows Defender/antimalware service ("C:\Program Files\Windows Defender\MsMpEng.exe") kicks in and starts scanning the files copied remotely, and even continues after point 3 has finished. Even though the PSDrive gets disconnected, the scanning continues over the UNC path. It would be interesting to know if this scanning is done with the credentials used for mapping the drive or with the NT system account.
The error message occurs when the next file copy task starts but the scanning from the previous one has not finished. To see whether your issue is the same as mine, I recommend using procmon (Process Monitor) and filtering by the hostname. If you see MsMpEng.exe processing files on the UNC path at the time the error occurs, the following solution will work for you.
Working solution for my scenario:
Other possible workarounds that I see currently (besides the ones shared before):
I hope this helps others as well.
We have multiple Windows Machine File Copy tasks on different Release Agents (and different target machines). The target dirs are specified as folders (like D:\TargetFolder), and the target machine lists contain FQDN machine names. Lately we randomly started to get the following error:
Failed to Create PSDrive with Destination: '\\TargetMachineFQDN\D$\TargetFolder', ErrorMessage: 'Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again'
Suspicious thing to me: https://github.com/Microsoft/vsts-tasks/blob/ea589633e6eb5c3aec562076a240dd4f2f781b0b/Tasks/WindowsMachineFileCopyV2/RoboCopyJob.ps1#L219-L225
For this there isn't a matching Remove-PSDrive command - can it be the issue that the release agents started to run out of free network shares?
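To illustrate the pairing being asked about, here is a minimal sketch of mapping a drive and always removing it afterwards, even when the copy fails. Invoke-WithPSDrive is a hypothetical wrapper written for this example; it is not the code in RoboCopyJob.ps1.

```powershell
# Hypothetical wrapper: create a PSDrive, run an action, and always remove the drive.
function Invoke-WithPSDrive {
    param(
        [string] $Name,
        [string] $Root,
        [System.Management.Automation.PSCredential] $Credential,
        [scriptblock] $Action
    )
    $driveArgs = @{
        Name        = $Name
        PSProvider  = 'FileSystem'
        Root        = $Root
        ErrorAction = 'Stop'   # turn a failed mapping into a terminating error
    }
    if ($Credential) { $driveArgs.Credential = $Credential }

    New-PSDrive @driveArgs | Out-Null
    try {
        & $Action
    } finally {
        # Runs on success and on failure, so every New-PSDrive has a matching removal.
        Remove-PSDrive -Name $Name -Force -ErrorAction SilentlyContinue
    }
}

# Example use (commented out; path and credential are placeholders):
# Invoke-WithPSDrive -Name WFCPSDrive -Root '\\TargetMachineFQDN\D$\TargetFolder' -Credential $cred -Action {
#     Copy-Item -Path .\drop\* -Destination 'WFCPSDrive:\' -Recurse
# }
```

This only guarantees that the PowerShell drive is removed. Whether the underlying network connection is released as well is a separate question, so treat this as a pattern to compare against the task script rather than a confirmed fix.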