microsoft / azure-pipelines-agent

Azure Pipelines Agent 🚀
MIT License
1.72k stars 867 forks source link

Intermittent build failures when running multiple agents on single machine #1448

Closed AndrewGretton closed 6 years ago

AndrewGretton commented 6 years ago

Agent version and platform

Agent version 2.129.0, running on Windows Server 2016 v1607, using VSTS (the agent path says "d:\tfsagent" but we have now fully migrated to VSTS)

What's not working?

I have installed 2 agents on a single machine. I am running a CI msbuild script which performs compilation of around 50 .NET projects, using the /m flag, and then runs nunit tests. When only one agent is enabled on the machine, the build runs reliably. When the second agent is enabled, the build on both agents usually fails, if both agents are running a build at the same time.

This same msbuild job has been running under Teamcity for several years, concurrently on the same agent, with no concurrency-related failures.

The failure message is emitted into the log at different places, it's not always the same place. For example, we have observed it happening 1 minute into the build and also 9 minutes into the build. The error reported in the build results is:

2018-03-12T10:08:11.8939703Z ##[error]Exit code -1 returned from process: file name 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe', arguments '-NoLogo -Sta -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -Command ". ([scriptblock]::Create('if (!$PSHOME) { $null = Get-Item -LiteralPath ''variable:PSHOME'' } else { Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Management\Microsoft.PowerShell.Management.psd1'')) ; Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Utility\Microsoft.PowerShell.Utility.psd1'')) }')) 2>&1 | ForEach-Object { Write-Verbose $_.Exception.Message -Verbose } ; Import-Module -Name 'D:\tfsagent2\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0\ps_modules\VstsTaskSdk\VstsTaskSdk.psd1' -ArgumentList @{ NonInteractive = $true } -ErrorAction Stop ; $VerbosePreference = 'SilentlyContinue' ; $DebugPreference = 'SilentlyContinue' ; Invoke-VstsTaskScript -ScriptBlock ([scriptblock]::

Agent and Worker's diag log

I have pulled what I believe to be the relevant section from the agent's diagnostic log but I can attach the whole thing if need be. I want to note that the AD user which runs builds, is a member of the local administrators group on the build machine.

[2018-03-12 10:08:24Z INFO WorkerCommandManager] Register command extension for area codecoverage
[2018-03-12 10:08:24Z INFO WorkerCommandManager] Register command extension for area results
[2018-03-12 10:08:24Z INFO WorkerCommandManager] Register command extension for area telemetry
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Generation: '1'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Key name 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\1', value name 'Install': '1'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Key name 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\1\PowerShellEngine', value name 'PowerShellVersion': '2.0'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Unsupported version. Skipping.
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Generation: '3'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Key name 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\3', value name 'Install': '1'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Key name 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\3\PowerShellEngine', value name 'PowerShellVersion': '5.1.14393.0'
[2018-03-12 10:08:24Z INFO PowerShellExeUtil] Key name 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\3\PowerShellEngine', value name 'ApplicationBase': 'C:\Windows\System32\WindowsPowerShell\v1.0'
[2018-03-12 10:08:24Z INFO ProcessInvoker] Starting process:
[2018-03-12 10:08:24Z INFO ProcessInvoker]   File name: 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe'
[2018-03-12 10:08:24Z INFO ProcessInvoker]   Arguments: '-NoLogo -Sta -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -Command ". ([scriptblock]::Create('if (!$PSHOME) { $null = Get-Item -LiteralPath ''variable:PSHOME'' } else { Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Management\Microsoft.PowerShell.Management.psd1'')) ; Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Utility\Microsoft.PowerShell.Utility.psd1'')) }')) 2>&1 | ForEach-Object { Write-Verbose $_.Exception.Message -Verbose } ; Import-Module -Name 'D:\tfsagent\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0\ps_modules\VstsTaskSdk\VstsTaskSdk.psd1' -ArgumentList @{ NonInteractive = $true } -ErrorAction Stop ; $VerbosePreference = 'SilentlyContinue' ; $DebugPreference = 'SilentlyContinue' ; Invoke-VstsTaskScript -ScriptBlock ([scriptblock]::Create('. ''D:\tfsagent\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0\MSBuild.ps1'''))"'
[2018-03-12 10:08:24Z INFO ProcessInvoker]   Working directory: 'D:\tfsagent\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0'
[2018-03-12 10:08:24Z INFO ProcessInvoker]   Require exit code zero: 'True'
[2018-03-12 10:08:24Z INFO ProcessInvoker]   Encoding web name:  ; code page: ''
[2018-03-12 10:08:24Z INFO ProcessInvoker]   Force kill process on cancellation: 'False'
[2018-03-12 10:08:24Z INFO ProcessInvoker] Process started with process id 4992, waiting for process exit.
[2018-03-12 10:08:25Z INFO JobServerQueue] Try to upload 1 log files or attachments, success rate: 1/1.
[2018-03-12 10:08:26Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:26Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:26Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:27Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:27Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:28Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:30Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:30Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:32Z INFO JobServerQueue] Try to append 1 batches web console lines, success rate: 1/1.
[2018-03-12 10:08:51Z INFO ProcessInvoker] Scan all processes to find relationship between all processes.
[2018-03-12 10:08:51Z INFO ProcessInvoker] Find all child processes of process '4992'.
[2018-03-12 10:08:51Z INFO ProcessInvoker] Kill process '4992'.
[2018-03-12 10:08:51Z ERR  ProcessInvoker] Ignore Win32Exception with NativeErrorCode 5 during Process.Kill().
[2018-03-12 10:08:51Z ERR  ProcessInvoker] System.ComponentModel.Win32Exception (0x80004005): Access is denied
   at System.Diagnostics.Process.Kill()
   at Microsoft.VisualStudio.Services.Agent.ProcessInvoker.WindowsKillProcessTree()
[2018-03-12 10:08:51Z INFO ProcessInvoker] Finished process with exit code -1, and elapsed time 00:00:26.9407013.
[2018-03-12 10:08:51Z ERR  StepsRunner] Caught exception from step: Microsoft.VisualStudio.Services.Agent.ProcessExitCodeException: Exit code -1 returned from process: file name 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe', arguments '-NoLogo -Sta -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -Command ". ([scriptblock]::Create('if (!$PSHOME) { $null = Get-Item -LiteralPath ''variable:PSHOME'' } else { Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Management\Microsoft.PowerShell.Management.psd1'')) ; Import-Module -Name ([System.IO.Path]::Combine($PSHOME, ''Modules\Microsoft.PowerShell.Utility\Microsoft.PowerShell.Utility.psd1'')) }')) 2>&1 | ForEach-Object { Write-Verbose $_.Exception.Message -Verbose } ; Import-Module -Name 'D:\tfsagent\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0\ps_modules\VstsTaskSdk\VstsTaskSdk.psd1' -ArgumentList @{ NonInteractive = $true } -ErrorAction Stop ; $VerbosePreference = 'SilentlyContinue' ; $DebugPreference = 'SilentlyContinue' ; Invoke-VstsTaskScript -ScriptBlock ([scriptblock]::Create('. ''D:\tfsagent\_work\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.120.0\MSBuild.ps1'''))"'.
   at Microsoft.VisualStudio.Services.Agent.ProcessInvoker.<ExecuteAsync>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Services.Agent.Worker.Handlers.PowerShell3Handler.<RunAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Services.Agent.Worker.TaskRunner.<RunAsync>d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Services.Agent.Worker.StepsRunner.<RunStepAsync>d__1.MoveNext()
[2018-03-12 10:08:51Z INFO StepsRunner] Step result: Failed
[2018-03-12 10:08:51Z INFO StepsRunner] Update job result with current step result 'Failed'.
[2018-03-12 10:08:51Z INFO StepsRunner] Current state: job state = 'Failed'
[2018-03-12 10:08:51Z INFO StepsRunner] Processing step: DisplayName='Publish Test Results', ContinueOnError=False, Enabled=True
[2018-03-12 10:08:51Z INFO ExpressionManager] Evaluating: succeededOrFailed()
[2018-03-12 10:08:51Z INFO ExpressionManager] Expanded: True
[2018-03-12 10:08:51Z INFO ExpressionManager] Result: True

My question

How can I fix this so I can reliably run multiple agents on the same machine?

Thank you!

TingluoHuang commented 6 years ago

@AndrewGretton please upgrade to latest version agent, this is a known issue that we fixed. https://github.com/Microsoft/vsts-agent/pull/1407 https://github.com/Microsoft/vsts-agent/releases/tag/v2.131.0

AndrewGretton commented 6 years ago

Thanks @TingluoHuang, this has resolved the issue.