Open jorgensigvardsson opened 5 years ago
Can you capture a dump when this happens? Do you see the dotnet run process running without stopping when this happens?
I only had bash, and three dotnet processes. I don't have a `ps ef` output right now, but can of course provide it on Monday.
The dotnet processes were all sleeping, as if they were waiting for something. Also, IIRC, the processes were grand parent, parent and child. I believe `dotnet run` was the child, but I am not 100% sure about that.
The project I run is just a simple console app that doesn't read from stdin; it only grabs some CLR metadata that it writes to a file.
I did try to do a `dotnet publish` to generate a binary to execute instead, but it hung as well.
I just tried to reproduce the error in my own docker host, but I cannot reproduce the hanging `dotnet run` process. I can't get the access I need against the docker host in Azure DevOps, so I'm a bit clueless/powerless now.
We have the same problem. We noticed that the task actually completes successfully after 15 minutes.
The DefaultNodeConnectionTimeout is 900 seconds -- possibly related?
@ladipro /@rainersigwald is there any debugging information in MSBuild that could help diagnose if this is a node connection issue?
Setting `MSBUILDNODECONNECTIONTIMEOUT="30000"` in the environment does indeed reduce the waiting time, and the task finishes successfully after 30 seconds instead of 15 minutes.
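For anyone wanting to try the same thing, here is a minimal sketch of setting the variable for a single shell session. The value is taken to be in milliseconds, as the 30000 / 30 seconds observation above suggests. The commented-out `dotnet build` line is only a placeholder for whatever build command is in use.

```shell
# Lower the MSBuild node connection timeout from the 900 s default to 30 s.
# The value is in milliseconds (30000 ms = 30 s), per the comment above.
export MSBUILDNODECONNECTIONTIMEOUT=30000

# Placeholder: run the usual build command in the same shell afterwards.
# dotnet build

echo "MSBUILDNODECONNECTIONTIMEOUT=$MSBUILDNODECONNECTIONTIMEOUT"
```

This only shortens the wait. It does not explain why the node connection stalls in the first place.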
@MartinKarlgrenIMI, with `MSBUILDDEBUGCOMM` set to `1`, MSBuild will be dumping node communication log to files named `MSBuild_CommTrace_PID_*.txt` in the temp directory. Would it be possible to share these logs from a problematic build?
@ladipro, sure, files below. (This was a build with a 30000 ms timeout, I noticed that in the *_1794.txt file the timeout is hit for one thread.)
MSBuild_CommTrace_PID_1794.txt MSBuild_CommTrace_PID_1767.txt MSBuild_CommTrace_PID_1709.txt MSBuild_CommTrace_PID_1670.txt
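A sketch of how such traces could be collected on Linux. It assumes the temp directory is `$TMPDIR`, falling back to `/tmp`; the actual location may differ between environments. `list_comm_traces` is a hypothetical helper name, not part of MSBuild.

```shell
# Ask MSBuild to write node communication traces.
export MSBUILDDEBUGCOMM=1

# Hypothetical helper: list trace files after the build has run.
# Assumption: traces land in $TMPDIR, or /tmp when TMPDIR is unset.
# An explicit directory can be passed as the first argument.
list_comm_traces() {
  trace_dir="${1:-${TMPDIR:-/tmp}}"
  # find prints nothing when no trace files exist in the directory.
  find "$trace_dir" -maxdepth 1 -name 'MSBuild_CommTrace_PID_*.txt' 2>/dev/null
}

# Placeholder: run the problematic build here, then:
# list_comm_traces
```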
It looks like `ToolTask` doesn't receive the `Process.Exited` notification if the tool process is `dotnet build` / `dotnet run` which creates a new OOP node process. Or rather, it receives it only after the node process has exited.
Likely the same root cause as https://github.com/dotnet/sdk/issues/9452. Could be specific to AzDO environment.
@MartinKarlgrenIMI can you please try passing the `--init` flag per the last couple of comments in https://github.com/dotnet/runtime/issues/27115 ?
@ladipro unfortunately `dotnet run --init --project abc` didn't fix the 15-minute hang in my case.
@SeijiSuenaga my understanding is that `--init` should be passed to docker, not dotnet. See https://docs.docker.com/engine/reference/commandline/container_run/#init
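To make the distinction concrete, here is a sketch of where the flag goes. `--init` is an option of `docker run`, so it appears before the image name rather than after `dotnet`. `run_with_init` is a hypothetical wrapper, and the image name and project path in the commented invocation are placeholders.

```shell
# Hypothetical wrapper: start a container with Docker's built-in init
# process as PID 1, which forwards signals and reaps orphaned processes.
# Usage: run_with_init IMAGE [COMMAND...]
run_with_init() {
  docker run --init --rm "$@"
}

# Placeholder invocation (image name and project path are examples only):
# run_with_init my-build-image dotnet run --project abc
```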
@ladipro Ah, sorry. Just tried that as well, but it still hung for 15 minutes. (In my case, the hangs are happening in GitLab CI, so I tested it by enabling their `FF_USE_INIT_WITH_DOCKER_EXECUTOR` feature flag.)
That said, I did find a workaround for my particular scenario. In case it helps anyone else, I found that my MSBuild target was only hanging when executing as part of `dotnet test`, not `dotnet build` for the same project. So I adjusted my CI script to run `dotnet build` first, then `dotnet test`, and now it runs completely normally. 🤔
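The reordering described above might look like this in a CI script. `build_then_test` is a hypothetical helper name, and the project path is whatever the pipeline already uses.

```shell
# Hypothetical helper: build explicitly first, then test, stopping at the
# first failure. Usage: build_then_test path/to/project
build_then_test() {
  dotnet build "$1" && dotnet test "$1"
}

# Placeholder invocation (project path is an example only):
# build_then_test ./MyProject.Tests
```

Whether this avoids the hang in other setups is not established; it is only what worked in the scenario reported above.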
Steps to reproduce
Create a project, add a target that is invoked before target `BeforeBuild`. That target should then invoke `<Exec Command="dotnet run -c $(Configuration) -p ../OtherProject" />`.
Expected behavior
I expect the command to run and finish, so that msbuild can continue executing targets.
Actual behavior
MSBuild seemingly hangs, as if it cannot determine that `OtherProject` has exited. This only occurs on Linux, and only sometimes. It always hangs when I run the same task on the Ubuntu 1604 hosted agent in Azure DevOps. It sometimes hangs when I run the same task in Docker on my Windows desktop machine.
Environment data
I am using the docker image `microsoft/dotnet:2.2-sdk` as a base for my own image. I have stripped it down to a bare minimum with `/bin/bash` as ENTRYPOINT, so that I have been able to run the commands manually.
`dotnet --info` output:
The "offending" target looks like this in my .csproj:
The project `Tracy.Core.Dal.ModelBuilderGenerator` is a custom project that generates code at runtime for other projects to consume. In the logs I can see all the output from the generator project. The very last output is right before `return 0;`.
The workaround I have now is to tag the target with `Condition="'$(BuildingInsideVisualStudio)' == 'true'"` so that it'll work as expected during development time. During build, I publish the tool in my Docker file to an executable, which I run before the initial `dotnet` invocation.
Source code access
Access to source code etc can be arranged privately if needed.