Closed JamesNK closed 3 months ago
@JamesNK, if this is an intermittent issue, can you please enable diagnostic logs for testplatform and share them with us? https://github.com/Microsoft/vstest-docs/blob/master/docs/diagnose.md
I can't easily get logs off the build server, but I can repro the freeze on my dev machine with this. It always freezes within 5 minutes:
while($true) { dotnet test --diag:log.txt }
This is an ongoing problem. I'd like to make some progress on fixing it.
Do you need any more information from me?
@JamesNK I went through the logs and didn't see any hang state. The only interesting thing I observed was that we seem to start 10 different testhost processes in sequence, but you have shared logs for 30 testhost processes. Can you please share how many test DLLs you are running?
Because I'm doing it in a loop: while($true) { dotnet test --diag:log.txt }
In the example of the logs I attached it freezes after 3 runs.
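For anyone on Linux or macOS trying to reproduce this, here is a rough shell counterpart to the loop above that stops at the first run that looks hung instead of looping forever. It is only a sketch: `repro_hang` is a hypothetical helper name, and it assumes the `timeout` utility from coreutils is available. Note that `timeout` changes how the child process is attached to the terminal, which may itself affect a console-related hang.

```shell
# Run a command repeatedly and stop at the first run that exceeds a time limit.
# Usage: repro_hang <max_runs> <limit_seconds> <command...>
# (repro_hang is a hypothetical helper, not part of dotnet or vstest.)
repro_hang() {
  local max_runs="$1" limit="$2" i=1 status
  shift 2
  while [ "$i" -le "$max_runs" ]; do
    status=0
    # timeout(1) exits with status 124 when it had to stop the command.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
      echo "run $i exceeded ${limit}s (possible hang)"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no hang in $max_runs runs"
}
```

For example, `repro_hang 50 600 dotnet test --diag:log.txt` would try up to 50 runs with a 10 minute limit each; both numbers are arbitrary.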
Have you tried reproing it?
I'm looking into it, I've cloned the repo, & all I need to do is run dotnet on sln right?
I tried it locally multiple times, but it did not repro for me.
We are currently experiencing the same issue. We have been using mcr.microsoft.com/dotnet/core/sdk:2.2.204 to avoid the performance degradation issue, which is now resolved. But when attempting the latest mcr.microsoft.com/dotnet/core/sdk, currently b4c25c26dc73f498073fcdb4aefe167793eb3a8c79effa76df768006b5c345b8, only a couple of test runs finish while the rest seem to hang.
As with the performance issue, it seems to be related to non-interactive hosts.
Situation
Our test projects have <IsTestProject>true</IsTestProject> set, and we are running dotnet test against the solution in a CircleCI environment.
Replication
I condensed our project to share an example. Each test project has a single test that sleeps for 5 seconds.
Repo: https://github.com/tasadar2/vstest-issue-2080
CircleCI: https://circleci.com/gh/tasadar2/vstest-issue-2080/3
It doesn't always replicate the issue, but often does.
~Running test projects individually has fixed this problem for us - https://github.com/grpc/grpc-dotnet/commit/152255ec5419c1360819788d7911f4957c8e4e2c~
@tasadar2 @JamesNK sdk:2.2.301 has a fix, can you please check if you are still hitting the issue ?
Still appears to be an issue.
image: mcr.microsoft.com/dotnet/core/sdk@sha256:a50e175acd618c3e90bc91dceb5194e6c3764c5b4d179390cef874a887476ba9
example: https://circleci.com/gh/tasadar2/vstest-issue-2080/7
I've narrowed down my hanging issue. It is caused by something with how vstest writes to the console. If no tests fail then vstest completes without any problems. If a test fails then vstest hangs until the CI build times out (this is in Travis CI)
The workaround I am using is to write the test output to a text file, and then write the text file to the console. If I do that then it never hangs.
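A minimal sketch of that file-based workaround, for anyone who wants to try it. `run_via_logfile` is a hypothetical helper name and the exact commands used in the real CI script are not shown in this thread; the idea is simply that the test runner writes to a file rather than to the console, and the file is printed afterwards with the original exit code preserved.

```shell
# Run a command with stdout and stderr captured in a file, then print the file.
# Usage: run_via_logfile <logfile> <command...>
# (run_via_logfile is a hypothetical helper, not part of dotnet or vstest.)
run_via_logfile() {
  local log="$1" status=0
  shift
  # The command never writes to the console directly.
  "$@" > "$log" 2>&1 || status=$?
  # Print the captured output once the command has finished.
  cat "$log"
  # Pass the command's exit code through so a failing test run still fails the build.
  return "$status"
}
```

Usage would look like `run_via_logfile test-output.txt dotnet test`. The trade-off is that nothing is shown until the run finishes.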
@JamesNK What is the dotnet sdk version you are using ? Can you share the link to the CI ?
Looks like this is still an issue on the recent 3.0.
image: mcr.microsoft.com/dotnet/core/sdk@sha256:3afea8958440231a77b3daea267951cc8ba9026fc1015bcbccc206d6f1d031f7
example: https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/10
@tasadar2 Can you please try the --logger:console;noprogress=true argument and check whether the issue still reproduces for you?
That arg produces the same results, https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/11
Though, when quoting the argument value, that seems to work --logger:"console;noprogress=true"
https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/14
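A likely reason the quoting matters: in a POSIX shell an unquoted `;` ends the command, so the unquoted form never passes `noprogress=true` to the logger at all. The sketch below shows this with a stand-in function (`show_args` is hypothetical and just prints the arguments it receives); it assumes the CI step runs in a POSIX-style shell such as bash.

```shell
# Stand-in for `dotnet test`: prints each argument it receives on its own line.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Unquoted: the shell treats ';' as a command separator, so the command only
# receives --logger:console, and noprogress=true runs as a separate statement
# (a plain shell variable assignment).
show_args --logger:console;noprogress=true
# prints [--logger:console]

# Quoted: the whole value reaches the command as a single argument.
show_args --logger:"console;noprogress=true"
# prints [--logger:console;noprogress=true]
```

So the unquoted run was effectively the same as passing `--logger:console` alone.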
I had the same problem; two workarounds worked:
Adding --logger:"console;noprogress=true" like @tasadar2.
However, I did not like it because I could not see the progress (--logger:"console" has the same issue), so instead I added < /dev/null:
dotnet test .... < /dev/null
This allows me to see the progress of the tests without hanging.
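To make the effect of the redirect concrete, here is a small sketch. `with_null_stdin` is a hypothetical wrapper name; it only shows what the redirected command sees on standard input. Why this avoids the hang is not established in this thread.

```shell
# Run a command with standard input redirected from /dev/null.
# Usage: with_null_stdin <command...>
# (with_null_stdin is a hypothetical helper, not part of dotnet or vstest.)
with_null_stdin() {
  "$@" < /dev/null
}

# The child no longer has a terminal on stdin, and any read hits end-of-file
# immediately instead of waiting for input.
with_null_stdin sh -c 'if [ -t 0 ]; then echo tty; else echo not-a-tty; fi'
# prints not-a-tty
```

With this, `with_null_stdin dotnet test` is equivalent to the `dotnet test .... < /dev/null` form above, and console output is left untouched.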
Full logs of a successful build with feedback: https://circleci.com/gh/dgarage/NBXplorer/430
Before the hack you can see things stalling: https://circleci.com/gh/dgarage/NBXplorer/409
Happening on mcr.microsoft.com/dotnet/core/sdk:3.0.100 xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.0.0)
@NicolasDorier thank you for finding your workaround. I can confirm it works, please see rabbitmq/rabbitmq-dotnet-client#750
Is there a longer term fix for this? I'm running into the same issue with a set of web integration tests.
Ping @mayankbansal018, please remove the label. There is enough information, and this issue and the workaround have been tested by numerous people. This is a well-identified bug that needs to be fixed.
@JamesNK So far it looks like it is caused by setting color at the same time as setting cursor position. https://github.com/microsoft/vstest/issues/2282 I'm exploring strategies to avoid that. Probably the best is to write the progress in a way similar to @NicolasDorier's comment above, but ideally write dots at the end of the line, so the output is not affected that much.
Experimenting with it here, and also trying to get it to lock up by moving cursor on the vstest-like branch. See the other thread for some more info.
Just to confirm: we get that problem in Concourse if and only if dotnet test is executed on a solution file.
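Since the hang is reported here only when dotnet test runs against a solution file, one thing to try is running each test project on its own. A rough sketch follows; `test_each_project` is a hypothetical helper name, and the `*Tests.csproj` naming pattern is an assumption that would need adjusting to the repository's layout.

```shell
# Run a command once per test project instead of once for the whole solution.
# Usage: test_each_project <root_dir> <command...>
# (test_each_project is a hypothetical helper, not part of dotnet or vstest.)
test_each_project() {
  local root="$1"
  shift
  # Assumption: test projects are named *Tests.csproj.
  find "$root" -name '*Tests.csproj' | sort | {
    failed=0
    while IFS= read -r proj; do
      echo "== $proj"
      # stdin comes from /dev/null so the command cannot consume the project list.
      "$@" "$proj" < /dev/null || failed=1
    done
    # Succeed only if every project succeeded.
    [ "$failed" -eq 0 ]
  }
}
```

Usage would be `test_each_project . dotnet test`. Every project is still run even if an earlier one fails, and the overall status is non-zero if any of them failed.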
@NicolasDorier Any suggestions for Windows PowerShell?
I don't remember having the issue on powershell.
I had this problem with a small VM (2 cores, 2 GB RAM); increasing to 8 cores and 8 GB RAM made it work. I read somewhere that this was thread related, so I came up with the idea of adding more cores. The < /dev/null workaround didn't help in my case.
Not sure if it's related. I'm also a GitLab user and found a very similar problem, so I suspect it's probably related to the SDK image, but in my case it happens with Microsoft (R) Test Execution Command Line Tool Version 16.9.1 and the workaround didn't work for me.
https://forum.gitlab.com/t/dotnet-test-hangs-in-gitlab-gitlab-runner-13-9-0-rc2/50977
I have also checked whether it has anything to do with async calls in case there is a deadlock, but I seem to be awaiting properly everywhere.
Out of curiosity, is anybody experiencing this only when using IHostedService? In my case it seems that's what is causing the tests to freeze within a Linux image, and disabling parallelization avoids the deadlock.
@diegosasw, for me it's happening when using dotnet test on a solution file even if there is no IHostedService usage.
This helped us in case of IHostedService: https://www.strathweb.com/2021/05/the-curious-case-of-asp-net-core-integration-test-deadlock/
@Evangelink @davidfowl Even on our very powerful MacBook Pros, when we run one of our integration test projects that has hundreds of tests, many of which use WebApplicationFactory, we often get into deadlocks.
We set xunit to use an unlimited number of threads (-1), and we have overridden a bunch of xunit internals to implement a semaphore that limits the number of concurrent tests. And yet, we often get into deadlocks after the ~600 tests completed mark. After verifying we have no sync-over-async, we found this beauty in the ASP.NET Core code itself...
https://github.com/dotnet/aspnetcore/blob/main/src/Hosting/TestHost/src/TestServer.cs#L101
Can this be the root cause of all our deadlocks that occur only in unit tests?
Finally! I managed to reproduce it in a small test project, where I can do dotnet-dump and attach to process without everything crashing.
I found this fella in one of the threads' callstack.
And going up the callstack we can see it's inside (or trying to get inside and the debugger is misleading) a lock block of NLog.
Can you file this issue on dotnet/aspnet?
Your request is my command: https://github.com/dotnet/aspnetcore/issues/43353
Just in case this helps someone: at some point this line started randomly hanging our tests
_task = Task.Factory.StartNew(async () => {
await Task.Delay((int)cleanInterval.TotalMilliseconds, _cancellationTokenSource.Token);
--> while (!_cancellationTokenSource.Token.IsCancellationRequested)
@cvpoienaru Please investigate this issue.
Did anything happen in the investigation by chance?
I am experiencing the same when using TestServer.
It works well locally, but when running in a GitLab runner, it hangs.
I've noticed it freezes when trying to dispose TestServer because the disposing of IWebHost hangs.
No exceptions are thrown.
I've tried to stop all the IHostedService instances in case that's the problem, but I'm still unable to dispose TestServer to see if that solves the problem.
Could you share how you are troubleshooting this? The only workaround I have is to set
#if DEBUG
[assembly: CollectionBehavior(CollectionBehavior.CollectionPerClass, DisableTestParallelization = false)]
#else
[assembly: CollectionBehavior(CollectionBehavior.CollectionPerClass, DisableTestParallelization = true)]
#endif
in the test assembly, so that it runs sequentially in CI/CD pipeline.
@cvpoienaru Please investigate this issue.
This issue is still unresolved.
This issue is a mix of different problems, some related to other products, but I did not find a clear repro. If someone is still experiencing this problem and has a simple repro, please file a new issue.
Steps to reproduce
Tests run on Travis CI via dotnet test now intermittently fail. Failures started after updating to a newer .NET Core SDK. It appears that the test tool increased from 16.0.1 to 16.1.1 with the new SDK.
Source code: https://github.com/grpc/grpc-dotnet/commits/master
Expected behavior
Tests run and exit
Actual behavior
Tests hang and the build is terminated
Diagnostic logs
Failure: https://travis-ci.org/grpc/grpc-dotnet/builds/551562232?utm_source=github_status&utm_medium=notification
Microsoft (R) Test Execution Command Line Tool Version 16.1.1
Success: https://travis-ci.org/grpc/grpc-dotnet/builds/551058627?utm_source=github_status&utm_medium=notification
Microsoft (R) Test Execution Command Line Tool Version 16.0.1