Open afscrome opened 1 week ago
I'm sympathetic to this request as it caught me out as I was writing the first set of unit tests. The solution is actually to use a cancellation token on the StartAsync method.
I don't think it should. We're adding start/stop to the dashboard to help mitigate forever running issues. IMO there shouldn't be a default timeout
Maybe we need the ability to set a default timeout, but there's none by defuault.
I'm happy with the for the App Host project itself - as you said there you have the dashboard to investigate what is hanging as well as manually start/stop/restart. (+ any other UI that comes with WaitFor
and health checks)
I still think there should be a sanity timeout with DistributedApplicationTestingBuilder
(e.g. 5 mins). Whilst it is possible to set a timeout through a custom cancellation token on DistributedApplicationTestingBuilder.StartAsync
, that does require you to first realise that you're test is going to timeout. And I'd expect if your application hangs due to something with WaitOn
in a test, it's most likely accidental. Even worse is if there's something different with CI so everything passes locally, but fails in Github Actions / Azure Pipelines / whatever tool where you have limited debug options and can't reproduce locally.
That said, if there is an opt-in default timeout, DistributedApplicationTestingBuilder
could make use of that to implement such a sanity timeout.
The sanity timeout would need to be set high enough that it gives images enough time to pull over bad network connections, although if it's only configured by default on DistributedApplicationTestingBuilder
, then it's may not be the end of the world if it occasionally fails on the edge case you have a poor internet connection, no cached image and are pulling several large new images
Is there an existing issue for this?
Is your feature request related to a problem? Please describe the problem.
I left a broken app running for over 20 minutes, and the resource state remained
Waiting
. Should this have timed out by now and gone intoFailedToStart
?Even worse is when the same thing happens during an integration test - the
DistributedApplicationTestingBuilder.BuildAsync().StartAsync()
method just hangs, with little other information available through the debugger (and even less if the test is running in CI).Describe the solution you'd like
Whilst I'm less fussed about the hang when the dashboard is accessible as you can find the information you need there (somewhat related to #5544), hanging tests with no diagnostic data is far more painful.
For the test scenario, I'd like to see a default (but overridable) timeout either on individual
WaitFor
calls, or a timeout on the wholeDistributedApplicationTestingBuilder.BuildAsync().StartAsync()
. Whilst a user could apply their own timeout by addingWaitAsync()
, the test will then fail with a genericThe operation has timed out
error, which doesn't give you much to go and troubleshoot. Tiemout support built in to aspire could fail with a better error message, that will be captured by test runners & CI systems e.g.Additional context
There is some overlap with #5632, however this issue is a more specifically related to
WaitFor
.Also want to stress that despite the deluge of issues I've been logging around
WaitFor
, I do love what has been implemented so far. Just logging things I'm discovering as I play around this evening.