Open ealeykin opened 1 week ago
[Triage] - Can you please provide more info on why you are using two hosts? Can you share more about the goal of your setup? It would also be helpful if you could share a minimal repro repository for the issue so we can better investigate.
@joperezr Goal of setup and why two hosts:
This is a basic consumer (1st host) that pulls messages from Kafka and writes them to a database. The 2nd host is a .NET app that applies the SQL schema and creates tables in SqlServer.
The issue happens much more frequently when running the Aspire app in integration tests with DistributedApplicationTestingBuilder.
I will need more time to create a repro repo.
I would suggest using the built in WaitFor support coming in the next version of Aspire instead of this version that you're showing us in the snippet above.
This code is racy: it assumes that the connection string will be available by the time this code runs, which may not be the case. That might be why it works sometimes and breaks at other times.
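For readers on a newer Aspire version, the suggestion above might look roughly like the following app host. This is a sketch, not the reporter's code: the resource names and the `Projects.DbUp` and `Projects.Consumer` project types are hypothetical, and it assumes Aspire 9.0 or later (where `WaitFor` and `WaitForCompletion` are available) plus the SqlServer and Kafka hosting packages.

```csharp
// AppHost Program.cs (sketch; assumes Aspire 9.0+ hosting packages).
// Projects.DbUp and Projects.Consumer are hypothetical project references.
var builder = DistributedApplication.CreateBuilder(args);

var sql = builder.AddSqlServer("sql");
var db = sql.AddDatabase("appdb");
var kafka = builder.AddKafka("kafka");

// DbUp is started only after the database resource reports healthy.
var dbUp = builder.AddProject<Projects.DbUp>("dbup")
    .WithReference(db)
    .WaitFor(db);

// The consumer waits for Kafka to be healthy and for DbUp to exit successfully.
builder.AddProject<Projects.Consumer>("consumer")
    .WithReference(kafka)
    .WithReference(db)
    .WaitFor(kafka)
    .WaitForCompletion(dbUp);

builder.Build().Run();
```

With this shape the ordering is expressed declaratively, so no custom environment callback has to block until a health check passes.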
@davidfowl I'm aware of the upcoming changes and also of your previous implementation (which also got stuck sometimes), however:
The connection string is available all the time (and if not, it will just retry, but that's not the case here). The health check eventually succeeds in 100% of cases, but the DbUp host that depends on SqlServer does not start anyway.
Even if I switch to the new WaitFor, that approach with EnvironmentCallbackAnnotation - I like it and would like to use it to pre-create some Kafka topics (for simplicity reasons) - any objections against using it like this ?
The behavior is mostly observed in Integration tests with DistributedApplicationTestingBuilder.
The only thing comes to my mind - try to test the same approach with NET9. However EnvironmentCallbackAnnotation this is something fundamental and basic, shouldn't stuck anyway...
Thanks
Even if I switch to the new WaitFor, that approach with EnvironmentCallbackAnnotation - I like it and would like to use it to pre-create some Kafka topics (for simplicity reasons) - any objections against using it like this ?
Yes, you should use the built in feature that we designed and tested for waiting on resources. You would not use this to pre-create kafka topics though, you would use the new eventing system (to avoid abusing the environment variable callback).
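As an illustration of the eventing approach mentioned above, the sketch below subscribes to the ready event of a Kafka resource and creates a topic from the app host. It is an assumption-laden example rather than an official recipe: it assumes Aspire 9.0 or later (for `ResourceReadyEvent` and `builder.Eventing`), the Confluent.Kafka package referenced from the app host, and a made-up topic name `orders`.

```csharp
// AppHost Program.cs (sketch; assumes Aspire 9.0+ and the Confluent.Kafka package).
using Confluent.Kafka;
using Confluent.Kafka.Admin;

var builder = DistributedApplication.CreateBuilder(args);
var kafka = builder.AddKafka("kafka");

// The callback runs after the Kafka resource has been reported ready.
builder.Eventing.Subscribe<ResourceReadyEvent>(kafka.Resource, async (@event, ct) =>
{
    // For Kafka the connection string is the bootstrap server address.
    var bootstrapServers = await kafka.Resource.GetConnectionStringAsync(ct);

    using var admin = new AdminClientBuilder(
        new AdminClientConfig { BootstrapServers = bootstrapServers }).Build();

    // "orders" is a hypothetical topic name used only for illustration.
    await admin.CreateTopicsAsync(new[]
    {
        new TopicSpecification { Name = "orders", NumPartitions = 1, ReplicationFactor = 1 }
    });
});

builder.Build().Run();
```

Note that `CreateTopicsAsync` throws if the topic already exists, so a real setup with a persistent Kafka volume would likely need to handle that case.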
The only thing comes to my mind - try to test the same approach with NET9. However EnvironmentCallbackAnnotation this is something fundamental and basic, shouldn't stuck anyway...
If you go with this approach, then you'll need to read the code and understand what might be happening. See where the hang is and try to figure it out. We don't have enough information to help you debug. It's difficult to tell from that code snippet why and what is hanging.
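One way to narrow down where a hang occurs in an integration test is to wait for a specific resource state with a timeout, so the test fails with a `TimeoutException` instead of blocking indefinitely. The sketch below assumes the Aspire.Hosting.Testing package, a hypothetical `Projects.AppHost` app host project, and a resource named `dbup`; none of these names come from the original report.

```csharp
// xUnit test (sketch; Projects.AppHost and the "dbup" resource name are hypothetical).
using Aspire.Hosting;
using Aspire.Hosting.ApplicationModel;
using Aspire.Hosting.Testing;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

public class StartupTests
{
    [Fact]
    public async Task DbUpStartsOrFinishes()
    {
        var appHost = await DistributedApplicationTestingBuilder.CreateAsync<Projects.AppHost>();
        await using var app = await appHost.BuildAsync();
        await app.StartAsync();

        var notifications = app.Services.GetRequiredService<ResourceNotificationService>();

        // Accept either state, because a short-lived migration host may already have exited.
        var states = new[] { KnownResourceStates.Running, KnownResourceStates.Finished };

        // WaitAsync turns an indefinite hang into a TimeoutException after 90 seconds.
        await notifications
            .WaitForResourceAsync("dbup", states)
            .WaitAsync(TimeSpan.FromSeconds(90));
    }
}
```

Repeating the same wait for each resource in dependency order (SqlServer first, then DbUp, then the consumer) shows which one never leaves its waiting state.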
Is there an existing issue for this?
Describe the bug
I have an Aspire app running two hosts and three containers:
The application host waits for DbUp completion and for the Kafka/RabbitMq health checks; the DbUp host waits only for the SqlServer health check.
So the issue is with DbUp: sometimes it is not started at all, so the Aspire app is stuck. The output is as follows:
At this point DbUp is supposed to run, since SqlServer is ready, but it doesn't. It's not reproducible every time, only about every 4th or 5th run.
Since DbUp depends only on the SqlServer health check, I'm posting here the custom health check for SqlServer that is added to the DbUp resource (the others are similar).
Expected Behavior
Once the SqlServer health check is OK, the environment callback annotation completes and the DbUp resource transitions to the running state.
Steps To Reproduce
Aspire app:
Integration test with XUnit:
Exceptions (if any)
Not consistent: sometimes it works, sometimes it doesn't. Every 4th or 5th run it gets stuck.
.NET Version info
.NET SDK:
  Version: 8.0.300
  Commit: 326f6e68b2
  Workload version: 8.0.300-manifests.e0880c5d
  MSBuild version: 17.10.4+10fbfbf2e

Runtime Environment:
  OS Name: Mac OS X
  OS Version: 14.5
  OS Platform: Darwin
  RID: osx-arm64
  Base Path: /usr/local/share/dotnet/sdk/8.0.300/

.NET workloads installed:
  [aspire]
    Installation Source: SDK 8.0.300
    Manifest Version: 8.2.2/8.0.100
    Manifest Path: /usr/local/share/dotnet/sdk-manifests/8.0.100/microsoft.net.sdk.aspire/8.2.2/WorkloadManifest.json
    Install Type: FileBased

Host:
  Version: 8.0.5
  Architecture: arm64
  Commit: 087e15321b

.NET SDKs installed:
  8.0.300 [/usr/local/share/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 8.0.5 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 8.0.5 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]

Other architectures found:
  None

Environment variables:
  Not set

global.json file:
  Not found
Anything else?
No response