Closed. jsmith-kno2 closed this issue 1 year ago.
@jsmith-kno2 - The important thing to remember with actor registration is that on the dapr runtime side, it is a multi-step process, and actors are one of the last things to get set up for the actual runtime.
The chain of events here is something like:
- healthz/outbound reports true
- healthz reports true

So you can see here the issue is two-fold. First, Outbound health is strictly for component/API availability. It does not mean that Dapr has fully started. It exists because your app cannot wait for the standard healthz endpoint while starting, as Dapr is itself waiting for your app to fully start. Outbound is there to solve the case of "my app needs a secret store to start, can I wait for that".
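To make the difference between the two endpoints concrete, here is a minimal polling sketch in Python (used only to keep the example self-contained; it is not the .NET SDK). It assumes the sidecar's HTTP API is reachable at a `base_url` you supply, and that the health routes are `/v1.0/healthz` and `/v1.0/healthz/outbound`, each answering with a 2xx status when healthy. The names `http_status` and `wait_for_sidecar` are made up for this illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 1.0) -> int:
    """Return the status code of a GET request, or 0 if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The sidecar answered, but with an error status (e.g. 500 = not ready).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout: sidecar not reachable yet.
        return 0


def wait_for_sidecar(
    base_url: str,
    path: str = "/v1.0/healthz",  # assumption: full health; use /v1.0/healthz/outbound for components only
    attempts: int = 30,
    delay: float = 0.5,
    get_status: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll one health route until it returns 2xx or the attempts run out."""
    for _ in range(attempts):
        if 200 <= get_status(base_url + path) < 300:
            return True
        sleep(delay)
    return False
```

Calling `wait_for_sidecar("http://localhost:3500")` would wait on the standard endpoint, and passing `path="/v1.0/healthz/outbound"` would wait only for components. The host and port here are assumptions; in a Docker Compose stack the sidecar's address will differ. Neither call confirms that actors are ready.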
The 2nd problem is that the placement server is separate from the core runtime. It needs to track all of the actor hosts and handle the balancing of requests. As such, we can't tie it to the sidecar's health.
All of this is to say that if you move to the other health endpoint, you'll certainly be closer to the actual time that your actors have been loaded. However, you'll still want either a wait or a retry inserted in there. If you enable resiliency for actors, you should be able to set that up, or you can use a .NET library if that's your preference.
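As a rough illustration of the retry option, here is a small retry-with-backoff helper in Python (again chosen only so the sketch is self-contained; the same shape applies in .NET). The helper name `call_with_retry` and its defaults are assumptions for this example, not part of any Dapr SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    first_delay: float = 0.5,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with exponentially growing waits."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
    raise ValueError("attempts must be at least 1")
```

Here `operation` would wrap the first actor call, for example the reminder registration. In real code it is usually better to catch only the specific exception seen at startup rather than every `Exception`, so that genuine failures are not retried.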
Please let me know if you have any questions about this explanation.
This is great information. Thank you for the insight and advice!
@jsmith-kno2 - Do you need anything else? Or can we close this issue?
Overview
When running a dapr enabled app via Docker Compose for the first time, the actor proxy instance returned by IActorProxyFactory throws the following exception when calling any method on that actor:

This only happens on the initial creation of the Docker containers for my app and the dapr sidecar. Once created, I can stop/restart and everything works fine. However, I can consistently recreate this exception by deleting the Docker Compose stack in Docker Desktop (just the containers, not the images) and re-running the application.
If I step through the code (slowing things down) or add a delay of at least 2 seconds between the creation of the actor proxy instance and invoking the method on the actor, it works fine. I've noticed that with a delay of 1 second it still fails, but with a 2 second delay it succeeds, and the point at which it succeeds is very close to this log entry:
My question: is there a way to explicitly confirm dapr is ready and healthy before attempting to invoke a method on an actor proxy instance?
Additional Info
The invocation of the actor is to self-register a reminder. I'm using this approach to create a microservice responsible for platform maintenance operations that need to be executed singularly on a specific schedule. I want the microservice to contain the reminder registration mechanics rather than rely on an external event to trigger reminder registration.
To accomplish this, I've created a background service that performs the registration. Perhaps there's a better way to approach this problem and I'd love to hear if that's the case but assuming this is a reasonable approach vector, this feels like a race condition related to ready checks.
I've attempted to solve this by first waiting for the application to fully start as well as a successful health check from the dapr client. While that's addressed consistent exceptions thrown in earlier iterations of this approach, I'm still stuck with this exception above.
Here's the background service implementation with the 2 second delay that will avoid the exception above:
I've also tried using DaprClient.CheckOutboundHealthAsync and DaprClient.WaitForSidecarAsync, both singularly and all together.

Also of note, I am using the default dapr placement and redis containers created by dapr init for my Docker Compose stack by using host.docker.internal as the hostname in the component configs.

Any thoughts on how to address? Is this a failure of my understanding or a race condition in the dapr sdk?