dapr / dapr

Dapr is a portable, event-driven, runtime for building distributed applications across cloud and edge.
https://dapr.io
Apache License 2.0
23.84k stars 1.88k forks source link

Actor service failure detection by Placement Service Needs to be configurable #7955

Open rahulpoddar-fyndna opened 1 month ago

rahulpoddar-fyndna commented 1 month ago

The frequency at which placement service detects failure of the pod hosting actor services needs to be minimum so that failures of http request are minimized. If an actor service receives multiple concurrent requests and if the actor service fails during this time, the placement service detects this failure and instantiates the actors on one of the existing pods. As of now the placement services polls/detects in every 2 seconds. If we have this frequency of detecting actor pod configurable, then we can tune minimise the http requests failures. The retry resiliency can be configured to keep it close to polling interval of placement service.

elena-kolevska commented 1 month ago

/assign

cicoyle commented 21 hours ago

docs are needed for this too plz

elena-kolevska commented 20 hours ago

I think we're not documenting the placement cli arguments in docs, only daprd. We do document the values in both the helm values readme and the cli itself.