🚀 The feature
This feature would add an API so that a Kubernetes readiness probe can be used to know when to start sending traffic. /ready will return 200 when all the models specified in config.properties have at least one backend worker ready to receive traffic. This would make it simpler for customers to use TorchServe in a Kubernetes deployment with a multi-model-endpoint scenario.
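With such an endpoint, a Kubernetes Deployment could gate traffic on it directly. A minimal readiness-probe sketch, assuming TorchServe's default inference port 8080 and the proposed /ready path; the timing values are placeholders, not part of the proposal:

```yaml
# Hypothetical readiness probe against the proposed /ready endpoint.
# Port 8080 (TorchServe's default inference port) and the timing
# values below are assumptions.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10   # give the frontend time to start
  periodSeconds: 5          # re-check until all workers are up
  failureThreshold: 12      # allow roughly a minute for model loading
```

Kubernetes would only route Service traffic to the pod once the probe returns 200, which is exactly the semantics the proposal asks /ready to provide.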
Motivation, pitch
For a multi-model-endpoint use case with Kubernetes, consider that config.properties has the following models. Today, one can use the /ping API to know when TorchServe is up. But this is for the frontend only; the backend workers for the models will take additional time to come up.

Alternatives
One can write a script to use the /describe API for each of the models to track when each has at least one backend worker, and then declare the pod ready.

Additional context
No response
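To make the alternative above concrete, here is a rough sketch of the readiness logic such a script could apply to the describe responses. The response shape (a list of model versions, each carrying a "workers" array whose entries have a "status" field) and the model names are assumptions for illustration, not something this proposal specifies:

```python
# Hypothetical readiness check over DescribeModel-style responses.
# The response shape and the model names below are assumptions;
# verify them against the TorchServe version actually deployed.

def model_is_ready(describe_response: list) -> bool:
    """True if any worker of any version of the model reports READY."""
    return any(
        worker.get("status") == "READY"
        for version in describe_response
        for worker in version.get("workers", [])
    )

def pod_is_ready(responses_by_model: dict) -> bool:
    """True only when every registered model has at least one READY worker."""
    return all(model_is_ready(r) for r in responses_by_model.values())

# Example payloads (hypothetical model names and worker ids):
responses = {
    "resnet18": [{"modelName": "resnet18",
                  "workers": [{"id": "9000", "status": "READY"}]}],
    "bert":     [{"modelName": "bert",
                  "workers": [{"id": "9001", "status": "LOADING"}]}],
}
print(pod_is_ready(responses))  # prints False: bert has no READY worker yet
```

Once every model reports a READY worker, the script would mark the pod ready (for example via a file checked by an exec probe); the proposed /ready endpoint would fold this polling logic into the frontend itself.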