InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Liveness & Readiness support #21

Open kerthcet opened 3 months ago

kerthcet commented 3 months ago

Add liveness and readiness probe support for inference services.
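A minimal sketch of what this could look like on the inference server container. The endpoint path and port here are assumptions for illustration; the actual values depend on the backend (e.g. vLLM serves a `/health` endpoint on its serving port):

```yaml
# Hypothetical Pod-spec fragment; paths, ports, and timings are illustrative.
containers:
  - name: inference-server
    livenessProbe:
      httpGet:
        path: /health   # assumed health endpoint
        port: 8000      # assumed serving port
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /health
        port: 8000
      periodSeconds: 5
```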

kerthcet commented 3 months ago

/kind feature
/milestone v0.1.0

pacoxu commented 1 month ago

Also StartupProbe? See https://github.com/triton-inference-server/server/pull/5257/.
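For context: a startup probe mainly buys time for slow model download/loading before the liveness probe starts counting failures. A hedged sketch (the endpoint, port, and budget are illustrative, not llmaz defaults):

```yaml
startupProbe:
  httpGet:
    path: /health        # assumed endpoint
    port: 8000           # assumed port
  periodSeconds: 10
  failureThreshold: 60   # up to 60 * 10s = 10 minutes for model loading
```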

kerthcet commented 1 month ago

Yes, something like that. The core reason is that we need to know the server's condition: is it ready or not? Maybe this can be part of the backendRuntime, since the health endpoints are specific to the backends themselves.