Open pseudotensor opened 2 months ago
🚀 The feature, motivation and pitch
A small change that would win back a lot of reliability.
We see vLLM crash or hang in various ways, e.g.:
https://github.com/vllm-project/vllm/issues/4108 https://github.com/vllm-project/vllm/issues/4344
Managing that manually is a hassle.
The vLLM team could easily add a HEALTHCHECK line to the Dockerfile so that tools like autoheal can detect an unhealthy container and restart it.
https://hub.docker.com/r/willfarrell/autoheal/ https://docs.docker.com/reference/dockerfile/#healthcheck
It would look like:
HEALTHCHECK --interval=5m --timeout=10s CMD curl -f http://localhost/health || exit 1
This would let one use the autoheal image to manage the vLLM containers.
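For reference, a slightly fuller sketch of what such a line could look like. Port 8000 (the default for vLLM's OpenAI-compatible server), the presence of `curl` in the image, and the `--start-period` and `--retries` values are assumptions to adjust, not settings taken from the vLLM Dockerfile.

```dockerfile
# Sketch only: would go in the image stage that runs the API server.
# Assumption: the server listens on port 8000 and curl is installed in the image.
# Assumption: 10 minutes is enough for the model to load; failures during
# --start-period do not count toward the retry limit.
HEALTHCHECK --interval=5m --timeout=10s --start-period=10m --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```

With this in place, `docker ps` reports the container as `healthy` or `unhealthy`. As far as I understand its README, autoheal needs the Docker socket mounted and by default only watches containers labeled `autoheal=true`, so the vLLM container would also need that label (or autoheal would need to be configured to watch all containers).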
Alternatives
Manual labor
Additional context
No response
@pseudotensor
Can you open a PR?