Describe the bug
I have launched a BentoML server with a vLLM backend on k8s.
Once the model is loaded (CodeLlama 13B Instruct in float16), the pod logs are the following:
[INFO] [cli] Starting production HTTP BentoServer from "_service:svc" listening
[WARNING] [api_server:llm-llama-service:1] Timed out waiting for runner to be ready
I don't understand why this warning appears, telling me that the runner is not ready, when the previous line says that the server is ready to listen.
Do you have any explanation for why this warning pops up after the logs report that the server is listening?
Many thanks for your help
To reproduce
No response
Logs
No response
Environment
Kubernetes (k8s), Python 3.10
System information (Optional)
No response