xiejibing closed this issue 1 month ago.
Hello @xiejibing,
Thank you for bringing this to our attention. The 24.08-vllm-python-py3 container does not support ensemble models. Support for ensemble models is being added to future releases of the vLLM container and is tentatively planned for the 24.10 release.
A temporary fix for now could be one of the following:
1. Add vllm_backend to the container. Currently, the 24.08-py3 container supports ensemble models, so you could add the vLLM backend to that container by following these instructions and use it for your ensemble model.
2. Add --backend=ensemble to the build arguments to enable ensemble model support in the vLLM container.
Thank you for your suggestion! We will use the base container and install vLLM first.
I'll close this issue, since the functionality has been merged and is targeting the 24.10 release.
Description "Poll failed for model directory 'ensemble': unexpected platform type 'ensemble' for ensemble"
Triton Information tritonserver:24.08 docker image from https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-24-08.html#rel-24-08
Are you using the Triton container or did you build it yourself?
I am using the officially provided container.
To Reproduce Steps to reproduce the behavior.
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
models:
config file:
preprocessor
main model
triton server logs
Expected behavior A clear and concise description of what you expected to happen.
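For anyone checking whether a model repository will hit the poll error above on a container without ensemble support, here is a minimal sketch that lists the model directories whose config.pbtxt declares platform "ensemble". It is a plain text scan, not a real protobuf parser, and the helper names (find_ensemble_models, build_demo_repo), the directory names, and the demo config contents are hypothetical, not taken from the original report.

```python
import re
import tempfile
from pathlib import Path

# Matches a line such as: platform: "ensemble"
# Assumption: the field sits on its own line, as in typical config.pbtxt files.
PLATFORM_RE = re.compile(r'^[ \t]*platform[ \t]*:[ \t]*"ensemble"', re.MULTILINE)


def find_ensemble_models(repo_root):
    """Return sorted names of model directories that declare platform "ensemble"."""
    names = []
    for config in Path(repo_root).glob("*/config.pbtxt"):
        if PLATFORM_RE.search(config.read_text(encoding="utf-8")):
            names.append(config.parent.name)
    return sorted(names)


def build_demo_repo(repo_root):
    """Create a tiny, hypothetical model repository for demonstration only."""
    demo_configs = {
        "ensemble": 'name: "ensemble"\nplatform: "ensemble"\n',
        "preprocessor": 'name: "preprocessor"\nbackend: "python"\n',
        "main_model": 'name: "main_model"\nbackend: "vllm"\n',
    }
    for model_name, text in demo_configs.items():
        model_dir = Path(repo_root) / model_name
        model_dir.mkdir(parents=True, exist_ok=True)
        (model_dir / "config.pbtxt").write_text(text, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_repo(tmp)
        print(find_ensemble_models(tmp))
```

Pointing find_ensemble_models at a real model repository path shows which models need a container with ensemble support, such as 24.08-py3 with the vLLM backend added, as suggested in the reply above.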