mtaron opened this issue 12 months ago
@mtaron Thanks for pointing that out. We already have a fix in the internal codebase, which will be included in the next update to GitHub.
Until then, please use the modification you shared as a workaround. Thanks!
Sorry, the change is not going to be included in today's update to GitHub. We realized that if the line is modified to `FROM base as final`, the latest version of TensorRT will not be included in the container, which leads to issues when using it. However, if we instead install TensorRT and PyTorch in the last stage of the Dockerfile, the container size is not reduced much compared to what we have now.
We'll keep the Dockerfile as-is for a while until we find a better solution. Does that make sense to you? @mtaron Thanks again for reporting the issue.
Ah, yeah - you'll always have two copies of TensorRT in the container (as far as image size is concerned) unless you either use a base image without it, or the base image already ships the version you want.
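The layer mechanics behind that comment can be sketched in a minimal Dockerfile (the image name and install command below are hypothetical placeholders, not the actual lines from the backend's Dockerfile):

```dockerfile
# Hypothetical sketch: image layers are additive, so installing a newer
# TensorRT in a later layer does not reclaim the space of the version
# already baked into the base image's layers.
FROM base-image-with-tensorrt AS final   # assumed base that already ships TensorRT version X
RUN pip install --upgrade tensorrt       # adds version Y in a new layer; X's layers remain in the image
```

The upgrade overwrites files in the container's filesystem view, but the earlier layers storing the old version are still part of the image, so the download size includes both.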
Hello,
I noticed that images for `23.10-trtllm-python-py3` are about 10 GB larger than other Triton server images. This is due to a bug in your Dockerfile here: https://github.com/triton-inference-server/tensorrtllm_backend/blob/47b609b670d6bb33a5ff113d98ad8a44d961c5c6/dockerfile/Dockerfile.trt_llm_backend#L51

You are accidentally including all the builder stages, defeating the point of having a multi-stage Dockerfile.
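The shape of the bug can be illustrated with a minimal sketch (the stage names and paths below are hypothetical, not the ones in the linked Dockerfile): if the final stage is built `FROM` a builder stage, every layer of that builder ends up in the published image; basing the final stage on the slim base and copying only the needed artifacts is what makes a multi-stage build pay off.

```dockerfile
# Hypothetical sketch of the multi-stage pattern and the bug.
FROM runtime-image AS base

FROM base AS builder
RUN ./build.sh                           # assumed build step producing /workspace/build

# Buggy: final inherits ALL of builder's layers, so build-time
# dependencies and intermediates ship in the published image.
# FROM builder AS final

# Fixed: start from the slim base and copy only the artifacts.
FROM base AS final
COPY --from=builder /workspace/build /opt/app
```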