ZJU-lishuang opened 2 months ago
We have the same issue; a 54 GB docker container is not great.
I'm also having the same issue
I am also seeing this same issue. @byshiue and/or @schetlur-nv - any updates on why this happens?
Can you try the instructions in https://github.com/triton-inference-server/tensorrtllm_backend?tab=readme-ov-file#option-1-build-via-the-buildpy-script-in-server-repo? We are trying to bring the two closer together, but right now they differ quite a bit. Using `build.py` should result in a smaller image because it pulls in fewer dependencies.
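A minimal sketch of what that invocation might look like. The `--backend=<name>:<repo-tag>` flag syntax and `--enable-gpu` come from the Triton server repo's `build.py`; treating the tag as an arbitrary git branch (so `main` can be passed instead of `rel`) is an assumption — verify against `./build.py --help` in your checkout before running:

```shell
#!/bin/sh
# Hypothetical sketch: pin the TRT-LLM backend to the main branch when building
# via the server repo's build.py. Flag syntax is an assumption taken from the
# linked README; confirm with ./build.py --help in your server checkout.
TENSORRTLLM_BACKEND_REPO_TAG=main   # a git branch/tag, not a fixed keyword

BUILD_CMD="./build.py --enable-gpu --backend=tensorrtllm:${TENSORRTLLM_BACKEND_REPO_TAG}"
echo "$BUILD_CMD"   # inspect the command before running it from inside the server repo
```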
@schetlur-nv - when building with `build.py`, how do I specify that I want to use the `main` branch of the TRT-LLM repo? All it allows me to set is a flag, `TENSORRTLLM_BACKEND_REPO_TAG=rel`; does `rel` mean `main` here?
I have built the TensorRT-LLM Backend via docker, but the resulting docker image is much bigger than the image on NGC. How can I decrease the size? This is the `pip list`.
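Whichever build path is used, it can help to first see which layers dominate the image before deciding what to trim. A small diagnostic sketch — `tritonserver_build` is a placeholder image tag, and the `printf` line stands in for captured `docker history` output with hypothetical sizes so the sorting step is runnable on its own:

```shell
#!/bin/sh
# Rank image layers by size to find what to trim. Against a real image you would run:
#   docker history --format '{{.Size}}\t{{.CreatedBy}}' tritonserver_build | sort -hr | head
# (tritonserver_build is a placeholder tag). Below, a captured sample with
# hypothetical sizes stands in for the docker output so the pipeline runs as-is:
printf '12G\tRUN pip install tensorrt_llm\n300M\tCOPY scripts /opt\n20G\tRUN apt-get install build deps\n' |
  sort -hr | head -3   # largest layers first (needs GNU sort for -h)
```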