I tried `make -C docker build` and the Docker image built successfully, but I cannot run the Llama demo because of an incompatibility between my CUDA driver and CUDA 12.5. I already commented on this in https://github.com/NVIDIA/TensorRT-LLM/issues/601#issuecomment-2339467644, but nobody has replied.
System Info
GPU: NVIDIA RTX 3080
I followed the "Installation" guide in the quick start step by step: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation/linux.md
All operations were done inside Docker, which the guide suggests as optional.
I finished the "Check installation" step: `python3 -c "import tensorrt_llm"` returned the correct tensorrt_llm version.
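For reference, this is the check I ran; the explicit `__version__` print is my addition for clarity, since the plain import already logs the version:

```bash
# Installation check from the quick start guide
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
```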
However, after finishing all the installation steps and starting the section "Compile the Model into a TensorRT Engine", I could not run `make -C docker release_run LOCAL_USER=1`, because I was already inside the `nvidia/cuda:12.4.1-devel-ubuntu22.04` container. So I just ran the remaining steps directly to convert the Llama weights and build the engine, roughly as sketched below.
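Approximately the commands I ran, following `examples/llama` in the repo; the model paths here are placeholders for my local checkpoint:

```bash
# Convert the Hugging Face Llama checkpoint to TensorRT-LLM format
python3 examples/llama/convert_checkpoint.py \
    --model_dir ./llama-hf \
    --output_dir ./llama-ckpt \
    --dtype float16

# Build the TensorRT engine from the converted checkpoint
trtllm-build \
    --checkpoint_dir ./llama-ckpt \
    --output_dir ./llama-engine \
    --gemm_plugin float16
```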
Converting the weights succeeded, but `trtllm-build` failed because torch could not find a CUDA device.
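A quick way to reproduce the failure outside of `trtllm-build`, using only standard torch APIs:

```bash
# Check whether torch inside the container can see the GPU at all,
# and which CUDA runtime the pip-installed torch wheel was built against
python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```

In my case the device is reported as not available.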
I checked the CUDA driver status with `nvidia-smi`, and it returned a normal result with driver 535.183.06. I also found that two different CUDA versions were installed: maybe one came with the Docker image, while the other was pulled in as a pip dependency of tensorrt_llm.
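To show the two versions side by side, these are the checks I mean; the `grep` filter is just my guess at the relevant package names:

```bash
# Driver-side view: reports driver 535.183.06 and the max CUDA it supports
nvidia-smi

# Toolkit version baked into the nvidia/cuda:12.4.1 image
nvcc --version

# CUDA runtime packages pulled in by pip as tensorrt_llm/torch dependencies
pip3 list | grep -i -E "torch|nvidia-cuda"
```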
I don't know whether the problem is the `nvidia/cuda:12.4.1-devel-ubuntu22.04` image itself, or whether I should have built a separate container on the host with `make -C docker release_run LOCAL_USER=1` (sketched below).
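For clarity, this is what I understand the intended host-side flow to be; I have not verified it, and the `release_build` target is my reading of `docker/Makefile`:

```bash
# Run on the host, outside any container
make -C docker release_build                 # build the TensorRT-LLM release image
make -C docker release_run LOCAL_USER=1      # start it mapped to the local user
```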
The quick start is really important for someone new to TensorRT-LLM. It took me several hours to run through this guide; it's irresponsible not to double-check all the related steps when upgrading the repository.
Who can help?
Whoever wrote the quick start guide.
Information

Tasks

An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
Start from the `nvidia/cuda:12.4.1-devel-ubuntu22.04` container, install tensorrt_llm via pip per the Linux installation guide, convert the Llama weights, then run `trtllm-build`.
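Approximately how I started the container; the exact flags are typical usage, written from memory:

```bash
# GPU passthrough via --gpus all requires the NVIDIA Container Toolkit
docker run --rm -it --gpus all nvidia/cuda:12.4.1-devel-ubuntu22.04 bash
```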
Expected behavior

`trtllm-build` finds the CUDA device and builds the Llama engine.

Actual behavior

`trtllm-build` fails because torch cannot find a CUDA device, even though `nvidia-smi` reports driver 535.183.06 as healthy.
Additional notes

Nope.