Open nmq45698 opened 2 months ago

Hi there! I'm currently trying to deploy some small models (say GPT-2) with SmoothQuant on a Jetson AGX Orin. However, searching for the image "tensorrt_llm:35.2.1" on Docker Hub returns no results. I noticed your comments on #564 saying that TensorRT-LLM support would be coming soon. Are there any updates on a Docker image that supports "docker run" for tensorrt_llm on Orin? thx :P
@nmq45698 still coming soon but getting closer! for now I would just run SmoothQuant inference through PyTorch like their repo shows, or use AWQ TinyChat, or MLC/TVM.
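For reference, a minimal sketch of that PyTorch route, assuming the mit-han-lab/smoothquant repo is installed; the import path, the smooth_lm call, and the pre-computed activation-scales file all follow that repo's examples and should be treated as assumptions:

import torch
from transformers import AutoTokenizer, OPTForCausalLM
from smoothquant.smooth import smooth_lm  # assumed import path from the repo

# Load an FP16 model plus the pre-computed activation scales that the
# repo's examples ship (the act_scales path is an assumption).
model = OPTForCausalLM.from_pretrained("facebook/opt-1.3b", torch_dtype=torch.float16).cuda()
act_scales = torch.load("act_scales/opt-1.3b.pt")

# Migrate quantization difficulty from activations to weights (alpha=0.5
# is the paper's default). The repo's fake_quant utilities can then swap
# nn.Linear layers for simulated W8A8 ones before generation.
smooth_lm(model, act_scales, alpha=0.5)

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))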
I've updated to the latest version of tensorrt_llm: https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.12.0
Thanks for the reminder! :P Actually, I had already noticed that TensorRT-LLM repo, and I was able to build engines successfully on an amd64 GPU.
For the Jetson AGX Orin, though, I've tried some Docker images pulled from @dusty-nv (dustynv/... r35.x.x), which don't seem to support TensorRT 10.3 because of the GLIBC version on Ubuntu 20.04: the kINT64 and kBF16 data types in TensorRT-LLM could not be compiled.
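As a quick sanity check for this, a minimal sketch; it assumes the TensorRT Python API, where the INT64 and BF16 data types only exist in recent releases:

import tensorrt as trt

# On the r35 images (TensorRT 8.x) these attributes don't exist, which
# is consistent with kINT64/kBF16 failing to compile in TensorRT-LLM.
print(trt.__version__)
print("INT64:", hasattr(trt.DataType, "INT64"))
print("BF16:", hasattr(trt.DataType, "BF16"))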
Then I tried Docker images pulled from dustynv/... r36.x.x (say nano_llm), which are based on Ubuntu 22.04, and TensorRT 10.3 installed successfully.
However, another error about libnvinfer.so occurred when running /scripts/build_wheels.py, as well as on import tensorrt in Python:
(Pdb) import tensorrt
*** ImportError: /lib/aarch64-linux-gnu/libnvinfer.so.10: undefined symbol: _ZN5nvdla8IProfile36setCanGenerateDetailedLayerwiseStatsEb
Have you ever encountered that when building TensorRT-LLM on Orin? ^v^
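One way to narrow this down is to check whether the host libraries that the NVIDIA container runtime mounts in actually export the missing symbol. A hedged sketch: the candidate paths are guesses at typical Jetson layouts, and the idea that a DLA compiler library should provide an nvdla:: symbol is only an assumption based on the mangled name:

import ctypes

# libnvinfer.so.10 leaves this nvdla symbol undefined, so it presumably
# expects it from the DLA libraries mounted in from the host BSP. If no
# candidate exports it, the host JetPack is likely older than what
# TensorRT 10.3 expects.
SYM = "_ZN5nvdla8IProfile36setCanGenerateDetailedLayerwiseStatsEb"
CANDIDATES = [
    "/usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so",
    "/usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so",
]
for path in CANDIDATES:
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        print(f"{path}: not found")
        continue
    print(f"{path}: {'exports' if hasattr(lib, SYM) else 'missing'} {SYM}")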
Do you use JetPack 5?
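If it helps to check, L4T r35.x corresponds to JetPack 5 and r36.x to JetPack 6; a minimal sketch, assuming the standard Jetson release file is present:

# /etc/nv_tegra_release is the standard L4T release marker on Jetson;
# it reads "# R35 ..." under JetPack 5 and "# R36 ..." under JetPack 6.
print(open("/etc/nv_tegra_release").read())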