TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
System Info
Here is the environment in which tensorrt_llm-0.14.0.dev2024092401-cp310-cp310-linux_aarch64.whl was built.
Now I've got a problem: pynvml does not fully work on Jetson. If anyone has a tensorrt_llm profiler.py that works on Jetson, or is more conversant with it and could modify it, I'll be able to test this. If not, I'll try to make a version of profiler.py that works without pynvml.
If you would like a copy of the wheel, let me know where to upload it.
It would not build when I set --extra-cmake-vars "ENABLE_MULTI_DEVICE=0" because of undefined references to ompi_mpi_* and MPI_* symbols when linking libtensorrt_llm.so.
I then tried with the following command line and OS configuration.
export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$LD_LIBRARY_PATH
python3 ./scripts/build_wheel.py --cuda_architectures "native" \
    --nccl_root "/usr/lib/aarch64-linux-gnu" \
    --extra-cmake-vars "USE_CUDNN=1;USE_CUSPARSELT=1" \
    --python_bindings
Attachments: tensorrt-llm-compilation.txt, python_packages.txt
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Successful build_wheel.py run on an NVIDIA Jetson AGX Orin Developer Kit.
pynvml doesn't work on NVIDIA Jetson, so importing tensorrt_llm fails.
Expected behavior
profiler.py will need to be modified to get tensorrt_llm operational on Jetson.
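One possible direction, sketched below under assumptions: guard the pynvml calls and fall back to /proc/meminfo, which on Jetson reflects the unified memory shared between the CPU and the integrated GPU. The function name device_get_memory_info is my own invention, not the actual profiler.py API; the real helpers would need to be adapted along these lines.

```python
def _meminfo_bytes():
    """Parse /proc/meminfo (values reported in kB) into a dict of bytes."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, _, rest = line.partition(":")
            info[key] = int(rest.split()[0]) * 1024
    return info


def device_get_memory_info(index=0):
    """Return (total, free) device memory in bytes.

    Tries pynvml first; if it is missing or fails to initialize (as on
    Jetson), falls back to /proc/meminfo, since Jetson's integrated GPU
    shares system RAM.
    """
    try:
        import pynvml
        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            return mem.total, mem.free
        finally:
            pynvml.nvmlShutdown()
    except Exception:
        info = _meminfo_bytes()
        return info["MemTotal"], info.get("MemAvailable", info["MemFree"])
```

This keeps the pynvml path intact for discrete-GPU systems while letting the import succeed on Jetson.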
actual behavior
>>> import tensorrt_llm
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/__init__.py", line 35, in <module>
    import tensorrt_llm.runtime as runtime
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/runtime/__init__.py", line 22, in <module>
    from .model_runner import ModelRunner
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/runtime/model_runner.py", line 30, in <module>
    from ..builder import Engine, EngineConfig, get_engine_version
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/builder.py", line 30, in <module>
    from .auto_parallel import auto_parallel
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/auto_parallel/__init__.py", line 1, in <module>
    from .auto_parallel import auto_parallel
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/auto_parallel/auto_parallel.py", line 14, in <module>
    from .config import AutoParallelConfig
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/auto_parallel/config.py", line 9, in <module>
    from .cluster_info import ClusterInfo, cluster_infos
  File "/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/auto_parallel/cluster_info.py", line 12, in <module>
    from tensorrt_llm.profiler import PyNVMLContext, _device_get_memory_info_fn
ImportError: cannot import name '_device_get_memory_info_fn' from 'tensorrt_llm.profiler' (/home/scott/.local/lib/python3.10/site-packages/tensorrt_llm/profiler.py)
additional notes
>>> import pynvml
>>> print(pynvml.__version__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pynvml' has no attribute '__version__'