TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
trtllm-bench --model models/Llama-2-7b-hf throughput --dataset experiments/synthetic_128_128.txt --engine_dir models/Llama2-7b-trt-engine
[TensorRT-LLM] TensorRT-LLM version: 0.15.0.dev2024111200
[11/21/2024-10:26:10] [TRT-LLM] [I] Preparing to run throughput benchmark...
[11/21/2024-10:26:10] [TRT-LLM] [I] Setting up benchmarker and infrastructure.
[11/21/2024-10:26:10] [TRT-LLM] [I] Initializing Throughput Benchmark. [rate=-1 req/s]
[11/21/2024-10:26:10] [TRT-LLM] [I] Ready to start benchmark.
[11/21/2024-10:26:10] [TRT-LLM] [I] Initializing Executor.
[TensorRT-LLM][INFO] Engine version 0.15.0.dev2024111200 found in the config file, assuming engine(s) built by new builder API.
WARNING: A deprecated MPI_Info key was used.
Deprecated key: env
Corrected key: PMIX_ENVAR
We have updated this for you and will proceed. However, this will be treated
as an error in a future release. Please update your application.
[worker:15806] PRTE ERROR: Bad parameter in file base/odls_base_default_fns.c at line 962
[11/21/2024-10:26:11] [TRT-LLM] [I] Benchmark Shutdown called!
[11/21/2024-10:26:11] [TRT-LLM] [I] Executor shutdown.
Traceback (most recent call last):
File "/home/lilin/anaconda3/envs/tensorrt_llm/bin/trtllm-bench", line 8, in <module>
sys.exit(main())
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, ctx.params)
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/click/decorators.py", line 45, in new_func
return f(get_current_context().obj, *args, **kwargs)
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bench/benchmark/throughput.py", line 182, in throughput_command
benchmark.start_benchmark()
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bench/benchmark/throughput.py", line 350, in start_benchmark
self.executor = ExecutorManager(self.runtime_config,
File "/home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bench/benchmark/throughput.py", line 212, in __init__
self.executor = trtllm.Executor(
RuntimeError: [TensorRT-LLM][ERROR] Assertion failed: mComm != MPI_COMM_NULL (/home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/common/mpiUtils.cpp:450)
1 0x7fc2a43e3387 tensorrt_llm::common::throwRuntimeError(char const*, int, std::string const&) + 82
2 0x7fc2a43e3dfb /home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/libs/libtensorrt_llm.so(+0x786dfb) [0x7fc2a43e3dfb]
3 0x7fc2a66c794f tensorrt_llm::executor::Executor::Impl::initializeOrchestrator(int, int, tensorrt_llm::executor::ExecutorConfig const&, tensorrt_llm::executor::ParallelConfig, tensorrt_llm::executor::ModelType, std::filesystem::path const&) + 463
4 0x7fc2a66c8d7c tensorrt_llm::executor::Executor::Impl::initializeCommAndWorkers(int, int, tensorrt_llm::executor::ExecutorConfig const&, std::optional, std::optional const&, std::optional const&, std::optional const&) + 1164
5 0x7fc2a66cad22 tensorrt_llm::executor::Executor::Impl::Impl(std::filesystem::path const&, std::optional const&, tensorrt_llm::executor::ModelType, tensorrt_llm::executor::ExecutorConfig const&) + 1698
6 0x7fc2a66b4540 tensorrt_llm::executor::Executor::Executor(std::filesystem::path const&, tensorrt_llm::executor::ModelType, tensorrt_llm::executor::ExecutorConfig const&) + 64
7 0x7fc3133f63cd /home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x1153cd) [0x7fc3133f63cd]
8 0x7fc31336334d /home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x8234d) [0x7fc31336334d]
9 0x55c5163d9c46 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x13bc46) [0x55c5163d9c46]
10 0x55c5163d2f73 _PyObject_MakeTpCall + 723
11 0x55c5163e57c6 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1477c6) [0x55c5163e57c6]
12 0x55c5163e61f9 PyVectorcall_Call + 201
13 0x55c5163e3534 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x145534) [0x55c5163e3534]
14 0x55c5163d327b /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x13527b) [0x55c5163d327b]
15 0x7fc313360f0b /home/lilin/anaconda3/envs/tensorrt_llm/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x7ff0b) [0x7fc313360f0b]
16 0x55c5163d2f73 _PyObject_MakeTpCall + 723
17 0x55c5163cf5c3 _PyEval_EvalFrameDefault + 22083
18 0x55c5163d2400 _PyObject_FastCallDictTstate + 208
19 0x55c5163e3009 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x145009) [0x55c5163e3009]
20 0x55c5163d2f8b _PyObject_MakeTpCall + 747
21 0x55c5163cebae _PyEval_EvalFrameDefault + 19502
22 0x55c5163da0cc _PyFunction_Vectorcall + 108
23 0x55c5163ca680 _PyEval_EvalFrameDefault + 1792
24 0x55c5163da0cc _PyFunction_Vectorcall + 108
25 0x55c5163e5e7c PyObject_Call + 188
26 0x55c5163ccce2 _PyEval_EvalFrameDefault + 11618
27 0x55c5163da0cc _PyFunction_Vectorcall + 108
28 0x55c5163e5e7c PyObject_Call + 188
29 0x55c5163ccce2 _PyEval_EvalFrameDefault + 11618
30 0x55c5163e54e2 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1474e2) [0x55c5163e54e2]
31 0x55c5163e5e7c PyObject_Call + 188
32 0x55c5163ccce2 _PyEval_EvalFrameDefault + 11618
33 0x55c5163da0cc _PyFunction_Vectorcall + 108
34 0x55c5163ca680 _PyEval_EvalFrameDefault + 1792
35 0x55c5163da0cc _PyFunction_Vectorcall + 108
36 0x55c5163ca680 _PyEval_EvalFrameDefault + 1792
37 0x55c5163e5764 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x147764) [0x55c5163e5764]
38 0x55c5163ccce2 _PyEval_EvalFrameDefault + 11618
39 0x55c5163d2400 _PyObject_FastCallDictTstate + 208
40 0x55c5163e3ae9 _PyObject_Call_Prepend + 105
41 0x55c5164a3b89 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x205b89) [0x55c5164a3b89]
42 0x55c5163d2f73 _PyObject_MakeTpCall + 723
43 0x55c5163cebae _PyEval_EvalFrameDefault + 19502
44 0x55c51646a80c /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1cc80c) [0x55c51646a80c]
45 0x55c51646a757 PyEval_EvalCode + 135
46 0x55c51649ab1a /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1fcb1a) [0x55c51649ab1a]
47 0x55c516495fa3 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1f7fa3) [0x55c516495fa3]
48 0x55c5163352c2 /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x972c2) [0x55c5163352c2]
49 0x55c5164907dd _PyRun_SimpleFileObject + 445
50 0x55c516490374 _PyRun_AnyFileObject + 68
51 0x55c51648d6db Py_RunMain + 795
52 0x55c51645de97 Py_BytesMain + 55
53 0x7fc538fed555 __libc_start_main + 245
54 0x55c51645ddae /home/lilin/anaconda3/envs/tensorrt_llm/bin/python3.10(+0x1bfdae) [0x55c51645ddae]