-
### System Info
- CPU: x86_64 (Ubuntu 20.04.6 LTS)
- GPU: H100 * 8
- CUDA: 12.5.1
- TensorRT-LLM: The latest dev commit, 385626572df16175dd327fa785e4434cb7866a64
- TensorRT: 10.6.0
- Python: 3.10.14
…
-
My [version](https://github.com/NVIDIA/TensorRT-LLM/tree/31ac30e928a2db795799fdcab6be446bfa3a3998)
[Assertion](https://github.com/NVIDIA/TensorRT-LLM/blob/31ac30e928a2db795799fdcab6be446bfa3a3998/cpp…
-
### System Info
TensorRT-LLM v0.14.0
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
### System Info
Both RTX 2070 and RTX A6000
### Reproduction
I'm using the latest main (535c9cc6730f5ac999e4b1cb621402b58138f819)
I'm using the `make wheel` image, from main.
I built the 3B model…
-
Hi, following your code, I built and deployed the whisper-large-v3-turbo model and got the error below. I see that 24.09-trtllm-python-py3 supports tensorrt-llm 0.13.0. Did the test succeed on your side?
```
Traceback (most recent call last):
File "/workspace/TensorRT-LLM/exam…
-
# trtllm-bench --model models/Llama-2-7b-hf throughput --dataset experiments/synthetic_128_128.txt --engine_dir models/Llama2-7b-trt-engine
[TensorRT-LLM] TensorRT-LLM version: 0.15.0.dev2024111200
…
-
### System Info
- CPU: x86_64, Intel(R) Xeon(R) Platinum 8470
- CPU/Host memory size: 1TB
- GPU:
4xH100 96GB
- Libraries
TensorRT-LLM: main, 0.15.0 (commit: b7868dd1bd1186840e3755b97ea3d3a73dd…
-
After installation, I get an error while importing `tensorrt_llm`:
`ImportError: /opt/conda/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106…
-
[version](https://github.com/NVIDIA/TensorRT-LLM/tree/31ac30e928a2db795799fdcab6be446bfa3a3998)
When I build the model with paged_context_fmha = true and max_num_tokens = 4096, chunked context is enabled…
-
### System Info
System:
- CPU Architecture: x86_64
- GPU: NVIDIA H100 - 80GB - CUDA 12.4
- TensorRT-LLM: main branch, commit 535c9cc6730f5ac999e4b1cb621402b58138f819
- Operating System: Ubuntu 22.04…