-
[ 3%] Linking CXX shared library ../out/libnvinfer_plugin.so
/usr/bin/ld: cannot find -lcudadevrt
/usr/bin/ld: cannot find -lcudart_static
collect2: error: ld returned 1 exit status
plugin/CMakeFiles…
-
Hi, I am a little stuck on how to use TensorRT to speed up GroundingDINO inference. GroundingDINO takes both an image and a text prompt, and I am a bit lost on how to convert the text prompt to tensor…
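In case a sketch helps: GroundingDINO's text branch is BERT-based, so one common approach is to tokenize the prompt on the host and feed the resulting integer tensors to the engine. The input names, prompt, and max length below are assumptions for illustration, not something taken from this issue.

```py
# Minimal sketch: turn a GroundingDINO-style text prompt into input tensors,
# assuming the exported model expects BERT-style input_ids / attention_mask.
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
prompt = "a cat . a remote control ."  # example phrase list separated by " . "

enc = tokenizer(
    prompt,
    return_tensors="np",
    padding="max_length",
    truncation=True,
    max_length=256,  # assumed max text length
)

input_ids = enc["input_ids"].astype(np.int64)            # shape (1, 256)
attention_mask = enc["attention_mask"].astype(np.int64)
# These arrays would then be bound to the corresponding engine inputs
# alongside the preprocessed image tensor.
```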
-
### System Info
Hi,
I'm having trouble reproducing NVIDIA's claimed numbers in the table here: https://nvidia.github.io/TensorRT-LLM/performance/perf-overview.html#throughput-measurements
System Im…
-
[stdbuf-1] [NVINFER LOG]: 1: [runtime.cpp::parsePlan::314] Error Code 1: Serialization (Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed.)
[stdbuf-1] Failed to deserialize …
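For reference, that `magicTag` assertion usually fires when the serialized plan was produced by a different TensorRT version (or an incompatible build) than the runtime trying to load it. A minimal sketch of checking the runtime version and deserializing the plan, with a hypothetical engine path:

```py
import tensorrt as trt

print("Runtime TensorRT version:", trt.__version__)

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# "model.plan" is a hypothetical path; point this at the engine that fails.
with open("model.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

if engine is None:
    # Typical fix: rebuild the plan with the same TensorRT version (and GPU)
    # as the runtime that will deserialize it.
    raise RuntimeError("Plan/runtime mismatch; rebuild the engine")
```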
-
### System Info
Driver Version: 535.154.05 CUDA Version: 12.5
NVIDIA A100-PCIE-40GB x 8
tensorrt 10.2.0
tensorrt_llm 0.12.0.dev2024072301
triton 2.3.1
…
-
Is it possible to add the code-copy widget that you already have on https://nvidia.github.io/TensorRT-Model-Optimizer/ to https://nvidia.github.io/TensorRT-LLM/ as well?
For example, if you go to https://nvidi…
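If the TensorRT-LLM docs are built with Sphinx like the Model-Optimizer site appears to be (that is an assumption on my part), the widget there looks like the `sphinx-copybutton` extension, which would be a small conf.py change along these lines:

```py
# docs/source/conf.py (path is an assumption about the repo layout)
extensions = [
    # ... existing extensions ...
    "sphinx_copybutton",
]

# Optional: strip ">>> " and "$ " prompts so copied snippets paste cleanly.
copybutton_prompt_text = r">>> |\$ "
copybutton_prompt_is_regexp = True
```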
-
### System Info
AWS p5 (4 x 80GB H100 GPUs)
TensorRT-LLM v0.11.0
### Who can help?
@byshiue @Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
…
-
**Is your feature request related to a problem? Please describe.**
I am aware that PyTriton already has an example for using PyTriton with tensorrt_llm. But I noticed that the example only support s…
-
### 🐛 Describe the bug
The Torch-TensorRT model compiles successfully and can be saved as an ExportedProgram, but it fails to load.
Here is the full error log:
```py
WARNING:py.warnings:/home/dperi/…
```
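For context, a minimal save/load round trip with the dynamo frontend looks roughly like the sketch below; the model, input shape, and file name are placeholders rather than the ones from this report.

```py
import torch
import torch_tensorrt

model = MyModel().eval().cuda()                 # placeholder model
inputs = [torch.randn(1, 3, 224, 224).cuda()]   # placeholder input

# Compile with the dynamo frontend and save as an ExportedProgram.
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt_model.ep", inputs=inputs)

# Reload and run; this is the step that reportedly fails.
reloaded = torch.export.load("trt_model.ep").module()
reloaded(*inputs)
```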
-
Greetings, thank you for your work.
After successfully running the nvidia default variant, I tried to run the TensorRT variant (`default-nvidia-tensorrtllm`), but it does not start.
No modifications t…