-
### System Info
- Arch: x86_64
- RAM: 30 GB
- GPU: A10G, VRAM: 23 GB
- Lib: TensorRT-LLM v0.9.0
- Container Used: nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3
- Model used: Mistral 7B
### …
-
### Describe the issue
I tried building with CUDA 12.5 and TensorRT 10.0 on Windows and saw errors like `error C4996: 'nvinfer1::IPluginV2': was declared deprecated` during the build.
### Urgency
None
### T…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar feature requests.
### Description
Hey I found out that YO…
-
### Checklist
- [X] I've read the [contribution guidelines](https://github.com/autowarefoundation/autoware/blob/main/CONTRIBUTING.md).
- [X] I've searched other issues and no duplicate issues were…
-
Hi,
I'm having an issue when trying to convert starcoder2-3b with SmoothQuant to TensorRT-LLM.
I'm running on an A100 40GB.
This is my command:
`python tensorrt_llm/examples/gpt/convert_checkpoint.py --mod…
-
Hi, I am a little stuck on how to use TensorRT to speed up GroundingDINO inference. GroundingDINO takes in both an image and a text prompt, and I am a bit lost on how to convert the text prompt to tensor…
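One way to think about this: GroundingDINO's text branch is a BERT-style encoder, so the engine does not consume the raw string; the prompt is tokenized on the host into integer ID tensors (plus an attention mask) and those are fed as engine inputs. A minimal sketch of the host-side shaping, assuming a static text sequence length baked into the engine; the toy vocabulary and `MAX_TEXT_LEN` here are stand-ins for a real tokenizer (e.g. HuggingFace's `BertTokenizer`) and the actual profile length:

```python
import numpy as np

# Toy vocabulary standing in for a real BERT tokenizer's vocab (assumption).
TOY_VOCAB = {"[CLS]": 101, "[SEP]": 102, "cat": 4937, "dog": 3899, ".": 1012}
PAD_ID = 0
MAX_TEXT_LEN = 256  # assumed static sequence length of the exported engine


def encode_prompt(prompt: str) -> dict[str, np.ndarray]:
    """Turn a text prompt into the fixed-shape (1, MAX_TEXT_LEN) integer
    tensors a static-shape TensorRT engine would expect: token IDs padded
    with PAD_ID, plus a 0/1 attention mask marking the real tokens."""
    tokens = ["[CLS]"] + prompt.lower().split() + ["[SEP]"]
    ids = [TOY_VOCAB[t] for t in tokens]
    n = len(ids)
    input_ids = np.full((1, MAX_TEXT_LEN), PAD_ID, dtype=np.int64)
    input_ids[0, :n] = ids
    attention_mask = np.zeros((1, MAX_TEXT_LEN), dtype=np.int64)
    attention_mask[0, :n] = 1
    return {"input_ids": input_ids, "attention_mask": attention_mask}


feeds = encode_prompt("cat . dog .")
```

With a real tokenizer the same two arrays (and, depending on the export, token-type IDs) would be copied into the engine's text input bindings alongside the image tensor.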
-
### System Info
- CPU architecture: amd64
- Operating System: Windows 11
- Python version: 3.11.5
- TensorRT-LLM version: 0.10.0
- CUDA version: 12.5
- torch version: 2.2.0+cu121
### Who can help?
_…
-
https://github.com/NVIDIA/TensorRT-LLM/blob/9691e12bce7ae1c126c435a049eb516eb119486c/tensorrt_llm/hlapi/tokenizer.py#L63
-
I'm having trouble converting yolov9-e-converted.pt to a TensorRT model using export.py.
I've tested this on Windows 10, 11, and Ubuntu 22.04, using CUDA 12.4.1 and TensorRT 10.0.1.
I've enco…
-
I think I remember you mentioning somewhere that you were looking into supporting TensorRT models. Is that still in the backlog, or would implementing support for TensorRT require too much rework of the exist…