-
Opening this issue to collect information on whether there is a good reason to add TensorRT as a serving backend.
https://github.com/NVIDIA/TensorRT-LLM/issues/334
-
### Checklist
- [X] I've searched other issues and no duplicate issues were found.
- [X] I'm convinced that this is not my fault but a bug.
- [X] I've read the [contribution guidelines](https://g…
-
---
**Platform:** Ubuntu 20.04
**Version:** uv 0.4.16
---
### Description:
When attempting to install **Torch-TensorRT** using the `uv add` command, I encountered an error. The command I ra…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussi…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
_No …
-
### System Info
TensorRT-LLM v0.13.0
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
I tried using your Docker image and also built my own from scratch. The speed on an NVIDIA L4 is 40 ms/frame, which is ~25 fps (the same as plain torch.compile). The demo shows around 60% GPU load. Is there somethin…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
## Problem Description
When trying to use pipeline parallelism in TensorRT-LLM on 2+ NVIDIA GPUs, I encounter ```AssertionError: Expected but not provided tensors:{'transformer.vocab_embedding.weig…