-
Need to use TensorRT; something like https://github.com/noahmr/yolov5-tensorrt, adapted for YOLOv8.
➝ https://github.com/triple-Mu/YOLOv8-TensorRT/blob/main/infer-det.py
The implementation will be in C++.
Also look at https://gi…
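
A rough sketch of what the C++ inference side could look like, assuming a TensorRT engine serialized offline (e.g. with trtexec) and the TensorRT 8.x bindings API; the engine path, binding names (`images`, `output0`), and tensor shapes below are placeholders, not taken from either repo:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Minimal logger required by the TensorRT C++ API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;

    // Load an engine serialized offline (placeholder path).
    std::ifstream file("yolov8n.engine", std::ios::binary);
    std::vector<char> engineData((std::istreambuf_iterator<char>(file)),
                                 std::istreambuf_iterator<char>());

    // Deserialize the engine and create an execution context.
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(engineData.data(), engineData.size());
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // Device buffers for the two bindings (names/shapes depend on the export).
    int inputIndex = engine->getBindingIndex("images");    // e.g. 1x3x640x640
    int outputIndex = engine->getBindingIndex("output0");  // e.g. 1x84x8400
    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], 1 * 3 * 640 * 640 * sizeof(float));
    cudaMalloc(&buffers[outputIndex], 1 * 84 * 8400 * sizeof(float));

    // ... copy a preprocessed image into buffers[inputIndex], then run:
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->enqueueV2(buffers, stream, nullptr);
    cudaStreamSynchronize(stream);
    // ... copy buffers[outputIndex] back to the host, decode boxes, run NMS.

    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    return 0;
}
```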
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussi…
-
Implement quantization-aware training (QAT) and quantized inference for Jetson.
**References**
- [Pytorch QAT Blog Post](https://pytorch.org/blog/quantization-aware-training/)
- [Lil'Log Blog Post](…
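
On the inference side, one possible path (my assumption, not spelled out in the references) is: run QAT in PyTorch, export to ONNX with Q/DQ (QuantizeLinear/DequantizeLinear) nodes, then build an INT8 TensorRT engine on the Jetson. A minimal sketch of that last step, with `model-qat.onnx` and the output path as placeholders:

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>

// Minimal logger required by the TensorRT C++ API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;

    // TensorRT 8.x-style builder/network/parser setup.
    auto builder = nvinfer1::createInferBuilder(logger);
    auto network = builder->createNetworkV2(
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
    auto parser = nvonnxparser::createParser(*network, logger);

    // Parse the QAT export; it already contains Q/DQ nodes with learned scales.
    if (!parser->parseFromFile("model-qat.onnx",
                               static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "Failed to parse ONNX model" << std::endl;
        return 1;
    }

    // Enable INT8. With explicit Q/DQ quantization no calibration cache is needed;
    // the scales come from training.
    auto config = builder->createBuilderConfig();
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    config->setMemoryPoolLimit(nvinfer1::MemoryPoolType::kWORKSPACE, 1ULL << 30);

    // Build and serialize the engine, then write it out for the Jetson runtime.
    nvinfer1::IHostMemory* serialized = builder->buildSerializedNetwork(*network, *config);
    if (!serialized) {
        std::cerr << "Engine build failed" << std::endl;
        return 1;
    }
    std::ofstream out("model-qat-int8.engine", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());
    return 0;
}
```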
-
I want to deploy Triton + TensorRT-LLM, but due to some constraints I cannot use a Docker container. I have figured out that I need to build the following repos:
1. https://github.com/triton-inference-server…
-
void FeatureExtraction::doInference_run(float* inputBuffer, float* outputBuffer) {
cudaMemcpyAsync(buffers[inputIndex], inputBuffer, inputStreamSize * sizeof(float), cudaMemcpyHostToDevice, c…
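
For context, the usual pattern this method appears to follow is: async host-to-device copy, enqueue, async device-to-host copy, then a stream sync. A self-contained sketch of that flow (TensorRT 8.x bindings API); the parameter names are assumptions, not the actual class members:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <cstddef>

// Asynchronous inference flow: H2D copy -> enqueue -> D2H copy -> sync.
// All work is queued on one CUDA stream and only the final sync blocks.
void runInference(nvinfer1::IExecutionContext* context,
                  void** deviceBuffers, int inputIndex, int outputIndex,
                  const float* hostInput, std::size_t inputCount,
                  float* hostOutput, std::size_t outputCount,
                  cudaStream_t stream) {
    // Copy the preprocessed input from host to the device-side input binding.
    cudaMemcpyAsync(deviceBuffers[inputIndex], hostInput,
                    inputCount * sizeof(float), cudaMemcpyHostToDevice, stream);

    // Launch inference asynchronously on the same stream.
    context->enqueueV2(deviceBuffers, stream, nullptr);

    // Copy the output binding back to the host, still asynchronously.
    cudaMemcpyAsync(hostOutput, deviceBuffers[outputIndex],
                    outputCount * sizeof(float), cudaMemcpyDeviceToHost, stream);

    // Block until everything queued on the stream has finished.
    cudaStreamSynchronize(stream);
}
```

Note that the copies only truly overlap with other work if the host buffers are pinned (allocated with cudaHostAlloc); with pageable memory the async copies are effectively staged and behave more synchronously.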
-
## Environment
- **GPUs**: 4x NVIDIA A100 (80GB) (NVLink, Azure Standard_NC96ads_A100_v4)
- **TensorRT-LLM Version**: 0.15.0.dev2024102200
- **Environment**: Docker container
- **Memory Usage per GPU…
-
## Description
NMS layers are much slower in TensorRT than in PyTorch (running at 44% of the PyTorch performance), and I'm looking for any possible workaround. This seems to be acknowledged as a known issue in the Tenso…
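
One workaround to consider (my suggestion, not something confirmed in the truncated text above) is to drop the NMS layer from the exported graph and run NMS outside TensorRT on the decoded detections. A minimal greedy CPU version of that step, with the `Detection` layout as a placeholder:

```cpp
#include <algorithm>
#include <vector>

// Decoded detection in corner format; layout is an assumption for illustration.
struct Detection {
    float x1, y1, x2, y2;
    float score;
    int classId;
};

// Intersection-over-union of two boxes.
static float iou(const Detection& a, const Detection& b) {
    float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    float inter = std::max(0.0f, ix2 - ix1) * std::max(0.0f, iy2 - iy1);
    float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter + 1e-9f);
}

// Greedy per-class NMS over score-sorted candidates.
std::vector<Detection> nms(std::vector<Detection> dets, float iouThreshold) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const auto& d : dets) {
        bool suppressed = false;
        for (const auto& k : kept) {
            if (k.classId == d.classId && iou(k, d) > iouThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

Whether this actually wins depends on how many candidate boxes survive the score threshold before NMS and on the host-device transfer cost.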
-
![sam2 drawio](https://github.com/user-attachments/assets/d394623f-efd3-4c77-901d-b0f0938c9325)
I'm currently trying to deploy a video inference model for SAM2 using TensorRT+cpp. Following his ide…
-
Hello, `0.15.0.dev2024101500` introduced a new issue when using the executor API with Whisper:
```
[TensorRT-LLM][ERROR] IExecutionContext::inferShapes: Error Code 7: Internal Error (WhisperEncoder/__add_…
```
-
### System Info
TensorRT-LLM v0.13.0
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…