-
I followed the exact instructions provided by TensorRT-LLM to set up the Triton server for Whisper.
I am stuck with the following error when I try to build the TRT engine:
```
[TensorRT-LLM] TensorRT-LLM ve…
```
-
Thank you for your excellent work! :satisfied: :satisfied: :satisfied:
Recently, I have been trying to use TensorRT to accelerate Depth Anything on Jetson Orin NX. However, I found that the infere…
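For reference, the usual route on Jetson is to export the model to ONNX and then build the engine with `trtexec`; here is a minimal export sketch, where the `torch.hub` repo and entry-point names are assumptions, not the project's actual API (use whatever loader the Depth Anything repo provides):

```python
import torch

# Assumption: loading via torch.hub; substitute the project's real loader.
model = torch.hub.load("LiheYoung/Depth-Anything", "depth_anything_vits14")
model.eval()

# Depth Anything consumes RGB images; 518x518 is a commonly used size,
# adjust to match your preprocessing.
dummy = torch.randn(1, 3, 518, 518)

torch.onnx.export(
    model,
    dummy,
    "depth_anything.onnx",
    input_names=["image"],
    output_names=["depth"],
    opset_version=17,
)
```

On the Orin NX the engine can then be built with something like `trtexec --onnx=depth_anything.onnx --saveEngine=depth_anything.engine --fp16`, whose performance summary is a good starting point for diagnosing inference behavior.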
-
### **I am trying to deploy and run inference with the XLM-RoBERTa model on TRT-LLM.**
I followed the example guide for BERT and built the engine: (https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/be…
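Since XLM-RoBERTa is architecturally a RoBERTa/BERT variant, sanity-checking the Hugging Face checkpoint before the TRT-LLM conversion helps separate conversion bugs from model bugs; a minimal sketch, where `xlm-roberta-base` is an assumed checkpoint name:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint; swap in the exact model you are converting.
name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

# If this reference forward pass looks right, a bad engine output points
# at the conversion/build step rather than at the checkpoint itself.
inputs = tokenizer("TRT-LLM engine sanity check.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
print(out.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```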
-
### System Info
Ubuntu 20.04
NVIDIA A100
nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3 and 24.07
TensorRT-LLM v0.14.0 and v0.11.0
### Who can help?
@Tracin
### Information
- [x] The offici…
-
### Describe the issue
Inference results are abnormal when running YOLOv7 models with the TensorRT EP.
We have confirmed that the results are correct when using the CPU and CUDA EPs.
The issue wa…
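A quick way to localize this kind of discrepancy is to run one fixed input through each execution provider and diff the outputs; a minimal sketch, where the model path and input shape are placeholders for your YOLOv7 export:

```python
import numpy as np
import onnxruntime as ort

MODEL = "yolov7.onnx"  # placeholder path
x = np.random.rand(1, 3, 640, 640).astype(np.float32)  # typical YOLOv7 input

def run(providers):
    sess = ort.InferenceSession(MODEL, providers=providers)
    name = sess.get_inputs()[0].name
    return sess.run(None, {name: x})

cpu = run(["CPUExecutionProvider"])
trt = run(["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"])

# Compare the TensorRT EP outputs against the known-good CPU results.
for i, (a, b) in enumerate(zip(cpu, trt)):
    print(f"output {i}: max abs diff vs CPU = {np.abs(a - b).max():.6f}")
```

If the divergence only appears with FP16 enabled in the TensorRT EP, that usually narrows the problem to precision-sensitive layers rather than a parsing bug.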
-
### System Info
GPU-A100,
TensorRT-LLM version = tensorrt_llm-0.13.0.dev2024090300
Ubuntu machine.
### Who can help?
Hi @ncomly-nvidia, @byshiue,
I want to set `no_repeat_ngram_size`=0…
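For context, newer TensorRT-LLM releases expose `no_repeat_ngram_size` as a sampling option; a minimal sketch with the Python `ModelRunner`, under the assumption that your build forwards this kwarg into `SamplingConfig` (engine and tokenizer paths are placeholders):

```python
from transformers import AutoTokenizer
from tensorrt_llm.runtime import ModelRunner

# Placeholder paths; point at your engine directory and tokenizer.
runner = ModelRunner.from_dir("./engine_dir")
tokenizer = AutoTokenizer.from_pretrained("./tokenizer_dir")

input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids.int()

outputs = runner.generate(
    batch_input_ids=[input_ids[0]],
    max_new_tokens=64,
    end_id=tokenizer.eos_token_id,
    pad_id=tokenizer.eos_token_id,
    # Assumption: supported builds forward this into SamplingConfig;
    # a value of 0 disables the n-gram blocking entirely.
    no_repeat_ngram_size=0,
)
print(tokenizer.decode(outputs[0][0], skip_special_tokens=True))
```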
-
### Describe the issue
According to the [TensorRT EP docs](https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html), one should run symbolic shape inference before executing the mod…
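For anyone hitting the same question: the documented preprocessing can be done offline with onnxruntime's bundled tool, and the inferred model is what you hand to the session; a minimal sketch with placeholder paths:

```python
import onnx
from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

# Annotate the graph with symbolic shapes, as the TensorRT EP docs advise.
model = onnx.load("model.onnx")
inferred = SymbolicShapeInference.infer_shapes(model, auto_merge=True)
onnx.save(inferred, "model.shape_inferred.onnx")
```

The same step is available as a CLI: `python -m onnxruntime.tools.symbolic_shape_infer --input model.onnx --output model.shape_inferred.onnx`.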
-
### System Info
GPU: `A10`
Base Image: `FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04`
TensorRT-LLM:
- `0.12.0`: It works, but I can't use it because of a version mismatch between TRT and trt-llm-back…
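When triaging tag combinations like these, printing the versions that actually ended up inside the container makes the mismatch concrete:

```python
import tensorrt
import tensorrt_llm

# Each TensorRT-LLM release is built against one specific TensorRT version;
# if these two disagree with the release notes' pairing, imports or engine
# loads will fail in the way described above.
print("TensorRT:", tensorrt.__version__)
print("TensorRT-LLM:", tensorrt_llm.__version__)
```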
-
ERROR: [Torch-TensorRT] - Unsupported operator: aten::to.dtype_layout(Tensor(a) self, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=Fals…
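A common workaround for a single unsupported op is to tell Torch-TensorRT's partitioner to leave that op in PyTorch; a minimal sketch, where the module is a stand-in and `torch_executed_ops` is assumed to be available in your Torch-TensorRT version:

```python
import torch
import torch.nn as nn
import torch_tensorrt

# Stand-in module; the real model is whatever triggered the error above.
class Net(nn.Module):
    def forward(self, x):
        return x.to(dtype=torch.float16) * 2.0

model = Net().eval().cuda()

trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float16},
    # Keep the unsupported op in eager PyTorch; the graph is partitioned
    # around it instead of the whole conversion failing.
    torch_executed_ops=["aten::to.dtype_layout"],
)
```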
-
### Describe the issue
There must be a way to build onnxruntime with TensorRT without the CUDA execution provider and its unused CUDA dependencies.
libonnxruntime_providers_cuda.so is big (220MB) and…
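Whatever the build flags end up being, you can verify which providers a given binary actually contains at runtime:

```python
import onnxruntime as ort

# A TensorRT-only build would list TensorrtExecutionProvider here without
# CUDAExecutionProvider (and without shipping the large CUDA EP library).
print(ort.get_available_providers())
```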