-
**Is your feature request related to a problem? Please describe.**
I am aware that PyTriton already has an example of using PyTriton with tensorrt_llm, but I noticed that the example only support s…
-
**Description**
CUDA Graph does not work in the tensorrt backend. The model config is as follows:
```
platform: "tensorrt_plan"
version_policy: { latest: { num_versions: 2}}
parameters { key: "execution_mode"…
```
-
Can you share the C++ TensorRT inference version?
-
System config:
- CPU arch: x86_64
- GPU: H200
- TensorRT-LLM: v0.14.0
- OS: Ubuntu 22.04
- runtime env: Docker container built from source via the official [build script](https://techcommunity.microsoft.c…
-
I use GenerationExecutorWorker for a web service, passing the parameter stop_words_list = [["hello, yes"]] by modifying the as_inference_request function in executor.py as follows:
the ir parameter …
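For context, TensorRT-LLM-style runtimes generally take stop words as token ids in a flat `[batch, 2, max_len]` tensor rather than as raw strings. Below is a minimal sketch of that packing; the layout and the helper name follow the common TensorRT-LLM convention, and the tokenizer is a stand-in, so treat this as illustrative rather than code from this issue:

```python
import numpy as np

def to_word_list_format(word_lists, tokenizer):
    """Pack per-request stop words into a [batch, 2, max_len] int32 array.

    Row 0 holds the concatenated token ids of all stop words; row 1 holds
    the exclusive end offset of each word; unused slots are padded with -1.
    `tokenizer` is any object exposing encode(str) -> list[int] (assumed).
    """
    packed = []
    max_len = 0
    for words in word_lists:
        ids, offsets = [], []
        for word in words:
            toks = tokenizer.encode(word)
            ids.extend(toks)
            offsets.append(len(ids))  # cumulative end offset of this word
        packed.append((ids, offsets))
        max_len = max(max_len, len(ids), len(offsets))
    out = np.full((len(word_lists), 2, max_len), -1, dtype=np.int32)
    for i, (ids, offsets) in enumerate(packed):
        out[i, 0, :len(ids)] = ids
        out[i, 1, :len(offsets)] = offsets
    return out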
-
## Description
Model artifacts are in the (TRT-LLM) LMI model format:
```
aws s3 ls ***
    PRE 1/
2024-10-25 14:59:…
```
-
## Environment
**TensorRT Version**: 8.6.2
**NVIDIA GPU**: Orin
**NVIDIA Driver Version**:
**CUDA Version**: 12.2
**CUDNN Version**: 8904
## Description
I have an ONNX model. There are some grids…
-
Thanks for your excellent work. In CenterPoint/tensorrt/samples/centerpoint/README.md, do I have to install Docker and run step 2 (because I run CenterPoint in Anaconda), or do I just need to run s…
-
### System Info
- CPU architecture : x86_64
- GPU properties
- GPU name : 4x L4 setup
- GPU memory size : 96GB
- Libraries
- TensorRT-LLM branch or tag : main
- TensorRT version : 0.16…
-
My env:
- GPU: NVIDIA 4090
- System: Windows
- CUDA: 12.4
- cuDNN: 9.1

I migrated the onnxruntime grid_sample 5D code from the liqun/imageDecoder_cuda branch to the main branch and compiled it.
The code is here: ht…
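As a reference for what a 5-D grid_sample has to compute, here is a minimal NumPy sketch of nearest-neighbour sampling with align_corners=True. This is a plain reimplementation of the op's semantics (following the PyTorch/ONNX GridSample convention of grid coordinates ordered x, y, z in [-1, 1]), useful for checking outputs; it is not the onnxruntime CUDA kernel being migrated:

```python
import numpy as np

def grid_sample_5d_nearest(inp, grid):
    """Nearest-neighbour 5-D grid_sample, align_corners=True.

    inp:  (N, C, D, H, W) volume.
    grid: (N, D_out, H_out, W_out, 3), coords in [-1, 1], ordered (x, y, z).
    """
    n, c, d, h, w = inp.shape
    _, do, ho, wo, _ = grid.shape
    # align_corners=True maps -1 -> 0 and +1 -> size-1 on each axis
    ix = np.rint((grid[..., 0] + 1) * (w - 1) / 2).astype(int).clip(0, w - 1)
    iy = np.rint((grid[..., 1] + 1) * (h - 1) / 2).astype(int).clip(0, h - 1)
    iz = np.rint((grid[..., 2] + 1) * (d - 1) / 2).astype(int).clip(0, d - 1)
    out = np.zeros((n, c, do, ho, wo), dtype=inp.dtype)
    for b in range(n):
        # advanced indexing gathers (C, D_out, H_out, W_out) per batch item
        out[b] = inp[b][:, iz[b], iy[b], ix[b]]
    return out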