-
Hi,
I generated the model.plan engine file on the same server as Triton. I also built TensorRT OSS but I get the following error when loading the engine:
```
E0226 17:02:00.421746 1 logging.cc:43…
```
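Before digging into the Triton side, it may help to confirm the plan deserializes with the TensorRT build on the serving machine; here is a minimal sketch (the `model.plan` path is an assumption):

```python
# A minimal sketch for checking that model.plan deserializes with the local
# TensorRT build; a None result usually indicates the engine was serialized
# with a different TensorRT version than the one doing the loading.
import tensorrt as trt

logger = trt.Logger(trt.Logger.ERROR)
runtime = trt.Runtime(logger)

with open("model.plan", "rb") as f:  # path is an assumption
    engine = runtime.deserialize_cuda_engine(f.read())

print("TensorRT:", trt.__version__, "| engine ok:", engine is not None)
```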
-
**Is your feature request related to a problem? Please describe.**
Currently, the fastest way to execute Computer Vision models for inference is to run a TensorRT-optimised model. It is widely a…
-
While serving the code_llama model and requesting `/generate_stream` with `stream: true`, the `text_output` field in the response does not contain any spaces (`" "`). Is this the expected behavior (i.e. users hav…
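For reference, a minimal client sketch that concatenates the streamed chunks verbatim makes it easy to see whether spaces are missing from the server's `text_output` or lost in client-side handling. The field names follow Triton's generate extension; the model name and parameters are assumptions:

```python
# A minimal sketch: collect every streamed text_output chunk and join them
# unchanged, so any missing spaces must come from the server side.
import json
import requests

url = "http://localhost:8000/v2/models/code_llama/generate_stream"  # model name assumed
payload = {"text_input": "def fib(n):", "stream": True, "max_tokens": 64}

chunks = []
with requests.post(url, json=payload, stream=True) as resp:
    for line in resp.iter_lines():
        if line.startswith(b"data:"):
            chunks.append(json.loads(line[len(b"data:"):]).get("text_output", ""))

print(repr("".join(chunks)))  # repr makes missing spaces visible
```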
-
This is supported by Triton; we just need to add support for it to the proxy. I have written code to do this independently here: https://moyix.net/~moyix/batch_codegen_full.py; I just need to integra…
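For context, a minimal sketch of what the proxy-side batching boils down to (the model and tensor names here are assumptions; the linked script is the authoritative version):

```python
# A minimal sketch of one batched Triton request; the proxy would gather
# queued prompts into a single call like this instead of N separate ones.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

prompts = np.array([["def foo():"], ["def bar():"]], dtype=object)  # batch of 2
inp = httpclient.InferInput("INPUT_0", list(prompts.shape), "BYTES")
inp.set_data_from_numpy(prompts)

result = client.infer("codegen", inputs=[inp])  # model name is an assumption
print(result.as_numpy("OUTPUT_0"))
```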
-
**Description**
Hi all,
I have an IR model that I was trying to deploy on Triton server `v23.10`, but loading it failed with this error.
```
Warning: '--strict-model-config' has been deprecated…
```
-
Hi!
I have built the TensorRT model via ONNX. Can you tell me how to use C++ to call the "model.dali" generated for resnet50_trt to preprocess the data?
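For orientation, here is a minimal sketch in Python of feeding an encoded image to a DALI preprocessing model served by Triton; the tensor and model names are assumptions, and the C++ client exposes the same `InferInput`/`Infer` calls:

```python
# A minimal sketch: send raw encoded JPEG bytes to a DALI model and read
# back the preprocessed tensor. The C++ client mirrors these calls.
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

with open("cat.jpg", "rb") as f:  # any encoded JPEG
    raw = np.frombuffer(f.read(), dtype=np.uint8)

inp = grpcclient.InferInput("DALI_INPUT_0", [1, raw.size], "UINT8")
inp.set_data_from_numpy(raw.reshape(1, -1))

result = client.infer("dali", inputs=[inp])  # model name is an assumption
print(result.as_numpy("DALI_OUTPUT_0").shape)  # preprocessed tensor
```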
-
Hi - I am trying to accelerate some T5 models and I get this error. How do I fix this?
Command to reproduce:
`convert_model -m "valhalla/t5-small-qa-qg-hl" --backend tensorrt onnx --seq-len 16 1…
-
**Description**
After upgrading from 22.10, ORT models consume significantly more memory and run out of VRAM (OOM).
**Triton Information**
What version of Triton are you using?
Upgraded to Triton 2.35.…
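For what it's worth, loading the same model in standalone onnxruntime under both ORT versions can help isolate whether the growth comes from ORT itself or from the backend integration; a minimal sketch (model path and provider options are assumptions):

```python
# A minimal sketch: create a standalone ORT session with the CUDA EP arena
# told not to over-allocate (a common source of apparent memory growth),
# then compare GPU memory across the old and new ORT versions.
import onnxruntime as ort

providers = [(
    "CUDAExecutionProvider",
    {"arena_extend_strategy": "kSameAsRequested"},
)]
sess = ort.InferenceSession("model.onnx", providers=providers)
print(ort.__version__, sess.get_providers())
```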
-
## Branch/Tag/Commit
Based on v5.3, with https://github.com/NVIDIA/FasterTransformer/commit/e2dd1641880840db76b8902b34106c85b026a0af merged to fix early_stop
## Docker Image Version
Refer to fas…
-
To begin, I would like to thank the Triton Inference Server team!
You provide us with a very convenient tool for deploying deep learning models :)
**Is your feature request related to a problem? Plea…