-
I would like to use this as a Python backend within `triton-inference-server` so that my production parameters can be brought into closer alignment with training/validation.
Are there plans…
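For context, a Triton Python backend is a `model.py` implementing `TritonPythonModel`. A minimal sketch of that scaffold, with placeholder tensor names `INPUT`/`OUTPUT` (assumptions, not taken from this issue):

```python
import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] is the JSON-serialized config.pbtxt
        self.model_config = json.loads(args["model_config"])

    def execute(self, requests):
        # Triton batches requests; return one response per request, in order
        responses = []
        for request in requests:
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT")
            data = in_tensor.as_numpy()
            out_tensor = pb_utils.Tensor("OUTPUT", data.astype(np.float32))
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor])
            )
        return responses

    def finalize(self):
        pass
```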
-
### System Info
GPU Name: NVIDIA A800
TensorRT-LLM: 0.10.0
Nvidia Driver: 535.129.03
OS: Ubuntu 22.04
Triton Inference Server backend: tensorrtllm_backend
### Who can help?
_No response_
### I…
-
I have configured an ensemble model in Triton Inference Server, which includes DALI preprocessing and TensorRT inference. When I uploaded a GIF image from the client, the Triton server crashed with th…
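For reference, the Triton DALI backend loads a serialized pipeline; a sketch of that shape, with decoder and resize parameters assumed rather than taken from the report (an animated GIF reaching the image decoder is a plausible trouble spot):

```python
# Sketch of a DALI preprocessing pipeline serialized for the Triton
# DALI backend. Batch size, input name, and resize target are assumptions.
import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types


@dali.pipeline_def(batch_size=8, num_threads=4, device_id=0)
def preprocess():
    images = fn.external_source(device="cpu", name="encoded_image")
    # "mixed" decodes on GPU; formats it does not support (e.g. animated
    # GIF) may need a client-side guard before reaching the server
    images = fn.decoders.image(images, device="mixed", output_type=types.RGB)
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images


preprocess().serialize(filename="model.dali")
```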
-
### System Info
- Ubuntu 20.04
- NVIDIA H800
- CUDA version 11.8
### Who can help?
@kaiyux @byshiue
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
…
-
There are two `gen_random_start_ids` definitions in `tools/utils/utils.py`:
https://github.com/triton-inference-server/tensorrtllm_backend/blob/ae52bce3ed8ecea468a16483e0dacd3d156ae4fe/tools/utils/utils.py#L238-L…
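Illustration only (toy bodies, not the real functions): in Python, a second `def` with the same name rebinds it, so the first definition becomes silently dead code.

```python
def gen_random_start_ids(batch_size):
    return [1] * batch_size  # earlier definition: never reachable


def gen_random_start_ids(batch_size):
    return [2] * batch_size  # later definition silently wins


print(gen_random_start_ids(3))  # [2, 2, 2]
```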
-
Can this be done by leveraging the onnxruntime work we already have as a backend?
As a preliminary step, learn to add a CUDA backend,
then change it to MIGraphX/ROCm.
See [https://github.com…
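In onnxruntime, that swap is a per-session provider choice; a sketch of the two steps, assuming a placeholder model path and that the build ships the relevant providers:

```python
# Same onnxruntime code path, switching execution providers.
# "model.onnx" is a placeholder; provider availability depends on
# whether onnxruntime was built for CUDA or ROCm.
import onnxruntime as ort

# Step 1: CUDA build
sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])

# Step 2: ROCm build — prefer MIGraphX, fall back to the generic ROCm provider
sess = ort.InferenceSession(
    "model.onnx",
    providers=["MIGraphXExecutionProvider", "ROCMExecutionProvider"],
)

print(sess.get_providers())  # providers actually in use for this session
```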
-
Since Jetson supports Triton Inference Server, I am considering adopting it.
So, I have a few questions (a client-side sketch follows the list).
1. In an environment where multiple AI models are run on Jetson, is there any advantage to …
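A minimal sketch of the multi-model setup in question, assuming hypothetical model and tensor names and Triton's default HTTP port; the point is that several models share one server process, scheduler, and metrics endpoint:

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

image = np.random.rand(1, 3, 224, 224).astype(np.float32)
inp = httpclient.InferInput("input", list(image.shape), "FP32")
inp.set_data_from_numpy(image)

# "detector" and "classifier" are hypothetical model names; the same
# connection can hit any model loaded in the one Triton instance.
for model in ("detector", "classifier"):
    result = client.infer(model_name=model, inputs=[inp])
    print(model, result.as_numpy("output").shape)
```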
-
**Describe the bug**
Can't start the inference server.
**To Reproduce**
1. Run install_env.bat with USE_MIRROR=false and INSTALL_TYPE=stable
2. Change API_FLAGS.txt and enable "--infer", then run sta…
-
When I used model-analyzer, I got "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte".
I have the same problem with the latest tag, 24.05-py3-sdk.
Why do I …
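The error class itself is easy to reproduce: byte 0xf8 is not a valid UTF-8 start byte, so any attempt to decode it as UTF-8 fails exactly as reported (where model-analyzer reads such bytes is not shown in the issue):

```python
data = b"\xf8"
try:
    data.decode("utf-8")
except UnicodeDecodeError as e:
    # 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
    print(e)
```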
-
Hello,
I am seeking advice on the best practices for tracking all inputs and predictions made by a model when using Triton Inference Server. Specifically, I would like to track every interaction th…
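One possible approach, sketched on the client side: wrap every inference call so each input/output pair is persisted. The model name, tensor names, and log path are hypothetical.

```python
import json
import time

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")


def infer_and_log(model_name, array):
    inp = httpclient.InferInput("input", list(array.shape), "FP32")
    inp.set_data_from_numpy(array)
    result = client.infer(model_name=model_name, inputs=[inp])
    output = result.as_numpy("output")
    # append one JSON record per interaction
    with open("predictions.log", "a") as f:
        f.write(json.dumps({
            "ts": time.time(),
            "model": model_name,
            "input": array.tolist(),
            "output": output.tolist(),
        }) + "\n")
    return output


infer_and_log("my_model", np.random.rand(1, 4).astype(np.float32))
```

Server-side alternatives (e.g. Triton's trace support, or a logging step inside an ensemble) avoid instrumenting every client, at the cost of more server configuration.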