-
I was recently deploying Hugging Face models on Triton Inference Server, which helped me increase my GPU utilization and serve multiple models from a single GPU.
I was not able to find good r…
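For context, serving several models from one GPU in Triton comes down to giving each model its own directory in the model repository with a `config.pbtxt` whose `instance_group` pins it to the shared device. A minimal sketch — the model name, backend, and batch size below are illustrative placeholders, not taken from the post:

```
# model_repository/my_hf_model/config.pbtxt  (hypothetical model name)
name: "my_hf_model"
backend: "onnxruntime"
max_batch_size: 8
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]   # every model's config can point at the same GPU
  }
]
```

Repeating this layout for each model (all with `gpus: [ 0 ]`) lets a single Triton instance schedule them all on one device.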
-
I'm using the nvcr.io/nvidia/tritonserver:23.10-py3 container for inference, via the C++ gRPC API. There are several models in the container: a YOLOv8-like architecture in TensorRT plus a few TorchScript model…
-
**Description**
I am deploying a YOLOv8 model for object detection using Triton with the ONNX backend on Kubernetes. I have experienced significant CPU throttling in the sidecar container ("queue-proxy")…
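If the sidecar in question is Knative's queue-proxy, its CPU allocation is managed by Knative rather than set directly in the pod spec; one commonly used knob is the per-revision annotations sketched below. This is a hedged example, not taken from the post — the service name is hypothetical and the annotation names should be checked against your Knative release:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: triton-yolov8   # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Raise queue-proxy CPU so it is not throttled under load
        queue.sidecar.serving.knative.dev/cpu-resource-request: "500m"
        queue.sidecar.serving.knative.dev/cpu-resource-limit: "2"
```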
-
I'm trying https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/compose.md
to build the onnx+python+tensorrtllm backends.
1)
As mentioned in the doc, I do:
```bash
git clone …
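# For reference, the compose.py invocation described in compose.md looks
# roughly like the line below. The flags are hedged from that doc, not from
# the truncated post above, and whether tensorrtllm can be composed this way
# may depend on the release:
python3 compose.py --backend onnxruntime --backend python \
    --output-name tritonserver_custom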
-
**Description**
According to the Framework matrix (https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#framework-matrix-2024), 24.05 is supposed to support TensorRT 10.0.6.1. Th…
-
There is an example at https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/qwenvl , but I have no idea how I can use this model in Triton server. Can you provide an example of a visual language mod…
-
**Description**
PR [185](https://github.com/triton-inference-server/client/pull/185) pinned `geventhttpclient==2.0.2` due to a potential change in ssl_context_factory handling.
The geventhttpcli…
-
https://github.com/h2oai/h2ogpt/blob/main/docs/TRITON.md
Do the same for Falcon 7B, then Falcon 40B.
-
Hey all, I have a quick question: is onnxruntime-genai ([https://onnxruntime.ai/docs/genai/api/python.html](https://onnxruntime.ai/docs/genai/api/python.html)) supported in Triton Inference Server's O…