-
Add support for inference services.
-
## Running SAM in the Modelzoo Universe
We have started some initial efforts to integrate SAM with the bioengine / imjoy / bioimageio-colab.
I want to summarize here the overall goals, the current …
-
**Description**
I am trying to use the newly introduced [Triton Inference Server In-Process Python API](https://github.com/triton-inference-server…
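For reference, a minimal sketch of how the in-process API is typically driven, based on the published `tritonserver` Python examples; the repository path, model name, and tensor names below are placeholders, and exact signatures may differ between releases:

```python
import numpy as np
import tritonserver  # in-process API package shipped with recent Triton releases

# Start an embedded Triton instance pointed at a local model repository
# ("/workspace/models" and "my_model" are placeholder names).
server = tritonserver.Server(model_repository="/workspace/models")
server.start()

model = server.model("my_model")

# Run a single inference; inputs are passed as a dict of name -> array.
responses = model.infer(inputs={"INPUT0": np.zeros((1, 3), dtype=np.float32)})

# The API returns an iterable of responses (one for non-decoupled models).
for response in responses:
    # np.from_dlpack works for CPU output tensors; GPU outputs need e.g. cupy.
    output = np.from_dlpack(response.outputs["OUTPUT0"])
    print(output)

server.stop()
```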
-
### Environment
**CPU architecture:** x86_64
**CPU/Host memory size:** 440 GiB
### GPU properties
**GPU name:** A100
**GPU memory size:** 160G…
-
**Description**
Currently, if both the Model Analyzer and Triton Inference Server containers are deployed, then while collecting data from their respective metrics endpoint ports, thi…
-
**Description**
I'm using a simple client inference class based on the client example. My TensorRT inference with batch size 10 takes 150 ms, but my Triton with the TensorRT backend took 1100 ms. This is my client:…
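(The client code is truncated above.) One thing worth ruling out is client-side overhead, e.g. re-creating the client per call or sending ten single-item requests instead of one batched request. A minimal `tritonclient.grpc` sketch for comparison, with the URL, model name, tensor names, and shapes as placeholders:

```python
import numpy as np
import tritonclient.grpc as grpcclient

# Reuse a single client/connection for all requests
# ("localhost:8001", "my_trt_model", "INPUT0"/"OUTPUT0" are placeholders).
client = grpcclient.InferenceServerClient(url="localhost:8001")

batch = np.random.rand(10, 3, 224, 224).astype(np.float32)  # batch size 10

inputs = [grpcclient.InferInput("INPUT0", batch.shape, "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [grpcclient.InferRequestedOutput("OUTPUT0")]

# One request carrying the whole batch, instead of 10 single-item requests.
result = client.infer(model_name="my_trt_model", inputs=inputs, outputs=outputs)
print(result.as_numpy("OUTPUT0").shape)
```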
-
As mentioned at the end of https://github.com/triton-inference-server/server/issues/6981
Triton: nvcr.io/nvidia/tritonserver:23.12-py3
I have 4 GPUs, and my model is an ensemble model; I don't set gp…
-
**Description**
I run the model on Triton Inference Server and also on ONNX Runtime (ORT) directly. Inference time on Triton Inference Server is 3 ms, but it is 1 ms on ORT. In addition, there isn't any communicati…
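When comparing the two numbers, it helps to measure both paths the same way: warm up first, time only the inference call, and average over many iterations, since the Triton figure also includes request (de)serialization and HTTP/gRPC handling even on localhost. A rough sketch of such a comparison, with the model path, URL, and tensor names as placeholders:

```python
import time
import numpy as np
import onnxruntime as ort
import tritonclient.http as httpclient

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input

# --- ONNX Runtime directly ---
sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
sess.run(None, {"input": x})  # warm-up
t0 = time.perf_counter()
for _ in range(100):
    sess.run(None, {"input": x})
ort_ms = (time.perf_counter() - t0) / 100 * 1000

# --- Same model served by Triton over HTTP ---
client = httpclient.InferenceServerClient(url="localhost:8000")
inp = httpclient.InferInput("input", x.shape, "FP32")
inp.set_data_from_numpy(x, binary_data=True)
client.infer("my_model", [inp])  # warm-up
t0 = time.perf_counter()
for _ in range(100):
    client.infer("my_model", [inp])
triton_ms = (time.perf_counter() - t0) / 100 * 1000

print(f"ORT: {ort_ms:.2f} ms, Triton: {triton_ms:.2f} ms")
```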
-
We have a streaming service that uses gRPC with Unix sockets.
gRPC performs much better with Unix sockets than with a TCP port. I saw that you can only change the port in the Triton server…
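For context, plain `grpcio` in Python can already bind and dial a Unix domain socket via a `unix:` address; the sketch below is generic gRPC (not a Triton server option), with the socket path as a placeholder:

```python
from concurrent import futures
import grpc

SOCKET = "unix:///tmp/streaming.sock"  # placeholder socket path

# Server side: bind the gRPC server to a Unix domain socket instead of a TCP port.
server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
# (a real service would be registered here via its generated add_*_to_server helper)
server.add_insecure_port(SOCKET)
server.start()

# Client side: dial the same socket with a unix: address.
channel = grpc.insecure_channel(SOCKET)
grpc.channel_ready_future(channel).result(timeout=5)
print("channel ready over", SOCKET)

channel.close()
server.stop(grace=None)
```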