-
### System Info
- CPU architecture: x86_64
- CPU/Host memory size: 32 GB
- GPU name: L4 (GCP g2-standard-8)
- GPU memory size: 24 GB
- TensorRT-LLM branch or tag (e.g., main, v0.10.0)
- Nvi…
-
I was recently deploying Hugging Face models on the Triton Inference Server, which helped me increase my GPU utilization and serve multiple models using a single GPU.
Was not able to find good r…
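Serving several models on one GPU usually comes down to a Triton model repository with one directory per model, plus an `instance_group` in each model's `config.pbtxt`. A minimal sketch of one such config (the model name, backend, and tensor names here are hypothetical, not from the original post):

```
# config.pbtxt for a hypothetical "distilbert" model exported to ONNX.
# Other models in the same repository are loaded alongside it on GPU 0.
name: "distilbert"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
# Two concurrent instances of this model share GPU 0.
instance_group [
  {
    count: 2
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
```

With several such model directories in one repository, a single `tritonserver --model-repository=...` process serves all of them from the same GPU.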
-
We use Triton Inference Server for online inference. Can the DeepRec processor be used in Triton Inference Server?
-
Hello
I got the Docker image 0.6.0 and just tried to run the two demo commands:
1. docker run -it --rm --gpus all \
-v $PWD:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \
bash -c "cd /project && \
…
-
I have an ensemble model.
Model 1's output is 66 cropped images; model 1 is a Python backend model. I manually resized/padded them into 3 batches with shapes
(30, 3, 48, 320), (30, 3, 48, 976), (6, 3, 48, 1280)
(I …
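The grouping step described above can be sketched in plain Python: bucket each crop by the smallest target width that fits it, pad its width up to that bucket, and batch per bucket. The target widths and crop sizes below are assumptions for illustration, not the poster's actual pipeline:

```python
# Sketch: group variable-width crops into a few fixed-width batches.
# Each crop shape is (channels, height, width); widths are hypothetical.
TARGET_WIDTHS = [320, 976, 1280]  # padded widths, smallest first

def bucket_for(width):
    """Return the smallest target width that fits the crop."""
    for t in TARGET_WIDTHS:
        if width <= t:
            return t
    return TARGET_WIDTHS[-1]  # wider crops clamp to the last bucket

def make_batches(crop_shapes):
    """Map target width -> list of crop shapes padded to that width."""
    batches = {t: [] for t in TARGET_WIDTHS}
    for c, h, w in crop_shapes:
        t = bucket_for(w)
        batches[t].append((c, h, t))  # width padded up to the bucket
    return batches

# 66 hypothetical crops: 30 narrow, 30 medium, 6 wide
crops = [(3, 48, 300)] * 30 + [(3, 48, 900)] * 30 + [(3, 48, 1200)] * 6
batches = make_batches(crops)
print({t: len(v) for t, v in batches.items()})  # {320: 30, 976: 30, 1280: 6}
```

Grouping by padded width keeps each batch a dense tensor, which is what the downstream model in the ensemble expects.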
-
From the req doc:
**OOTB support for NVIDIA Triton Inference Server**
- We are going with OpenVINO right now, as Triton cannot be built at the moment due to maintenance concerns.
Acceptance criteria:
- Scope…
-
**Is your feature request related to a problem? Please describe.**
1. We would like to try parallel model execution on iGPU+DLA devices. Is it possible to run triton-inference-server on a V3NP or Ori…
-
**Description**
I am trying to build a Triton Docker image following https://github.com/triton-inference-server/server/blob/r23.07/docs/customization_guide/build.md#building-with-docker
Using …
-
I am testing with basic models. The model takes an input and returns the same output with the same datatype.
Inference is happening:
2024-08-20 09:35:15,923 - INFO - array_final: array([[103]], dtype=uint8)
a…
-
So far the latest publicly available Triton Inference Server with the Paddle backend is `paddlepaddle/triton_paddle:21.10`, and there have been lots of bug fixes since then. I'm experiencing an increasing amount…