-
### Environment
**CPU architecture:** x86_64
**CPU/Host memory size:** 440 GiB
### GPU properties
**GPU name:** A100
**GPU memory size:** 160G…
-
### Description
The Docker image built fine using the older version mentioned in the README (22.12), but the build fails with the latest version (23.05).
See this log file: https://gi…
-
**Description**
I deployed Triton Inference Server on Kubernetes (GKE). To balance the load, I created a Load Balancer Service. As a client, I'm using the Python HTTP client. I was expecting all the …
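For reference, a minimal sketch of the client side described here, using the Triton Python HTTP client; the load-balancer address, model name, and input tensor layout are assumptions, not details from the original report:
```python
# Sketch only: the LB address, model name, and input shape are hypothetical.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="34.118.0.10:8000")  # hypothetical LB IP

# Hypothetical model with a single FP32 input named "INPUT0".
inp = httpclient.InferInput("INPUT0", [1, 3], "FP32")
inp.set_data_from_numpy(np.ones((1, 3), dtype=np.float32))

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```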
-
**Is your feature request related to a problem? Please describe.**
I'd like to be able to run vLLM emulating the OpenAI-compatible API, so that vLLM can serve as a drop-in replacement for ChatGPT.
**Describe…
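A minimal sketch of what such drop-in usage could look like with the OpenAI Python client (openai >= 1.0), assuming a vLLM server exposing an OpenAI-compatible endpoint at localhost:8000; the address and served model name are hypothetical:
```python
from openai import OpenAI

# Point the standard OpenAI client at the local vLLM server instead of
# api.openai.com; local servers typically ignore the API key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="my-llm",  # hypothetical model name served by vLLM
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```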
-
Hey,
I tried to run ColBERT model inference via Triton server on a multi-GPU instance.
GPU 0 works fine. However, the other GPU devices (1, 2, 3, etc.) crash when execution reaches this line
```D_pac…
-
**Description**
I am trying to use the newly introduced [Triton Inference Server in-process Python API](https://github.com/triton-inference-server…
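For context, a minimal sketch of how the in-process API is typically used, based on its documented examples; the model repository path, model name, and input are hypothetical, and the exact signatures should be checked against the repository's README:
```python
import numpy as np
import tritonserver

# Start an in-process server over a hypothetical model repository.
server = tritonserver.Server(model_repository="/models")
server.start(wait_until_ready=True)

# Run inference against a hypothetical model with one FP32 input "INPUT0".
model = server.model("my_model")
for response in model.infer(inputs={"INPUT0": np.ones((1, 3), dtype=np.float32)}):
    print(response.outputs)
```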
-
(moving from https://github.com/cms-sw/cmssw/issues/37738#issuecomment-1114455507)
The workflow 10805.31 step 3 fails with
```
Starting python2 /data/cmsbld/jenkins/workspace/ib-run-relvals/cms-b…
-
There is an example at https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/qwenvl, but I have no idea how to use this model in Triton server. Can you provide an example of a visual language mod…
-
I think most of the dependencies that get installed with `pip install whisper-live` are only needed for the server, not the client. How can I use the client without installing all the server's package…
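For reference, the client-side usage itself is small; a sketch assuming the `TranscriptionClient` interface shown in the project's README, where the host, port, and audio path are hypothetical:
```python
from whisper_live.client import TranscriptionClient

# Connect to an already-running whisper-live server (hypothetical host/port).
client = TranscriptionClient("localhost", 9090)

# Transcribe a local audio file (hypothetical path).
client("tests/sample.wav")
```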
-
Hi @xhp-hust-2018-2011 ,
Thanks for the great work done on this repo. I'm trying to use your prebuilt Pytorch model with [NVIDIA's Triton Inference Server](https://docs.nvidia.com/deeplearning/sdk/…
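For anyone following along: Triton's PyTorch backend expects a TorchScript file in a versioned model repository. A minimal sketch of the export step, using a stand-in module rather than this repo's actual prebuilt model:
```python
import os
import torch

# Stand-in module; replace with the prebuilt PyTorch model from this repo.
model = torch.nn.Linear(4, 2).eval()

# Trace to TorchScript and save using Triton's expected layout:
# <model_repository>/<model_name>/<version>/model.pt
os.makedirs("model_repository/my_model/1", exist_ok=True)
traced = torch.jit.trace(model, torch.randn(1, 4))
traced.save("model_repository/my_model/1/model.pt")
```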