inference-server Search Results

1000+ results for inference-server

PABannier/bark.cpp #195

Support fish.audio

Hi! I heard about a very promising model a while ago that you might be interested in. It's called fish.audio. Here's a YouTube demo: https://www.youtube.com/watch?v=Ghc8cJdQyKQ Here's the…

thiswillbeyourgithub updated 1 month ago
2
codeproject/CodeProject.AI-Server #129

CPAI 2.8.0 and Coral 2.4.0 — Changing the model and model si…

**Area of Concern** - [x] Server - [x] Behavior of one or more Modules: Coral - [ ] Installer - [ ] Runtime [e.g. Python3.7, .NET] - [ ] Module packages [e.g. PyTorch] - [ ] Something else **…

melyux updated 1 week ago
4
triton-inference-server/server #7472

Triton crashes with SIGSEGV (signal 11)

**Description** Triton receives SIGSEGV while handling traffic. The last thing it wrote out was `E0723 11:57:36.328641 1 infer_handler.h:187] "[INTERNAL] Attempting to access current response …

JindrichD updated 1 month ago
4
triton-inference-server/pytriton #89

How to use VLMs with pytriton and vllm

**Description** I want to use VLMs with pytriton and the vllm backend. Currently I am using the sample script given at https://github.com/triton-inference-server/pytriton/blob/main/examples/vllm/server.py …

sourabh-patil updated 21 hours ago
2
Sinaptik-AI/pandas-ai #1326

wrong result or no result when using llama3.1 / codellama

### System Info Apple M2, Sonoma 14.6 (23G80), Python 3.12.5, pandasai 2.2.14 ### 🐛 Describe the bug The getting started example (https://docs.pandas-ai.com/library#smartdataframe) produces a wrong…

tobias-schuele updated 4 days ago
1
NVIDIA/TensorRT-LLM #2412

Exporting Finetuned Llama models to TensorRT-LLM

I have fine-tuned a Llama2 model with LoRA for a QA task, and now for inference/streaming I would like to use Triton-llm, which requires the TensorRT model format. Is there any source code/resources that I ca…

DeekshithaDPrakash updated 10 hours ago
1
triton-inference-server/server #7495

How to fetch the s3 model repository path of a running trito…

I run the triton server using the following commands S3_REPO="s3://.../models/repository/" docker run --rm --net=host --gpus=all nvcr.io/nvidia/tritonserver:23.11-py3 tritonserver --model-reposito…

sathiyabalu89 updated 1 month ago
1
qdrant/fastembed #395

[Feature]: How can we deploy FastEmbed externally as an Infe…

### What feature would you like to request? I would like to deploy fastembed as an external service, similar to [infinity](https://github.com/michaelfeil/infinity). Can we do that? ### Is there any …

S1LV3RJ1NX updated 6 days ago
1
containers/podman-desktop-extension-ai-lab #982

Stop using pulling update for Inference server health

We are currently using a _pulling update_ mechanism to get the health check. https://github.com/containers/podman-desktop-extension-ai-lab/blob/529bc5bef181032081fb5a616c0de7afabd27c4e/packages/bac…

axel7083 updated 5 months ago
1
huggingface/text-embeddings-inference #431

Run TEI model on CPU fails (says Cuda f16 and flash attentio…

### System Info OS: Windows 11 Rust version: cargo 1.75.0 (1d8b05cdd 2023-11-20) Hardware: CPU AMD 6800HS (text-generation-launcher --env didn't work) ### Information - [ ] Docker - [X] The CL…

Astlaan updated 3 weeks ago
1
