-
## Description of Request
- Update the documentation and examples for running `exo` on Linux nodes
## Reason or Need for Feature
- Linux is the dominant choice for running workloads on se…
-
1. The initial model training was done following the project below:
https://aistudio.baidu.com/projectdetail/3429765?channelType=0&channel=0
-
DALI is pretty useful for postprocessing when using an ensemble model in Triton Inference Server. Will the commonly used operations get implemented in the future?
-
While running inference tasks in the `samapi` environment, I encountered a `CUDA out of memory` error, causing the application to fall back to CPU inference. This issue significantly impacts performanc…
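The fallback behavior described above can be reproduced with a minimal sketch (assuming PyTorch; the model object and call signature are placeholders for whatever `samapi` actually runs):

```python
import torch

def run_inference(model, inputs, device="cuda"):
    """Try GPU inference; fall back to CPU on a CUDA out-of-memory error."""
    try:
        model = model.to(device)
        with torch.no_grad():
            return model(inputs.to(device))
    except torch.cuda.OutOfMemoryError:
        # Release cached GPU memory, then retry on CPU (much slower).
        torch.cuda.empty_cache()
        model = model.to("cpu")
        with torch.no_grad():
            return model(inputs.to("cpu"))
```

A silent fallback like this keeps the service alive but hides the slowdown, which is why the OOM is worth fixing (smaller batch size, lower-precision weights, or freeing other GPU tenants) rather than relying on the CPU path.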
-
### Summary
Is it possible to add a Number of Threads option, the same as the parameter in llama.cpp?
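For reference, llama.cpp exposes the thread count via `-t` / `--threads` (a usage sketch; the model path is a placeholder, and the binary is named `main` in older builds):

```shell
# llama.cpp: set the number of CPU threads used for generation
./llama-cli -m ./models/model.gguf -p "Hello" -t 8
```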
### Appendix
_No response_
-
### System Info
I have searched the repo here and the main server repo but don't see any information on either a) support for Safetensors (many models are saved that way on HF) or b) whether th…
-
When I run models_server.py on AWS, I get `OSError: [Errno 99] Cannot assign requested address`.
How can I deploy the service on a cloud server? I have already downloaded all the models there.
And if I set config…
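Errno 99 usually means the server is binding to an IP that is not assigned to any local interface — on EC2, the public IP is NAT-mapped and never appears on the instance itself. Binding to `0.0.0.0` typically resolves it; a minimal sketch (the addresses are illustrative):

```python
import socket

# Binding to the instance's *public* IP fails on EC2 because that
# address is NAT-mapped, not assigned to a local interface:
#     sock.bind(("203.0.113.10", 8000))
#     OSError: [Errno 99] Cannot assign requested address
#
# Bind to 0.0.0.0 (all interfaces) instead, then open the port in the
# security group and reach the service via the public IP from outside.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("0.0.0.0", 0))  # port 0 lets the OS pick; use your real port in practice
host, port = sock.getsockname()
sock.close()
```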
-
Hi Kevin, while trying to reproduce the Prompt Alignment Experiment, I first downloaded the llava_server codebase and used the weights from "liuhaotian/llava-v1.5-7b". When I run
gunicorn "app…
-
Hello!
I found a non-urgent issue in the API that makes the UX much worse when working with models from the web or with remote servers, because we can't see the current state of ollama: is it downloading a mod…
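Part of this state is already observable: Ollama's `/api/pull` endpoint streams progress as one JSON object per line, with `status` plus `completed`/`total` byte counts during downloads. A sketch of polling it from a client (the host and model name are placeholders):

```python
import json
import urllib.request

def format_event(event: dict) -> str:
    """Render one /api/pull progress event as a human-readable line."""
    if "total" in event and "completed" in event:
        pct = 100 * event["completed"] / event["total"]
        return f"{event['status']}: {pct:.1f}%"
    return event.get("status", "")

def watch_pull(model: str, host: str = "http://localhost:11434"):
    """Stream Ollama's /api/pull responses and print download progress."""
    req = urllib.request.Request(
        f"{host}/api/pull",
        data=json.dumps({"name": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # the endpoint streams one JSON object per line
            print(format_event(json.loads(line)))
```

What's missing, as the issue says, is a way to query this state for a pull started elsewhere (or by another client), rather than only observing the stream of a request you initiated yourself.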
-
### System Info
```
node -v
v22.3.0
```
```
git show -s
commit 7f5081da29c3f77ee830269ab801344776e61bcb (HEAD -> main, origin/main, origin/HEAD)
Author: Joshua Lochner
Date: Tue Jul 2 …