-
### The bug
These are the original image files.
![thumbnail (1)](https://github.com/user-attachments/assets/d0d332c4-d6d6-4f4c-980c-4506d10e4613)
![thumbnail](https://github.com/user-attachments/as…
-
I found that you added a new feature (publish to inference server) in DIGITS 6.1; however, I don't know how to use it.
-
Are there docs on best practices for using vLLM-hosted models?
I start a model server using
```
python -m vllm.entrypoints.openai.api_server --model model_path
```
and try running the evaluation as
```
lm_eval --model lo…
```
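For reference, a minimal sketch of querying a server started this way with the `openai` Python client; it assumes the default port 8000 and that the served model name matches the `--model` value (both are assumptions, not taken from the post):
```python
# Minimal sketch (assumptions: server on the default port 8000, model name equals the
# --model value passed to vllm.entrypoints.openai.api_server). vLLM ignores the API key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="model_path",  # must match the --model value used when starting the server
    prompt="The capital of France is",
    max_tokens=16,
)
print(completion.choices[0].text)
```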
-
Congrats on a great project! I started playing with it and have two questions so far:
1) Does it support different sessions and several users?
2) Does it support simultaneous requests for inference?…
-
Type: Performance Issue
I wanted to compare two large files, but it didn't work (they might have been too large/complex?). I also tried switching to the non-advanced diff algorithm (`diffEditor.diffAlgorithm` setting…
-
I tested `tritonclient:2.43.0` on Ubuntu:22.04 with `grpcio:1.62.1` and was confronted with a memory leak. Example for reproduction:
```python
import asyncio
from tritonclient.grpc.aio import Inferen…
```
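The reproduction snippet above is cut off; purely as an illustrative sketch of the async gRPC client pattern (not the author's code), with a made-up model name and tensor names:
```python
# Illustrative sketch only (placeholder model "my_model", input "INPUT0", output "OUTPUT0",
# Triton on the default gRPC port 8001); repeated requests such as this loop are one way
# to observe memory growth in the client process.
import asyncio

import numpy as np
from tritonclient.grpc import InferInput, InferRequestedOutput
from tritonclient.grpc.aio import InferenceServerClient


async def main() -> None:
    client = InferenceServerClient(url="localhost:8001")
    try:
        data = np.random.rand(1, 4).astype(np.float32)
        for _ in range(10_000):
            inp = InferInput("INPUT0", list(data.shape), "FP32")
            inp.set_data_from_numpy(data)
            result = await client.infer(
                model_name="my_model",
                inputs=[inp],
                outputs=[InferRequestedOutput("OUTPUT0")],
            )
            result.as_numpy("OUTPUT0")
    finally:
        await client.close()


asyncio.run(main())
```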
-
#### Description
I am currently working on deploying the Seamless M4T model for text-to-text translation on a Triton server. I have successfully exported the `text.encoder` to ONNX and traced it …
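The description is truncated; purely as a generic sketch of exporting an encoder sub-module with `torch.onnx.export` (the module, tensor names, and shapes below are placeholders, not the author's export script):
```python
# Generic sketch only: a stand-in encoder module exported to ONNX with dynamic axes.
import torch
import torch.nn as nn


class DummyTextEncoder(nn.Module):
    """Stand-in for the real text encoder sub-module."""

    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(1000, 32)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(input_ids)


encoder = DummyTextEncoder().eval()
dummy_ids = torch.ones(1, 16, dtype=torch.long)

# Trace the module and write an ONNX graph with dynamic batch/sequence axes.
torch.onnx.export(
    encoder,
    (dummy_ids,),
    "text_encoder.onnx",
    input_names=["input_ids"],
    output_names=["encoder_out"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"}},
    opset_version=17,
)
```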
-
I downloaded the Llama 3.2 1B model from Hugging Face with optimum-cli:
```
optimum-cli export openvino --model meta-llama/Llama-3.2-1B-Instruct llama3.2-1b/1
```
Below are the downloaded files:
!…
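For context, a minimal sketch (not from the original post) of loading such an export with optimum-intel and running a quick generation, assuming the export directory `llama3.2-1b/1` from the command above and that the tokenizer files were exported alongside the model:
```python
# Sketch under assumptions: export directory from the optimum-cli command above,
# tokenizer saved alongside the OpenVINO model.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_dir = "llama3.2-1b/1"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = OVModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```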
-
Hello and thanks for a great project!
I wondered whether there's any interest in supporting type inference at the decorator level rather than by changing the return type of methods.
Right now I…
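The request is cut off above; purely as an illustration of what decorator-level type inference looks like in standard Python typing (all names below are made up, not from the project in question), a decorator can propagate the wrapped function's signature with `ParamSpec`/`TypeVar`:
```python
# Illustration with made-up names: the decorator's types are inferred from the wrapped
# function's signature instead of being redeclared on each method (Python 3.10+).
import functools
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")


def traced(func: Callable[P, R]) -> Callable[P, R]:
    @functools.wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)

    return wrapper


@traced
def add(a: int, b: int) -> int:
    return a + b


result: int = add(1, 2)  # type checkers see the same (int, int) -> int signature
```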
-
- [ ] [optillm/README.md at main · codelion/optillm](https://github.com/codelion/optillm/blob/main/README.md?plain=1)
# optillm
optillm is an OpenAI API compatible optimizing inference proxy whi…
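The README excerpt is truncated; as a hedged sketch of the usual pattern for an OpenAI API-compatible proxy (the port and model name below are placeholders, not taken from the README), an existing OpenAI client is pointed at the proxy by swapping its base URL:
```python
# Sketch under assumptions: proxy listening on localhost:8000, placeholder model name.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="your-api-key")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; whatever model the proxy forwards to
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(response.choices[0].message.content)
```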