-
Looking at the TensorRT 9.1.0 release, I am very happy to see the integration of OpenAI Triton with TensorRT plugins.
However, one limitation of this integration is that Python must be availabl…
-
Thanks to the FauxPilot community, I am conveniently running inference tasks with CodeGen models. Thank you again.
Additionally, I wonder whether it is possible to run multiple models on a single GPU.
Bel…
-
Here is the development roadmap for 2024 Q3. Contributions and feedback are welcome.
## Server API
- [ ] Add APIs for using the inference engine in a single script without launching a separate se…
-
We would like to be able to deploy multiple versions of the same model. Unfortunately, they will not always have the same shapes and dtypes.
It would be great to have a per-version con…
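As a purely hypothetical illustration of what this request asks for (this is not an existing feature, and the file names are assumptions), a per-version configuration could sit inside each version directory of the model repository, overriding the shared defaults:

```
model_repository/
  my_model/
    config.pbtxt        # shared defaults (name, backend, instance groups)
    1/
      config.pbtxt      # hypothetical per-version override: v1 shapes/dtypes
      model.onnx
    2/
      config.pbtxt      # hypothetical per-version override: v2 shapes/dtypes
      model.onnx
```

The server would then resolve a version's effective config by layering the version-local file over the model-level one, so versions with different input shapes or dtypes could coexist under one model name.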
-
https://github.com/ollama/ollama
https://github.com/abetlen/llama-cpp-python
https://github.com/vllm-project/vllm
-
I am trying to experiment with prompts, and I am unable to tell whether the system is picking up my changed prompts.
1. I have overwritten the "prompt_file" for my experiment (found by checking out…
-
/kind bug
**What steps did you take and what happened:**
When I import `mlserver` and `kserve` at the same time, they may register proto descriptors under the same file name, which conflict with each other, like:…
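This kind of clash can be reproduced directly with protobuf's descriptor pool: when two libraries each ship generated code for a proto file with the same name but different contents, the second registration is rejected. A minimal sketch (the file name `inference.proto` and the packages are assumptions, not the actual descriptors the two libraries register):

```python
from google.protobuf import descriptor_pb2, descriptor_pool

# A fresh pool stands in for the process-wide default pool that
# generated _pb2 modules register into at import time.
pool = descriptor_pool.DescriptorPool()

# First library registers its file under this name.
first = descriptor_pb2.FileDescriptorProto(name="inference.proto", package="mlserver")
pool.Add(first)

# Second library tries to register a *different* file under the same name.
second = descriptor_pb2.FileDescriptorProto(name="inference.proto", package="kserve")
try:
    pool.Add(second)
    conflicted = False
except Exception:
    # The exact exception type differs between the pure-Python and
    # C++/upb protobuf implementations, but both reject the duplicate.
    conflicted = True

print(conflicted)
```

Because both `_pb2` modules register into the same default pool at import time, importing the second package fails even if the application never touches the conflicting messages.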
-
### The bug
The Immich backup feature that uploads photos to the remote server causes the System Data storage on my iPhone 15 Pro Max (iOS 17.6.1) to fill up completely. This causes th…
-
We have a streaming service that uses gRPC over Unix sockets.
gRPC performs significantly better over Unix sockets than over a TCP port. I saw that you can only change the port in the Triton server…
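The performance difference comes from Unix domain sockets bypassing the TCP stack entirely for local IPC. A minimal stdlib sketch of that local round trip (the socket path is an assumption; this illustrates the transport the issue asks Triton's gRPC endpoint to expose via a socket path instead of a port):

```python
import os
import socket
import tempfile
import threading

# A throwaway filesystem path serves as the socket address.
path = os.path.join(tempfile.mkdtemp(), "echo.sock")

# Server side: a Unix-domain stream socket that echoes one message.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)
server.listen(1)

def serve() -> None:
    conn, _ = server.accept()
    conn.sendall(conn.recv(1024))  # echo the payload back
    conn.close()

t = threading.Thread(target=serve)
t.start()

# Client side: connect by path, not by host:port.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.sendall(b"ping")
reply = client.recv(1024)
client.close()
t.join()
server.close()

print(reply)
```

gRPC clients can already address such sockets with the `unix://` target scheme; the request here is for the server to accept a socket path as its listening address.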
-
### Start Date
_No response_
### Implementation PR
_No response_
### Reference Issues
_No response_
### Summary
Serving fails starting from vllm 0.3.0.
### Basic Example
not supp…