-
Hi,
Could you please provide a guide on integrating the DeepSpeed approach of using multiple Intel Flex 140 GPUs to run model inference with a FastAPI and uvicorn setup?
model id: 'meta-llama/Llama-2-7…
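Not an authoritative answer, but a setup like this usually takes roughly the shape sketched below: DeepSpeed's `init_inference` shards the model across the launcher's ranks, and FastAPI exposes a generation endpoint served by uvicorn. The model path, `mp_size=2`, and the `/generate` route are placeholders, the snippet is written against the CUDA-style DeepSpeed API (Intel Flex 140 / XPU support depends on your DeepSpeed and intel_extension_for_pytorch build), and coordinating requests across tensor-parallel ranks behind a single HTTP server needs extra plumbing that is omitted here.

```python
# Minimal sketch: DeepSpeed tensor-parallel inference behind FastAPI.
# Assumes a CUDA-style DeepSpeed build; for Intel Flex 140 the device/backend
# details (intel_extension_for_pytorch, "xpu" devices) may differ.
import torch
import deepspeed
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/your-llama-2-model"  # placeholder; use the model id from the question

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

# Shard the model across devices; mp_size should match the number of GPUs
# you launch with (e.g. `deepspeed --num_gpus 2 app.py`).
engine = deepspeed.init_inference(
    model,
    mp_size=2,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(req: Prompt):
    # Note: in a real tensor-parallel deployment only rank 0 should own the
    # HTTP server and prompts must be broadcast to the other ranks.
    inputs = tokenizer(req.text, return_tensors="pt").to(engine.module.device)
    with torch.no_grad():
        output = engine.module.generate(**inputs, max_new_tokens=req.max_new_tokens)
    return {"completion": tokenizer.decode(output[0], skip_special_tokens=True)}
```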
-
![image](https://github.com/user-attachments/assets/4fa65867-6cbf-489f-9b12-0ba881b1347e)
I have two model folders for Llama 3, one with the original weights and another with the fine-tuned weights; how do I configure it to use the…
-
# A Simple Deep Learning Model Serving Setup - SOCAR Tech Blog
https://tech.socarcorp.kr/data/2020/03/10/ml-model-serving.html
-
### 🚀 The feature
Inference requests are stored in a prioritized data structure. The priority of a request can be set via a custom header value. The priority values are categorical (e.g. `LOW`, `HIGH…
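To make the proposal concrete, here is a small sketch of the described data structure under stated assumptions: the header name `X-Request-Priority` and the `NORMAL` level are my own placeholders, since the request only mentions categorical values such as `LOW` and `HIGH`.

```python
# Sketch of the proposed behaviour: requests carry a categorical priority
# (taken from a custom header) and are drained from a priority queue before
# being dispatched to the inference backend.
import heapq
import itertools
import threading

PRIORITY_RANK = {"HIGH": 0, "NORMAL": 1, "LOW": 2}

class PrioritizedRequestQueue:
    """Min-heap keyed by (priority rank, arrival order): HIGH requests are
    served first, and ties fall back to FIFO order within a priority level."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()
        self._not_empty = threading.Condition(threading.Lock())

    def put(self, request, priority_header: str = "NORMAL"):
        rank = PRIORITY_RANK.get(priority_header.upper(), PRIORITY_RANK["NORMAL"])
        with self._not_empty:
            heapq.heappush(self._heap, (rank, next(self._counter), request))
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._heap:
                self._not_empty.wait()
            _, _, request = heapq.heappop(self._heap)
            return request
```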
-
Hello. Thank you for providing vLLM as a great open-source tool for inference and model serving! I was able to build vLLM on a cluster I maintain, but it only appears to work on a single MI210 GPU. Can so…
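For reference, multi-GPU inference in vLLM is normally requested through tensor parallelism. A minimal sketch, assuming the model id and GPU count are placeholders and that whether this works on a multi-MI210 ROCm cluster depends on how vLLM was built there:

```python
# Shard one model across several GPUs via vLLM tensor parallelism.
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-6.7b",   # placeholder model id
    tensor_parallel_size=4,      # number of GPUs to shard the model across
)

params = SamplingParams(temperature=0.8, max_tokens=64)
for output in llm.generate(["Hello, my name is"], params):
    print(output.outputs[0].text)
```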
-
### Your current environment
version 0.5.0
### 🐛 Describe the bug
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.…
```
-
**Description**
I ran a benchmark of Meta-Llama-3-8B-Instruct on 8× RTX 4090 GPUs.
![image](https://github.com/triton-inference-server/server/assets/68674291/1a0fd341-8d8f-4893-973c-ed1ed3b74aca)
when r…
-
Hi, I successfully converted a Keras model to a serving_model using this repository, many thanks to @bendangnuksung. Now I am preparing the client API side. Here is the image-loading part of the API:
`if l…
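Since the snippet above is truncated, here is a hedged sketch of what the client side of a TensorFlow Serving setup typically looks like: load the image, wrap it in the REST `predict` payload, and POST it. The model name `mask_rcnn`, port 8501, and RGB float input are assumptions, not details taken from the question.

```python
# Minimal TensorFlow Serving REST client: one image per request.
import json
import numpy as np
import requests
from PIL import Image

def predict(image_path: str,
            url: str = "http://localhost:8501/v1/models/mask_rcnn:predict"):
    image = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32)
    payload = {"instances": [image.tolist()]}   # TF Serving "instances" format
    response = requests.post(url, data=json.dumps(payload))
    response.raise_for_status()
    return response.json()["predictions"]
```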
-
## Issue:
### Q1:
The following code raises an error when executed:
```bash
export SERVING_BIN=/usr/local/serving_bin/serving
python -m paddle_serving_server.serve \
--model ./serving_server \
--thread 8 --port 10010 \
--gpu_ids 0 …
```