model-serving Search Results

1000+ results for model-serving

OpenNMT/nmt-wizard-docker #108

Support for serving multiple models with a single instance

This would provide the ability to serve multiple models, and multiple versions of each model, with a single serving instance. Details can be seen here: [https://www.tensorflow.org/tfx/serving/servi…

alexkillen updated 4 years ago
3
ucbrise/clipper #499

Ask about the performance issue of pyspark model serving

One concern for the pyspark model serving is the real time performance or latency. Clipper provides a wrapper of pyspark session, as mentioned in the document: The model container creates a long…

parkerzf updated 5 years ago
3
databricks/terraform-provider-databricks #4029

[FEATURE] Support for ai_gateway settings of external servin…

Official documentation: https://docs.databricks.com/api/azure/workspace/servingendpoints/putaigateway ### Use-cases Databricks now provide additional controls to be applied on external serving…

VOVELEE updated 2 months ago
1
allegroai/clearml-serving #17

ClearML serving design v2

### ClearML serving design document v2.0 **Goal: Create a simple interface to serve multiple models with scalable serving engines on top of Kubernetes** Design Diagram (edit [here](https://excalid…

bmartinn updated 2 years ago
8
tensorflow/serving #1948

tensorflow-serving docker container doesn't work on Macs wit…

## Bug Report tensorflow-serving docker container doesn't work on Macs with Apple M1 chips. Do maintainers of tensorflow-serving intend to solve this? Or do they see this as a problem somewhere u…

kuba-lilz updated 9 months ago
18
vllm-project/vllm #4936

text_generation_router::infer: router/src/infer.rs:130: no p…

### Your current environment ```text python benchmark_serving.py --backend tgi --model /model/Mixtral_email_sft --dataset /usr/src/dataset/ShareGPT_V3_unfiltered_cleaned_split.json --port 8080 --num…

Ling-CF updated 2 months ago
1
kserve/kserve #3638

Autoscaling with multiple metrics does not work

/kind bug **What steps did you take and what happened:** Tried the following 1. Tried creating memory based autoscaling using knative annotations as below. A CPU based HPA was created instead w…

shazinahmed updated 5 months ago
3
vllm-project/vllm #6768

[Usage]: How to inference a model with medusa speculative sa…

### Your current environment Collecting environment information... PyTorch version: 2.3.1+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubu…

deepindeed2022 updated 1 month ago
2
vllm-project/vllm #6531

[Bug]: inter-token latency is lower than TPOT in serving ben…

### Your current environment v0.5.2. vLLM env is not an issue so I will just skip the collection process ### 🐛 Describe the bug I am running benchmark tests and notice one potential problem. …

Jeffwan updated 1 month ago
13
tensorflow/serving #1959

Export `:tensorflow:serving:...` metrics by signature names

## Feature Request If this is a feature request, please fill out the following form in full: ### Describe the problem the feature is intended to solve For now, tensorflow serving exports metric…

jeongukjae updated 1 year ago
4
