-
This would provide the ability to serve multiple models, and multiple versions of each model, with a single serving instance.
Details can be seen here: [https://www.tensorflow.org/tfx/serving/servi…
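
As a rough client-side illustration of addressing several models and versions on one serving instance, the sketch below builds TensorFlow Serving REST predict URLs. The helper name `predict_url`, the host, and the model names are hypothetical; port 8501 is assumed to be the REST port, as in the default Docker setup.

```python
from typing import Optional


def predict_url(model: str, version: Optional[int] = None,
                host: str = "localhost", port: int = 8501) -> str:
    """Build a TensorFlow Serving REST predict URL (illustrative helper).

    Without a version, the server picks the version to serve;
    with one, the request is pinned to that specific version.
    """
    base = f"http://{host}:{port}/v1/models/{model}"
    if version is not None:
        base += f"/versions/{version}"
    return base + ":predict"


# Two hypothetical models, one of them pinned to two versions,
# all addressed on the same serving instance.
urls = [
    predict_url("model_a"),
    predict_url("model_b", version=1),
    predict_url("model_b", version=2),
]
```

Each URL would receive a POST request with a JSON body; the body format is not shown here.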
-
One concern with PySpark model serving is real-time performance, that is, latency.
Clipper provides a wrapper around the PySpark session, as mentioned in the document:
The model container creates a long…
-
Official documentation: https://docs.databricks.com/api/azure/workspace/servingendpoints/putaigateway
### Use-cases
Databricks now provides additional controls that can be applied to external serving…
-
### ClearML serving design document v2.0
**Goal: Create a simple interface to serve multiple models with scalable serving engines on top of Kubernetes**
Design Diagram (edit [here](https://excalid…
-
## Bug Report
The tensorflow-serving Docker container doesn't work on Macs with Apple M1 chips.
Do maintainers of tensorflow-serving intend to solve this?
Or do they see this as a problem somewhere u…
-
### Your current environment
```text
python benchmark_serving.py --backend tgi --model /model/Mixtral_email_sft --dataset /usr/src/dataset/ShareGPT_V3_unfiltered_cleaned_split.json --port 8080 --num…
```
-
/kind bug
**What steps did you take and what happened:**
Tried the following
1. Tried creating memory-based autoscaling using Knative annotations as below. A CPU-based HPA was created instead w…
-
### Your current environment
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubu…
-
### Your current environment
v0.5.2. vLLM env is not an issue so I will just skip the collection process
### 🐛 Describe the bug
I am running benchmark tests and noticed one potential problem. …
-
## Feature Request
If this is a feature request, please fill out the following form in full:
### Describe the problem the feature is intended to solve
For now, TensorFlow Serving exports metric…