-
The biggest question is how the heck to figure out that a player has high latency. We could measure latency to the lobby server, but that won't necessarily be an accurate prediction of in-game connecti…
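One option is to sample round-trip times to a server directly instead of trusting the lobby connection. A minimal sketch, timing TCP handshakes; the host and port here are placeholders, not a real endpoint:

```python
import socket
import statistics
import time

def measure_rtt(host, port, samples=5, timeout=2.0):
    """Estimate round-trip latency by timing TCP handshakes to a server."""
    rtts = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                rtts.append((time.perf_counter() - start) * 1000.0)  # ms
        except OSError:
            continue  # dropped sample; repeated failures also suggest a bad link
    return statistics.median(rtts) if rtts else None

# "game.example.com" and 7777 are placeholders for a real game server.
latency_ms = measure_rtt("game.example.com", 7777)
if latency_ms is None or latency_ms > 150:
    print("flag player as high latency")
```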
-
## Context
When the TorchServe process first starts, the metrics API endpoints return an empty response when queried. While this is a niche case (most likely TS would have served at least 1 prediction be…
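For anyone trying to reproduce this, a minimal sketch that polls the metrics endpoint right after startup; this assumes the default metrics address of `localhost:8082`, so adjust it if your config differs:

```python
import requests  # third-party HTTP client

# Default TorchServe metrics API address; override if metrics_address is customized.
METRICS_URL = "http://localhost:8082/metrics"

resp = requests.get(METRICS_URL, timeout=5)
print("status:", resp.status_code)
if not resp.text.strip():
    # Freshly started server: no inference has run yet, so the body is empty.
    print("metrics endpoint returned an empty response")
else:
    print(resp.text)
```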
-
The README.md mentions low latency multiple times. Was this ever tested to get an exact value, or was it just your best estimate?
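If it was never measured, something like the following quick-and-dirty benchmark would give a concrete number; the endpoint URL and payload are placeholders:

```python
import time
import requests

URL = "http://localhost:8080/predict"   # placeholder endpoint
PAYLOAD = {"input": [1.0, 2.0, 3.0]}    # placeholder request body

latencies = []
for _ in range(100):
    start = time.perf_counter()
    requests.post(URL, json=PAYLOAD, timeout=5)
    latencies.append((time.perf_counter() - start) * 1000.0)  # ms

latencies.sort()
print(f"p50={latencies[49]:.1f} ms  p99={latencies[98]:.1f} ms")
```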
-
One concern for PySpark model serving is real-time performance, i.e. latency.
Clipper provides a wrapper around the PySpark session, as mentioned in the documentation:
The model container creates a long…
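For reference, a minimal sketch of how a PySpark model gets deployed through that wrapper, based on my reading of the `clipper_admin` PySpark deployer; treat the exact signatures and the model path as assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegressionModel
from clipper_admin import ClipperConnection, DockerContainerManager
from clipper_admin.deployers.pyspark import deploy_pyspark_model

spark = SparkSession.builder.appName("clipper-serving").getOrCreate()
model = LogisticRegressionModel.load("hdfs:///models/lr")  # placeholder path

def predict(spark_session, model, inputs):
    # Runs inside the long-running container; inputs is a list of feature vectors.
    return [str(model.predict(x)) for x in inputs]

clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.connect()
deploy_pyspark_model(clipper_conn, name="spark-lr", version=1,
                     input_type="doubles", func=predict,
                     pyspark_model=model, sc=spark.sparkContext)
```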
-
### 🐛 Describe the bug
The results obtained by running in different environments vary greatly:
- run on a MacBook Pro 2019 (bare metal)
- run on the same machine, but inside Docker 20.10.17
running command …
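To rule out measurement noise, it may help to run an identical timing harness in both environments; a minimal sketch, where the workload is a placeholder for the actual command from the report:

```python
import statistics
import time

def workload():
    # Placeholder: substitute the real computation being compared.
    return sum(i * i for i in range(1_000_000))

timings = []
for _ in range(10):
    start = time.perf_counter()
    workload()
    timings.append(time.perf_counter() - start)

print(f"mean={statistics.mean(timings):.4f}s  stdev={statistics.stdev(timings):.4f}s")
```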
-
This issue tracks all the items needed to get a good tracking system. A well-tuned, intelligent control system will give our product the *wow* factor.
- ~~Geometry #6~~
- ~~Prediction~~
- ~~Rew…
-
To reproduce this bug, try the steps below.
### Step 1: Start the hello-world sample (cache_size=0, slo_micros=100)
```python
from clipper_admin import ClipperConnection, DockerContainerManager
from cli…
```
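The truncated snippet above is presumably the standard Clipper quickstart. For context, a fuller sketch of Step 1 under the stated settings; whether `cache_size` is passed to `start_clipper` this way is an assumption on my part:

```python
from clipper_admin import ClipperConnection, DockerContainerManager
from clipper_admin.deployers import python as python_deployer

clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.start_clipper(cache_size=0)  # assumption: cache disabled via this kwarg

# slo_micros=100 sets a 100-microsecond latency objective, as in the report.
clipper_conn.register_application(name="hello-world", input_type="doubles",
                                  default_output="-1.0", slo_micros=100)

# Deploy a trivial closure and link it to the application.
python_deployer.deploy_python_closure(
    clipper_conn, name="sum-model", version=1, input_type="doubles",
    func=lambda xs: [str(sum(x)) for x in xs])
clipper_conn.link_model_to_app(app_name="hello-world", model_name="sum-model")
```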
-
I am trying to use the **Catboost Java API** but am facing high-latency issues at large scale. I currently run a high-scale **multi-threaded** system with around 300+ worker threads that query the catboost…
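The question concerns the Java API, but the usual mitigation is easiest to sketch with the Python package: amortize per-call overhead by predicting on a batch instead of one row per worker thread. The model and data below are toy placeholders:

```python
import numpy as np
from catboost import CatBoostRegressor

# Toy model standing in for the production one.
X = np.random.rand(1000, 10)
y = X.sum(axis=1)
model = CatBoostRegressor(iterations=10, verbose=False).fit(X, y)

# Instead of 300+ threads each calling predict on a single row,
# collect rows and score them in one call to amortize overhead.
single_rows = [np.random.rand(10) for _ in range(300)]
batch = np.stack(single_rows)
preds = model.predict(batch)  # one batched call instead of 300 tiny ones
print(preds.shape)
```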
-
When deploying ML models, it is common to queue up incoming requests and run them on the GPU all at once to increase throughput.
This is distinct from users being able to run several predictions in one go wit…
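A minimal sketch of that server-side batching pattern: a background thread drains a queue, fills a batch within a small time window, and runs the model once per batch. The model call is a placeholder:

```python
import queue
import threading
import time

request_q = queue.Queue()
MAX_BATCH = 32
MAX_WAIT_S = 0.005  # wait at most 5 ms to fill a batch

def model_forward(batch):
    # Placeholder for the real GPU call, e.g. model(torch.stack(batch)).
    return [x * 2 for x in batch]

def batcher():
    while True:
        x, fut = request_q.get()         # block until at least one request
        inputs, futures = [x], [fut]
        deadline = time.monotonic() + MAX_WAIT_S
        while len(inputs) < MAX_BATCH:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                x, fut = request_q.get(timeout=remaining)
            except queue.Empty:
                break
            inputs.append(x)
            futures.append(fut)
        for fut, out in zip(futures, model_forward(inputs)):
            fut.put(out)                 # hand each caller its own result

threading.Thread(target=batcher, daemon=True).start()

def predict(x):
    """Called by many client threads; each blocks until its result is ready."""
    fut = queue.Queue(maxsize=1)
    request_q.put((x, fut))
    return fut.get()

print([predict(i) for i in range(4)])
```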
-
# Issue
Kubeflow should provide some guidance on serving the following two types of model predictions online:
1. **Precomputed Predictions**
2. **Cached Predictions**
(1) requires retrieving…
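To make the distinction concrete, a minimal sketch of both patterns, with an in-memory dict standing in for whatever key-value store the guidance would recommend; all names here are placeholders:

```python
import functools

# (1) Precomputed predictions: a batch job wrote results keyed by entity id;
# serving is a pure lookup, with no model in the serving path.
PRECOMPUTED = {"user-42": 0.87, "user-43": 0.12}  # stand-in for a KV store

def serve_precomputed(entity_id):
    return PRECOMPUTED.get(entity_id)  # miss => fall back to a default or the model

# (2) Cached predictions: the model runs online, but repeated inputs
# are answered from a cache instead of being recomputed.
def run_model(features):
    return sum(features) / len(features)  # placeholder for real inference

@functools.lru_cache(maxsize=10_000)
def serve_cached(features):
    # features must be hashable (e.g. a tuple) for lru_cache to apply.
    return run_model(features)

print(serve_precomputed("user-42"))
print(serve_cached((1.0, 2.0, 3.0)))
```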