-
### 🐛 Describe the bug
This documented case is not working: https://kserve.github.io/website/0.11/modelserving/v1beta1/torchserve/#deploy-pytorch-model-with-v2-rest-protocol.
The isvc object is ready when usin…
-
Hi there,
I'm trying to deploy an endpoint that has bursts of high load. I'd like the endpoint to batch requests so we can increase throughput under high load at the cost of a slight increase in l…
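TorchServe supports dynamic batching per model, configured in `config.properties`. A minimal sketch of such an entry follows; the model name `my_model` and the specific values here are illustrative assumptions, not taken from this issue:

```properties
# Illustrative config.properties fragment enabling dynamic batching.
# "batchSize" caps how many queued requests are grouped into one inference call;
# "maxBatchDelay" (ms) bounds how long a partial batch waits before being dispatched.
load_models=my_model.mar
models={\
  "my_model": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "my_model.mar",\
        "batchSize": 8,\
        "maxBatchDelay": 50,\
        "responseTimeout": 120\
    }\
  }\
}
```

With batching enabled, the handler receives a list of requests per call and must return exactly one response per request, in order.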
-
Provide Pros, Cons, and final recommendation(s)
https://docs.bentoml.org/en/latest/
-
## 🚀 The feature
- Reduce TorchServe CPU Image size by 25% using slim as the base image
- Refactor the TorchServe Dockerfile to support slim-based CPU & GPU Docker images and set up Docker CI GitHub ac…
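A minimal sketch of what a slim-based CPU image could look like (base image tag, package versions, and the choice of JRE are assumptions, not the proposed official Dockerfile):

```dockerfile
# Hypothetical slim-based CPU image for TorchServe; illustrative only.
FROM python:3.10-slim

# TorchServe's frontend runs on the JVM, so a headless JRE is required at runtime.
RUN apt-get update \
    && apt-get install -y --no-install-recommends openjdk-17-jre-headless \
    && rm -rf /var/lib/apt/lists/*

# CPU-only torch wheels keep the image considerably smaller than the default CUDA wheels.
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu \
    && pip install --no-cache-dir torchserve torch-model-archiver

EXPOSE 8080 8081
CMD ["torchserve", "--start", "--model-store", "/home/model-server/model-store", "--foreground"]
```

Most of the size reduction comes from the slim base and the CPU-only wheel index; pinning exact versions would be needed for reproducible CI builds.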
-
I have the following TorchServe handler and Dockerfile, but I'm getting a "prediction failed" error:
```python
from ts.torch_handler.base_handler import BaseHandler
from transformers import AutoModelWithLMHead, Auto…
```
-
We have two ONNX models deployed on a GPU machine built on top of the nightly Docker image.
- The first model runs with 0 failure at 500 QPS (p99 latency < 8ms) during a 2-hour perf test.
- The seco…
-
Currently, the startup code repackages the model contents in `environment.model_dir` into TorchServe (TS) format using the TS model archiver: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/mas…
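For reference, the repackaging step corresponds roughly to an archiver invocation like the one below; the model name, file names, and paths are placeholders, not values taken from the toolkit:

```
# Illustrative torch-model-archiver invocation; names and paths are hypothetical.
torch-model-archiver \
  --model-name my_model \
  --version 1.0 \
  --serialized-file model.pth \
  --handler handler.py \
  --export-path /opt/ml/model \
  --force
```

The resulting `.mar` archive is what TorchServe loads from its model store at startup.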
-
### 🐛 Describe the bug
The CPU launcher doesn't work.
### Error logs
```
2024-03-28T08:00:31,792 [DEBUG] W-9000-embeddings_1.1 org.pytorch.serve.wlm.WorkerLifeCycle - launcherAvailable cmdline: […
```
-
### 🚀 The feature
Integrate https://github.com/libffcv/ffcv for accelerated image decoding, preprocessing, and loading.
### Motivation, pitch
I maintain [torchserve](https://github.com/pytorch/…
-
**Is your enhancement request related to a problem? Please describe.**
Over the last six months, we have been building towards packaging the MONAI model in different flavors. The focus of these effor…