opendatahub-io / kserve

Standardized Serverless ML Inference Platform on Kubernetes
https://kserve.github.io/website/
Apache License 2.0

[DEV TRACKER] Model Serving Requirements for Q4 #92

Closed heyselbi closed 7 months ago

heyselbi commented 1 year ago

From Req Document

Req 1: Model Storage

Users must be able to deploy a model stored in (d.) AWS

Req 2: Model Formats - Estimate: RHODS 1.36

Users must be able to serve models based on a variety of frameworks: (a.) OOTB support for TensorFlow, PyTorch, and scikit-learn models, and (d.) users must be able to serve models from Hugging Face without any additional conversion or configuration

Req 7: Deployment Rollouts

a. Ability to deploy new model versions & route a % of traffic to the new version (canary rollout)

b. Ability to do A/B testing on different model versions

c. Ability to test a deployed endpoint directly in the product UI
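As a rough illustration of item (a.), upstream KServe exposes a `canaryTrafficPercent` field on the predictor of a v1beta1 `InferenceService`. The sketch below only builds such a manifest as a Python dict; it does not talk to a cluster. The model name, bucket path, and helper function are hypothetical, and whether this requirement is meant to be met with exactly this field is an assumption.

```python
import json


def canary_inference_service(name, storage_uri, model_format, canary_percent):
    """Build an InferenceService manifest that sends a share of traffic to the newest revision.

    Hypothetical helper for illustration; only the manifest field names come from
    the upstream KServe v1beta1 API.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                # Share of traffic routed to the latest revision; the rest
                # stays on the previously rolled-out revision.
                "canaryTrafficPercent": canary_percent,
                "model": {
                    "modelFormat": {"name": model_format},
                    "storageUri": storage_uri,
                },
            }
        },
    }


if __name__ == "__main__":
    # All values below are placeholders, not names from this tracker.
    manifest = canary_inference_service(
        name="example-model",
        storage_uri="s3://example-bucket/models/v2",
        model_format="sklearn",
        canary_percent=10,
    )
    print(json.dumps(manifest, indent=2))
```

Raising `canary_percent` step by step and finally to 100 would correspond to promoting the new version, assuming the serverless deployment mode is in use.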

Req 10: OOTB Deployed model performance metrics

Users must be able to access performance metrics for all deployed models (e.) CPU/GPU/memory utilization

Req 14: Model Serving Runtimes

b. OOTB support for Caikit/TGIS

c. OOTB support for NVIDIA Triton Inference Server

Req 15: Remote Deployment

(a.) Support deploying models to remote locations, i.e. locations other than the cluster where the model deployment is initiated

Req 17: Support options for KServe and/or ModelMesh - Estimate: RHODS 1.36

Support KServe (1 model per pod) or ModelMesh (multiple models per pod). (a.) RHODS admins should be able to configure whether they want to use KServe (single model serving + additional functionality), ModelMesh, or both
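For context, upstream KServe lets an individual `InferenceService` select its mode through the `serving.kserve.io/deploymentMode` annotation. The sketch below shows that per-service switch on a manifest dict; the helper name and the example manifest are hypothetical, and the admin-level configuration this requirement asks for in RHODS may well be implemented differently.

```python
import copy

# Annotation key and mode names as used by upstream KServe.
DEPLOYMENT_MODE_ANNOTATION = "serving.kserve.io/deploymentMode"
KNOWN_MODES = {"Serverless", "RawDeployment", "ModelMesh"}


def with_deployment_mode(manifest, mode):
    """Return a copy of an InferenceService manifest annotated with a deployment mode.

    Hypothetical helper for illustration; the input manifest is left unmodified.
    """
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    updated = copy.deepcopy(manifest)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[DEPLOYMENT_MODE_ANNOTATION] = mode
    return updated


if __name__ == "__main__":
    # Placeholder manifest; the name is not taken from this tracker.
    base = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": "example-model"},
    }
    print(with_deployment_mode(base, "ModelMesh")["metadata"]["annotations"])
```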

Other planned features

Other planned enhancements

Other planned bug fixes

Resources

Model Serving Phase 2 Requirements doc

Model Serving Phase 2 Requirement Mapping spreadsheet

israel-hdez commented 7 months ago

Closing, as we are now tracking work on Jira.

israel-hdez commented 7 months ago

/close

openshift-ci[bot] commented 7 months ago

@israel-hdez: Closing this issue.
