From Req Document
Req 1: Model Storage
Users must be able to deploy a model stored in (d.) AWS
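A minimal sketch of what this could look like with a KServe `InferenceService` pulling a model from S3; the bucket path and service account name are illustrative, and the exact fields depend on the KServe version shipped with RHODS:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-example            # hypothetical name
spec:
  predictor:
    serviceAccountName: s3-sa      # service account carrying the S3 credentials secret
    model:
      modelFormat:
        name: sklearn
      # Model artifacts are pulled from this bucket at startup
      storageUri: s3://example-bucket/models/sklearn/
```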
Req 2: Model Formats - Estimate: RHODS 1.36
Users must be able to serve models based on a variety of frameworks: (a.) OOTB support for TensorFlow, PyTorch, and scikit-learn models, and (d.) ability to serve models from Hugging Face without any additional conversions or configuration
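For (d.), recent upstream KServe releases ship a Hugging Face serving runtime that can pull a model straight from the Hub, which is roughly what "no additional conversions" implies. A hedged sketch; the model format name, flags, and model id are illustrative and version-dependent:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hf-example                       # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface                # available in recent KServe releases
      args:
        - --model_name=hf-example
        - --model_id=bert-base-uncased   # fetched from the Hub, no conversion step
```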
Req 7: Deployment Rollouts
a. Ability to deploy new model versions & route a percentage of traffic to the new version (canary rollout; see the sketch below)
b. Ability to do A/B testing on different model versions
c. Ability to test a deployed endpoint directly in the product UI
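For (a.), KServe's `canaryTrafficPercent` field is the usual mechanism: the InferenceService keeps the previously rolled-out revision serving most traffic while the new revision receives the stated percentage. A minimal sketch, with illustrative names and paths:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-example
spec:
  predictor:
    # Send 10% of traffic to the new (v2) revision, 90% to the prior one
    canaryTrafficPercent: 10
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/models/sklearn/v2/
```

Promoting the canary is then just raising `canaryTrafficPercent` to 100 (or removing the field).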
Req 10: OOTB Deployed model performance metrics
Users must be able to access performance metrics for all deployed models (e.) CPU/GPU/memory utilization
Req 14: Model Serving Runtimes
b. OOTB support for Caikit/TGIS
c. OOTB support for NVIDIA Triton Inference Server
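For (c.), selecting a runtime is a one-line change on the InferenceService; `kserve-tritonserver` is the name of the Triton `ServingRuntime` that ships with upstream KServe, though the runtime list in RHODS may differ:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: onnx-example                # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                  # Triton serves ONNX, TensorRT, TF, PyTorch, ...
      runtime: kserve-tritonserver  # explicit runtime selection instead of auto-matching
      storageUri: s3://example-bucket/models/onnx/
```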
Req 15: Remote Deployment
a. Support models being deployed to remote locations, e.g. clusters other than the one where model deployment is initiated
Req 17: Support options for KServe and/or ModelMesh - Estimate: RHODS 1.36
Support KServe (one model per pod) and/or ModelMesh (multiple models per pod). (a.) RHODS admins should be able to configure whether they want to use KServe (single-model serving plus additional functionality), ModelMesh, or both
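Assuming both stacks are enabled, the choice per model is typically made with the `serving.kserve.io/deploymentMode` annotation, since ModelMesh only picks up InferenceServices that carry it; a sketch with an illustrative name:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: mm-example                  # hypothetical name
  annotations:
    # Route this model to ModelMesh (multi-model pods);
    # omit the annotation to let KServe handle it as a single-model deployment
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/models/sklearn/
```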
Other planned features
Other planned enhancements
Other planned bug fixes
Resources
Model Serving Phase 2 Requirements doc
Model Serving Phase 2 Requirement Mapping spreadsheet