Open · jaystary opened 2 years ago
I think the biggest reservation people have with Seldon is the steep onboarding curve, which this would alleviate. While there are plenty of other inference-server offerings out there, Seldon is the only one I've seen that offers multi-armed bandits out of the box: https://docs.seldon.io/projects/seldon-core/en/latest/analytics/routers.html
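For context on what that buys you: once a SeldonDeployment with one of these routers is running, the bandit is driven through the ordinary prediction endpoint plus a feedback endpoint that carries the reward. Below is a minimal sketch using Seldon's documented v1 REST paths; the host, the deployment name `mab`, the namespace `seldon`, and the reward value are all illustrative assumptions, not something from this issue:

```python
# Sketch only: assumes a SeldonDeployment named "mab" with an epsilon-greedy
# router already running in namespace "seldon", reachable via a local ingress.
import requests

HOST = "http://localhost:8003"  # hypothetical ingress address
BASE = f"{HOST}/seldon/seldon/mab/api/v1.0"

# 1. The router picks one of its child models and returns that prediction.
pred = requests.post(
    f"{BASE}/predictions",
    json={"data": {"ndarray": [[1.0, 2.0, 3.0, 4.0]]}},
).json()
print(pred)

# 2. Sending a reward back lets the multi-armed bandit update which branch
#    it favours -- this feedback loop is what sets Seldon's routers apart.
requests.post(
    f"{BASE}/feedback",
    json={"response": pred, "reward": 1.0},
)
```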
Use Case
Onboarding Seldon Core as a possible inference server makes sense alongside MLflow, Kubeflow, and Grafana/Prometheus. It can either run standalone or in combination with MLflow / Kubeflow, and it can utilize all common serving backends (Triton probably being the most interesting).
It should be coupled with Prometheus / Grafana, and could potentially even scale serving based on those metrics; a rough sketch of that idea follows below.
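To make the metrics-based scaling idea concrete, here is a sketch that polls Prometheus for a deployment's request rate and derives a replica count. The Prometheus address, the Seldon executor metric name, and the per-replica capacity are all assumptions that may differ per version; a production setup would more likely wire this into a HorizontalPodAutoscaler via a custom-metrics adapter rather than a script:

```python
# Sketch of metrics-driven scaling: query Prometheus for a Seldon
# deployment's request rate and decide how many replicas to run.
# Metric name and addresses are illustrative, not verified against this setup.
import requests

PROM_URL = "http://localhost:9090"  # hypothetical Prometheus address
QUERY = (
    'sum(rate(seldon_api_executor_server_requests_seconds_count'
    '{deployment_name="mab"}[1m]))'
)

resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": QUERY}).json()
results = resp["data"]["result"]
rps = float(results[0]["value"][1]) if results else 0.0

TARGET_RPS_PER_REPLICA = 50.0  # assumed capacity of a single replica
replicas = max(1, round(rps / TARGET_RPS_PER_REPLICA))
print(f"current load: {rps:.1f} req/s -> scale to {replicas} replica(s)")
```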
Ideas of Implementation
Configured correctly, this combination would allow best-in-class inference performance for almost any use case out there (see https://towardsdatascience.com/hugging-face-transformer-inference-under-1-millisecond-latency-e1be0057a51c or https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/).
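For reference, talking to a Triton backend behind Seldon would go over the Open Inference (KServe V2) protocol that both projects support. A minimal sketch, assuming a model called `my_model` that takes a single FP32 tensor; the endpoint URL, tensor name, and shape are placeholders:

```python
# Sketch of a V2-protocol inference call against a Triton-served model.
# Model name, input name, and shape are assumptions for illustration.
import requests

URL = "http://localhost:8000/v2/models/my_model/infer"  # hypothetical endpoint

payload = {
    "inputs": [
        {
            "name": "input__0",      # illustrative input tensor name
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [[1.0, 2.0, 3.0, 4.0]],
        }
    ]
}

resp = requests.post(URL, json=payload)
resp.raise_for_status()
print(resp.json()["outputs"])
```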
Additional Info
Message from the maintainers:
Excited about this feature? Give it a :thumbsup:. We factor engagement into prioritization.