Open karlschriek opened 4 years ago
/area engprod /priority p2
Hi @karlschriek, when you mention "model management", are you referring to being able to easily see which experiments have been run, or are you thinking more deeply about understanding which models are deployed, similar to the MLFlow model registry?
I too am at the evaluation stage and wondering the same thing. I'm loving pipelines and discovered the `kfserving` pipeline component, which seems like a convenient way to deploy models to different environments once they've passed some validation. Hook that up with Argo's suspend template functionality (for manual approvals if you need them) and you've got a pretty nice looking deployment pipeline. The next problem then becomes neatly tracking what is actually running in each environment and how it got there (i.e. which pipeline) via some UI+datastore.
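To make the tracking idea a bit more concrete, here is a minimal sketch of the kind of datastore I have in mind: an append-only ledger that records which model version went to which environment and which pipeline run put it there. Everything here (`Deployment`, `DeploymentLedger`, the environment names, the run IDs) is hypothetical and kept in memory for illustration; it is not an existing Kubeflow, KFServing, or MLFlow API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    """One deployment event (all field names are illustrative assumptions)."""
    environment: str       # e.g. "staging" or "prod"
    model_name: str
    model_version: str
    pipeline_run_id: str   # the pipeline run that performed the deployment
    deployed_at: datetime


class DeploymentLedger:
    """Append-only, in-memory record of deployments."""

    def __init__(self) -> None:
        self._history: List[Deployment] = []

    def record(self, environment: str, model_name: str,
               model_version: str, pipeline_run_id: str) -> Deployment:
        # A final pipeline step could call this after a successful deploy.
        entry = Deployment(environment, model_name, model_version,
                           pipeline_run_id, datetime.now(timezone.utc))
        self._history.append(entry)
        return entry

    def current(self, environment: str, model_name: str) -> Optional[Deployment]:
        # The most recent entry for this environment/model is what is running.
        for entry in reversed(self._history):
            if entry.environment == environment and entry.model_name == model_name:
                return entry
        return None

    def history(self, environment: str) -> List[Deployment]:
        # Full audit trail for one environment, oldest first.
        return [e for e in self._history if e.environment == environment]


ledger = DeploymentLedger()
ledger.record("staging", "churn-model", "v1", "run-001")
ledger.record("staging", "churn-model", "v2", "run-002")
ledger.record("prod", "churn-model", "v1", "run-003")

running = ledger.current("staging", "churn-model")
print(running.model_version, running.pipeline_run_id)
```

A real version would persist the entries somewhere durable and sit behind a UI, but the shape of the record (environment, model version, originating pipeline run) is the part that answers "what is running where, and how did it get there".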
Yes, in fact at the moment we roll out MLFlow within our Kubeflow cluster, as it seems to be the most mature solution currently out there. However, MLFlow doesn't nearly cover what we need.
When I talk about model management, I broadly mean a system that abstracts away the nitty-gritty of:
For the moment we need to rely on a lot of bespoke code to do this.
/kind feature
Describe the solution you'd like: In the circles that I move around in - i.e. developers and product owners who need to create productive Machine Learning solutions - one of the key issues that everyone is trying to solve is Model Management (which really comes down to proper logging of metadata, tracking of lineage, etc.). There are a few solutions out there that are gaining in popularity (see MLFlow from Databricks), but there is nothing (aside from the TFX MLDB, which KF Metastore seems to be based on) that offers real integration with a project of the size and ambition of Kubeflow.
I realise that `metastore` is currently still in early alpha, and don't get me wrong, I'm really looking forward to seeing where you guys take it. But given the above, I do actually find it quite surprising that `metastore` is (compared to `pipelines`, for example) currently still such a small project.
We are currently evaluating whether Kubeflow is the right fit for our organisation, and one of the pivotal points is how we will do model management on the platform. We would find it very useful to know what the longer-term vision is here. What sort of functionality are you planning to build? How will `metastore` integrate with other components, such as `pipelines`? How do you plan to get community/user input? When do you plan to make examples available for the user community to get into? Kubeflow is targeting a v1.0 release by early 2020. Where do you see `metastore` at that point? Etc.
Anything else you would like to add: Not meant as criticism. I think the whole Kubeflow community is awesome!