iterative / dvc.org

📖 DVC website and documentation
https://dvc.org
Apache License 2.0

Getting Started - Model registry and deployment #4841

Closed tibor-mach closed 8 months ago

tibor-mach commented 10 months ago

We have been discussing with @daavoo and also with @shcheklein that it might be a good idea to add the model registry and model deployment to the getting started guides in Docs as well.

On one hand, very little of that is actually about using DVC from a technical standpoint; it is mostly about using DVC together with other tools (the only DVC-specific piece in the MR and in deployment is the artifacts entry in dvc.yaml).

On the other hand, I think that Getting Started should take users through an E2E MLOps pipeline, and even if DVC is not used much in the later stages, other tools from Iterative are (GTO, potentially CML). If we want to centralize "branding" around DVC, then I think it fits well, and it also avoids giving the impression that we can only help with data management and experiment tracking.

I would try to keep the Getting Started MR and deployment guide (guides?) as simple as possible, probably focused on DVCLive and minimal usage of the CLI. These should be simple, easy-to-understand examples after all (and we could have more advanced ones under the User Guide part of the docs).

My plan is to first finish a DVCLive MR demo and then to adapt it to the getting started guide (via a generated repo) as well.

I would appreciate your comments on this, @shcheklein @daavoo @dberenbaum @dmpetrov.

Related to #4789

dberenbaum commented 10 months ago

Thanks for opening the issue @tibor-mach! I agree with your plan. It makes sense to me to do it in Get Started. Since we already have deployment in https://github.com/iterative/example-get-started-experiments/, do you think it makes sense to start with that? Or do you think that is too complex and we need a simpler example?

tibor-mach commented 10 months ago

@dberenbaum I think it makes sense to keep it all in that repo. I would use the notebook-based pipeline though and start the model registry tutorial at this tag.

That way, someone who is mostly interested in the MR capabilities can understand the repo faster and doesn't really need to understand DVC pipelines. I would just note in the tutorial that with pipelines the workflow is basically identical anyway.

I would show it with Studio first (by adding very short videos to the getting started docs, I guess), but also note that the GUI commands can be done with gto (they are all just one-liners so I don't think it complicates things too much).
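To illustrate why the gto equivalents stay one-liners: as far as I understand GTO's default convention, registering a version and assigning a stage each just create a Git tag with a specific name. The sketch below only builds those tag names and does not call GTO or Git. The artifact name, version, and stage are made-up placeholders, and the helper functions are hypothetical, not part of the gto package.

```python
# Minimal sketch of the Git tag names GTO creates by default (an assumption
# based on GTO's documented tag convention; check the GTO docs for your version).
# Nothing here talks to Git or GTO; it only formats strings.


def register_tag(artifact: str, version: str) -> str:
    """Tag name created when a version is registered (roughly `gto register`)."""
    return f"{artifact}@{version}"


def assign_tag(artifact: str, stage: str, counter: int) -> str:
    """Tag name created when a stage is assigned (roughly `gto assign`).

    `counter` is the incrementing number that keeps repeated assignments
    of the same stage unique.
    """
    return f"{artifact}#{stage}#{counter}"


if __name__ == "__main__":
    # Hypothetical artifact name, version, and stage, for illustration only.
    print(register_tag("my-model", "v1.0.0"))
    print(assign_tag("my-model", "prod", 1))
```

So each Studio action in the tutorial would correspond to a single tag being created and pushed, which is what keeps the CLI side short.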

tibor-mach commented 8 months ago

This was resolved by #4883, closing.