iterative / example-repos-dev

Source code and generator scripts for example DVC projects
https://dvc.org/doc

`example-gto`: remove `annotate` and move artifacts to `dvc.yaml` #192

Closed aguschin closed 1 year ago

aguschin commented 1 year ago

cc @jellebouwman and @amritghimire re https://github.com/iterative/studio/issues/5504 - I'm not sure demo-bank-customer-churn is a monorepo, so if you need one, we can do this ^

amritghimire commented 1 year ago

the repo can be re-generated already, although CI will fail

For Studio, this won't be an issue since we use the frozen repo for tests. We can take care of the test cases when upgrading GTO. I agree we can add a monorepo for the test cases. But I think even with upgrade, the repo with the old artifacts.yaml and the information in dvc.yaml should both work. Or at the very least, when doing some action like registration and so on, it should migrate the data accordingly. WDYT?

jellebouwman commented 1 year ago

> I'm not sure demo-bank-customer-churn is a monorepo, so if you need one, we can do this ^

I don't think it is: it adds different types of models on different branches, but everything is added to the root of the repository!

aguschin commented 1 year ago

> But I think even with upgrade, the repo with the old artifacts.yaml and the information in dvc.yaml should both work.

To smooth the migration, we can implement an if-else that supports both the old-format GTO repo and the new format. Something like:

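import dvc.repo
import gto.api

artifacts = dvc.repo.Repo(".").artifacts  # new format: artifacts declared in dvc.yaml
if not artifacts:
    artifacts = gto.api.show(".")  # fall back to the old format: artifacts.yaml via GTO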

> Or at the very least, when doing some action like registration and so on, it should migrate the data accordingly. WDYT?

The GTO API will not change for registrations/assignments, but annotation will be removed, so we'll need to use the internal DVC API for that, or update dvc.yaml manually.
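To make the "update dvc.yaml manually" option a bit more concrete, here is a rough sketch of how old annotations could be mapped to an `artifacts:` section. Everything here is an assumption for illustration: the helper name `to_dvc_yaml_artifacts` is hypothetical, the field names on both sides (`description` to `desc`, `custom` to `meta`) are my guess at the two schemas and should be checked against the actual releases, and the example path is made up. It works on plain dicts, so reading and writing the YAML files is left out.

```python
# Hypothetical helper: the field names on both sides are assumptions,
# not a confirmed schema for artifacts.yaml or dvc.yaml.
def to_dvc_yaml_artifacts(old_annotations: dict) -> dict:
    """Map artifacts.yaml-style entries to a dict with a top-level `artifacts` key."""
    rename = {"description": "desc", "custom": "meta"}  # assumed renames
    keep = {"path", "type", "labels"}  # assumed to carry over unchanged
    converted = {}
    for name, fields in old_annotations.items():
        entry = {}
        for key, value in fields.items():
            if key in rename:
                entry[rename[key]] = value
            elif key in keep:
                entry[key] = value
            # Any other key is dropped in this sketch.
        converted[name] = entry
    return {"artifacts": converted}


# Made-up example entry, shaped like an old annotation.
old = {
    "churn": {
        "type": "model",
        "path": "models/churn.pkl",
        "description": "Churn model",
    }
}
new_section = to_dvc_yaml_artifacts(old)
```

The resulting dict would still need to be merged into an existing dvc.yaml rather than overwriting it, since that file also holds the pipeline stages.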

So, what should be done in Studio (I'm surely missing something though):

  1. Updating the DVC version used (we have yet to release it in DVC, though)
  2. Supporting old (artifacts.yaml, GTO API) / new (dvc.yaml, DVC API) formats for reading (the if-else fallback ^)
  3. Supporting new format for annotation, removing old format for annotation
  4. Adding new test repo (re-generated example-gto)
  5. Showing a warning for old format users, telling them they should migrate
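For points 2 and 5, a minimal sketch of how the format could be detected and a warning raised, assuming only the repository root is inspected (so not monorepo-aware). The function names and the warning text are hypothetical, and the `artifacts:` check is a simple line-scan heuristic rather than real YAML parsing:

```python
import warnings
from pathlib import Path


def detect_artifact_format(repo_root: str) -> str:
    """Return "new", "old", or "none" for a checkout (root directory only)."""
    root = Path(repo_root)
    dvc_yaml = root / "dvc.yaml"
    if dvc_yaml.is_file():
        # Heuristic (assumption): a top-level `artifacts:` key starts at column 0.
        for line in dvc_yaml.read_text().splitlines():
            if line.startswith("artifacts:"):
                return "new"
    if (root / "artifacts.yaml").is_file():
        return "old"
    return "none"


def warn_if_old_format(repo_root: str) -> str:
    """Emit a warning for old-format repos and return the detected format."""
    fmt = detect_artifact_format(repo_root)
    if fmt == "old":
        # Hypothetical wording; the real message would be decided in Studio.
        warnings.warn(
            "This repo uses the old artifacts.yaml format; "
            "consider moving artifact annotations to dvc.yaml.",
            UserWarning,
            stacklevel=2,
        )
    return fmt
```

Checking dvc.yaml first means a repo that has both files is treated as new-format, which matches the order of the if-else above.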

@amritghimire - I'm happy to start drafting this in BE. Let's have a call so you can point me towards the right modules/scripts. Thanks!

aguschin commented 1 year ago

@omesser can you please approve and merge? Thanks!