iterative / gto

🏷️ Git Tag Ops. Turn your Git repository into Artifact Registry or Model Registry.
https://dvc.org/doc/gto
Apache License 2.0

Get artifact version registered at commit #312

Open aguschin opened 1 year ago

aguschin commented 1 year ago

From discord:

is there a way for gto to output the current tag for HEAD?

well, I know it's redundant because gto keeps the models registered in git tags, but let's say I am saving predictions of a model to a database. I need those predictions to be associated with the model that produced them - so ideally I'd save the predictions together with the model tag. so whenever I'd do gto register ideally I'd be able to save the resulting simplified tag @v1.2.3 to a text file that I can later use to associate predictions with a model

Now it can be achieved with something like:

$ export NAME="churn"
$ git --no-pager tag --points-at HEAD | xargs -I {} sh -c '
  # $1 is one tag pointing at HEAD; print its version if it registers $NAME
  if [ "$(gto check-ref "$1" --name)" = "$NAME" ] && [ "$(gto check-ref "$1" --event)" = "registration" ]; then
    gto check-ref "$1" --version
  fi
' _ {}
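The same filter can be sketched in Python by parsing the tag names directly instead of calling `gto check-ref` for every tag. This is only an illustration: it assumes registration tags have the `name@vX.Y.Z` shape that appears in this thread (for example `churn@v3.1.0`) and skips any tag that does not match it. `registered_versions` and `tags_at` are hypothetical helper names, not part of GTO.

```python
import re
import subprocess

# Assumption: a registration tag looks like "<artifact>@v<major>.<minor>.<patch>".
REGISTRATION_TAG = re.compile(r"^(?P<name>[^@#]+)@(?P<version>v\d+\.\d+\.\d+)$")


def registered_versions(tags, name):
    """Return the versions of `name` registered by the given tag names."""
    versions = []
    for tag in tags:
        match = REGISTRATION_TAG.match(tag)
        if match and match.group("name") == name:
            versions.append(match.group("version"))
    return versions


def tags_at(rev="HEAD"):
    """List the Git tags that point at `rev` (runs `git tag --points-at`)."""
    result = subprocess.run(
        ["git", "--no-pager", "tag", "--points-at", rev],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```

Inside a repository, `registered_versions(tags_at("HEAD"), "churn")` would then list the versions of `churn` registered at the current commit, or an empty list if there are none.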

We can add a command to GTO, but I need to better understand how it should work first.

Specifically for this case:

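If we try to generalize a bit:

Input: revision, [artifact name] optionally, [version or stage] optionally
Output: artifact version or stage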

Then it can look like

$ gto find HEAD nn --version
v0.0.1
$ gto find HEAD nn --stage
dev
$ gto find HEAD nn
v0.0.1 in dev
$ gto find HEAD
nn v0.0.1 in dev
mymodel v1.2.3 in stage and prod

WDYT? Maybe there are some simpler options or this can be included in existing commands (like gto show or gto check-ref?)

bgalvao commented 1 year ago

If we try to generalize a bit:

Input: revision, [artifact name] optionally, [version or stage] optionally
Output: artifact version or stage

I like that the artifact name is optional, because all I need is the revision: I can navigate the Git repo to that revision and find the artifact (e.g. a model) version in the dvc.lock file.

$ gto find HEAD nn --version
v0.0.1
$ gto find HEAD nn --stage
dev
$ gto find HEAD nn
v0.0.1 in dev
$ gto find HEAD
nn v0.0.1 in dev
mymodel v1.2.3 in stage and prod

Having the v0.0.1 in dev is nice, but in my view, I'd aim towards getting the tag itself, with the option to exclude certain parts such as --stage. E.g.:

$ gto find HEAD nn --tag --version --exclude-stage
mymodel@v0.0.1

That would give me a clean model name to save to the database. It also makes it easy to find the model code from the tag, regardless of its stage: from what I understand, stages don't map to different versions of the model, code, and data; they only tell you where a version has been deployed.

@aguschin that was a late reply, my bad! Right now I am using a mix of coolname and git rev-parse HEAD to register models to the database, but I'd rather have it coordinated with gto and its tag naming scheme!
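The use case described in this comment, storing each prediction together with the ref of the model that produced it, could be sketched as follows. The table layout, the column names, and the `save_prediction` and `predictions_for` helpers are assumptions made for illustration; only the `name@version` ref format comes from this thread.

```python
import sqlite3


def save_prediction(conn, model_ref, input_id, value):
    """Store one prediction next to the ref of the model that produced it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(model_ref TEXT NOT NULL, input_id TEXT NOT NULL, value REAL NOT NULL)"
    )
    conn.execute(
        "INSERT INTO predictions (model_ref, input_id, value) VALUES (?, ?, ?)",
        (model_ref, input_id, value),
    )
    conn.commit()


def predictions_for(conn, model_ref):
    """Return the (input_id, value) rows produced by the given model ref."""
    rows = conn.execute(
        "SELECT input_id, value FROM predictions WHERE model_ref = ? ORDER BY rowid",
        (model_ref,),
    )
    return rows.fetchall()
```

Keying the rows by a stage-free ref such as `mymodel@v0.0.1` keeps the association stable even when the version is later assigned to a different stage.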

aguschin commented 1 year ago

Ok, I think I came up with a decent way to include this in gto show instead of adding a new command. Please see the example below (after your case, it also extends show with similar logic in a few other ways, which is good for generalization 😄).

Please let me know what you think.

$ git clone https://github.com/iterative/example-gto
$ cd example-gto
$ gto show
╒══════════╤══════════╤════════╤═════════╤════════════╕
│ name     │ latest   │ #dev   │ #prod   │ #staging   │
╞══════════╪══════════╪════════╪═════════╪════════════╡
│ churn    │ v3.1.1   │ v3.1.1 │ v3.0.0  │ v3.1.0     │
│ cv-class │ v0.1.13  │ -      │ -       │ -          │
│ segment  │ v0.4.1   │ v0.4.1 │ -       │ -          │
╘══════════╧══════════╧════════╧═════════╧════════════╛
$ gto show churn:HEAD  # show versions for churn in HEAD
╒════════════╤═══════════╤═════════╤═════════════════════╤══════════════╕
│ artifact   │ version   │ stage   │ created_at          │ ref          │
╞════════════╪═══════════╪═════════╪═════════════════════╪══════════════╡
│ churn      │ v3.1.0    │ staging │ 2022-11-20 09:36:38 │ churn@v3.1.0 │
╘════════════╧═══════════╧═════════╧═════════════════════╧══════════════╛
$ gto show churn:HEAD --ref
churn@v3.1.0
$ gto show :HEAD  # show all artifacts that have versions/assignments in HEAD
╒════════════╤═══════════╤═════════╤═════════════════════╤══════════════╕
│ artifact   │ version   │ stage   │ created_at          │ ref          │
╞════════════╪═══════════╪═════════╪═════════════════════╪══════════════╡
│ churn      │ v3.1.0    │ staging │ 2022-11-20 09:36:38 │ churn@v3.1.0 │
╘════════════╧═══════════╧═════════╧═════════════════════╧══════════════╛
$ gto show @v3.1.0  # show all v3.1.0 versions for all artifacts
╒════════════╤═══════════╤═════════╤═════════════════════╤══════════════╕
│ artifact   │ version   │ stage   │ created_at          │ ref          │
╞════════════╪═══════════╪═════════╪═════════════════════╪══════════════╡
│ churn      │ v3.1.0    │ staging │ 2022-11-20 09:36:38 │ churn@v3.1.0 │
╘════════════╧═══════════╧═════════╧═════════════════════╧══════════════╛
$ gto show "#staging"  # show all versions in staging for all artifacts
╒════════════╤═══════════╤═════════╤═════════════════════╤══════════════╕
│ artifact   │ version   │ stage   │ created_at          │ ref          │
╞════════════╪═══════════╪═════════╪═════════════════════╪══════════════╡
│ churn      │ v3.1.0    │ staging │ 2022-11-20 09:36:38 │ churn@v3.1.0 │
╘════════════╧═══════════╧═════════╧═════════════════════╧══════════════╛
$ gto show --long  # alternative way to show the registry
╒════════════╤═══════════╤═════════╤═════════════════════╤══════════════════╕
│ artifact   │ version   │ stage   │ created_at          │ ref              │
╞════════════╪═══════════╪═════════╪═════════════════════╪══════════════════╡
│ churn      │ v3.1.1    │ dev     │ 2022-11-27 08:16:38 │ churn@v3.1.1     │
│ churn      │ v3.1.0    │ staging │ 2022-11-20 09:36:38 │ churn@v3.1.0     │
│ churn      │ v3.0.0    │ prod    │ 2022-11-15 18:29:58 │ churn@v3.0.0     │
│ cv-class   │ v0.1.13   │         │ 2022-11-18 02:03:18 │ cv-class@v0.1.13 │
│ cv-class   │ 793ff78   │         │ 2022-11-19 05:49:58 │ 793ff78          │
│ segment    │ v0.4.1    │ dev     │ 2022-11-16 22:16:38 │ segment@v0.4.1   │
│ segment    │ 793ff78   │         │ 2022-11-19 05:49:58 │ 793ff78          │
╘════════════╧═══════════╧═════════╧═════════════════════╧══════════════════╛

For the record, this design doesn't allow getting the Git tags that assigned stages in HEAD (all of them or for a specific artifact), but that's not required currently, so I assume it can be ignored.
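To make the proposed argument forms concrete, here is a rough sketch of how a selector such as `nn:HEAD`, `:HEAD`, `@v3.1.0`, or `#staging` could be split into its parts. It is not how GTO parses arguments; `Selector` and `parse_selector` are hypothetical names, and the rule that `:` is checked before `@` and `#` is an assumption made so that revisions containing those characters still work.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    name: Optional[str] = None
    rev: Optional[str] = None
    version: Optional[str] = None
    stage: Optional[str] = None


def parse_selector(text: str) -> Selector:
    """Split 'name:rev', 'name@version' or 'name#stage'; the name may be empty."""
    for sep, field in ((":", "rev"), ("@", "version"), ("#", "stage")):
        name, found, value = text.partition(sep)
        if found:
            if not value:
                raise ValueError(f"nothing after {sep!r} in {text!r}")
            return Selector(name=name or None, **{field: value})
    # No separator at all: the whole argument is an artifact name.
    return Selector(name=text or None)
```

With this shape, an empty name simply means "all artifacts", which matches the `:HEAD`, `@v3.1.0`, and `#staging` examples in the transcript.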

bgalvao commented 1 year ago

For the record, this design doesn't allow getting the Git tags that assigned stages in HEAD (all of them or for a specific artifact), but that's not required currently, so I assume it can be ignored.

Well, it is good enough for my use case: I only need a gto-controlled version tag so that I can match the predictions I am saving in the database with the tags in the Git repository.

So, on my end, I'd say this is a good addition for me already 😄

bgalvao commented 1 year ago

@aguschin hey, I would like to request to have this accessible from Python, if possible :D
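One way to reach this from Python without depending on any particular GTO Python API is to call the CLI. A minimal sketch, assuming `gto` is on the PATH and that `gto check-ref` accepts the `--name`, `--event`, and `--version` flags used in the shell workaround earlier in this thread. `check_ref_field` is a hypothetical helper, and its `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess


def check_ref_field(ref, field, run=subprocess.run):
    """Return one field of a GTO tag by running `gto check-ref <ref> --<field>`."""
    if field not in {"name", "event", "version"}:
        raise ValueError(f"unsupported field: {field}")
    result = run(
        ["gto", "check-ref", ref, f"--{field}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For a tag found at HEAD, `check_ref_field(tag, "version")` would give the value to store alongside predictions.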