iterative / gto

🏷️ Git Tag Ops. Turn your Git repository into Artifact Registry or Model Registry.
https://dvc.org/doc/gto
Apache License 2.0
142 stars 16 forks source link

Using GTO Git tags in `requirements.txt` #339

Closed aguschin closed 1 year ago

aguschin commented 1 year ago

e.g.:

# reqs.txt
git+https://github.com/iterative/example-gto.git@churn@v3.1.1

then

$ pip install -r reqs.txt
Collecting git+https://github.com/iterative/example-gto.git@churn@v3.1.1 (from -r reqs.txt (line 1))
  Cloning https://github.com/iterative/example-gto.git@churn (to revision v3.1.1) to /private/var/folders/tv/l60j0x050p536g3bh8g2w1n80000gn/T/pip-req-build-p4ot4t5n
  Running command git clone --filter=blob:none --quiet https://github.com/iterative/example-gto.git@churn /private/var/folders/tv/l60j0x050p536g3bh8g2w1n80000gn/T/pip-req-build-p4ot4t5n
  fatal: unable to access 'https://github.com/iterative/example-gto.git@churn/': The requested URL returned error: 400
  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet https://github.com/iterative/example-gto.git@churn /private/var/folders/tv/l60j0x050p536g3bh8g2w1n80000gn/T/pip-req-build-p4ot4t5n did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/iterative/example-gto.git@churn /private/var/folders/tv/l60j0x050p536g3bh8g2w1n80000gn/T/pip-req-build-p4ot4t5n did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Wasn't able to find workarounds with quoting, adding backward slashes, adding refs/tags/..., using %40 instead of @, etc yet.

aguschin commented 1 year ago

@skshetry @efiop can you hint me towards a possible solution for this? Thanks!

efiop commented 1 year ago

Quick google gives me this https://github.com/pypa/pip/issues/10226 Seems like a significant gto problem, no immediate ideas how to solve it 🙁 Looks like someone will just have to go ahead and contribute a patch to pip.

dberenbaum commented 1 year ago

@aguschin Do we need to prioritize this and working with setuptools? If we need to change the format of the git tags, better to address that now.

aguschin commented 1 year ago

Agreed - I think we need to resolve this (or have a plan at least) before artifacts: release. Two related things, posting for visibility:

  1. Semicolon is not supported in Git tag name dvclive:models/path@v0.0.1
  2. Current name limitations in GTO
aguschin commented 1 year ago

@dberenbaum, I've discussed this with @omesser a while ago and the conclusion we reached (@omesser pls fix me if I'm wrong) is to allow to change @ and # to something else. This way this scenario will be enabled. I can implement this after releasing artifacts.

dberenbaum commented 1 year ago

Thanks for remembering to get back to this @aguschin! That sounds like a good idea, but should we also change the default since we already know @ is problematic? Does # have the same problems?

aguschin commented 1 year ago

Yes, # have the same problem. I guess, most of the symbols will have the same problems, if not here, then somewhere else. Thus keeping things consistent and not changing defaults looks appealing to me. But changing the defaults can be an option as well.

Also, TBH, I'm not quite sure yet how this will be supported. E.g. how we work with already existing Git tags in Studio? Do we process all alternatives that mark model versions (@ and e.g. _)? If a single commit marked by both mymodel@v0.0.1 and mymodel_v0.0.1, how do we deal with this? Etc. These questions need some thinking through which I unfortunately don't have capacity for right now :(

dberenbaum commented 1 year ago

Yes, # have the same problem. I guess, most of the symbols will have the same problems, if not here, then somewhere else.

= seems to work with pip at least and seems natural for version numbers. You are right that it could cause problems elsewhere, but why not at least use a default that works for common scenarios that we know of?

dberenbaum commented 1 year ago

I think we can close this since AFAIK we are using = now.