Open igordertigor opened 1 year ago
Ok, seems that I found out how this should work. Might still be a good idea to have in the documentation:
export GITHUB_USERNAME=igordertigor
export GITHUB_TOKEN=$(gh auth token)
mlem link --rev feature-engine@v0.0.2 --source-project="https://github.com/myorganization/myproject/" models/feature_engine feature_engine
Not sure where a good place would be to add that to the documentation as it kind of refers to all commands as long as they refer to a remote repository in some way.
Wow, I see you found it out already. We have one place in docs we explain this problem, but it definitely should be somewhere else as well. https://mlem.ai/doc/user-guide/dvc/#working-with-private-repositories
I think we simply need it as a separate section in User Guide. It can be called "Working with Git".
Besides, I think we can mention it in few places like https://mlem.ai/doc/user-guide/linking/
Finally, we can add more meaningful error, e.g.
$ mlem link --rev feature-engine@v0.0.2 --source-project="https://github.com/myorganization/myproject/" models/feature-engine feature-engine
❌ Revision 'feature-engine@v0.0.2' wasn't found in path=https://github.com/myorganization/myproject,
fs=None. If the repo is private, please check you've set GITHUB_USERNAME and GITHUB_TOKEN
This was my first thought. And second thought: we have this issue that should support authorization like DVC does it. https://github.com/iterative/mlem/issues/616
@igordertigor, I'm happy to help you with contributing this in either docs or MLEM codebase, especially if you'd decide to help MLEM support ssh
auth.
Thank you @aguschin for your quick reply. Here are a few thoughts:
mlem.contrib.github
, it seems that the whole http github interface is very much baked into the GithubResolver
. I feel that ssh auth would need to re-implement the full UriResolver. However, the nice part might be that it would not potentially work with every git repository. Does that sound like I'm correctly rephrasing the problem? I could probably spare a few hours for this but not necessarily a few days. Let me know what you think @aguschin.@igordertigor thanks!
CloudGitResolver
which will work for any GH provider. We can probably drop what we currently have implemented then.Re 2.-3. -- if you feel like contributing 5. will be too much, making PRs for these items should be fast enough and will fix it. I'll get to 5. once I have time then (not soon unfortunately).
- I guess the best bet here would be to adapt DVC's approach https://github.com/iterative/dvc/blob/main/dvc/external_repo.py . This should require implementing something on top of
CloudGitResolver
which will work for any GH provider. We can probably drop what we currently have implemented then.
As far as I can see, dvc/external_repo.py
clones the repo to a local tmp-dir. I'll take a closer look at this in the next days to decide if this seems like something I could do (given that you won't have time soon).
Re 2.-3. -- if you feel like contributing 5. will be too much, making PRs for these items should be fast enough and will fix it. I'll get to 5. once I have time then (not soon unfortunately).
I'm not super eager to contribute, but I feel that mlem being open source, allows me to fix things my self if I really need them fixed rather than waiting for others who may not need them fixed. I hope that makes sense. That doesn't mean I would under no circumstances fix error messages or documentation, just a matter of where I put my work time.
I keep having trouble with the
--rev
flag on commands likemlem link
: What exactly is accepted as a revision? I would expect things like branch names, tags (not necessarily annotated) and commit-shas. Strangely, it seems that none of those is actually working. In any case, I get (illustrating here for a tag, which is what I would prefer):I feel that omitting the
--rev
is a bad idea, because the target of the link would not be immutable. However, I tried that for testing:I usually prefer ssh when interacting with git, but I re-authenticated with the github commandline app
gh
and set my preferred protocol tohttps
to avoid potential issues with authentication. After all, this is trying to access a private repository. Either this is a bug, or there is something missing from the documentation. I'm happy to help with either of those.