iterative / gto

🏷️ Git Tag Ops. Turn your Git repository into Artifact Registry or Model Registry.
https://dvc.org/doc/gto
Apache License 2.0

bug: gto does not work with ssh-based repository urls #431

Status: Closed (zactivate closed this issue 10 months ago)

zactivate commented 10 months ago

Using a repository's HTTPS URL works as expected:

gto show --repo https://github.com/iterative/example-gto.git
╒══════════╤══════════╤════════╤═════════╤════════════╕
│ name     │ latest   │ #dev   │ #prod   │ #staging   │
╞══════════╪══════════╪════════╪═════════╪════════════╡
│ churn    │ v3.1.0   │ v3.1.0 │ v3.0.0  │ v3.1.0     │
│ cv-class │ v0.1.13  │ -      │ -       │ -          │
│ segment  │ v0.4.1   │ v0.4.1 │ -       │ -          │
╘══════════╧══════════╧════════╧═════════╧════════════╛

However, using the SSH URL fails after three passphrase prompts:

gto show --repo git@github.com:iterative/example-gto.git
Unexpected error: Failed to clone repo 'git@github.com:iterative/example-gto.git to ...'

Note that this is with GTO version 1.5.0.
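
Since the report above shows the HTTPS URL working while the SSH URL fails, one possible stopgap for a public repository is to rewrite the SSH URL to its HTTPS form before passing it to `--repo`. The following is a minimal sketch under that assumption; `ssh_to_https` is a hypothetical helper and not part of GTO, DVC, or scmrepo, and it does not address private repositories, which would still need HTTPS credentials.

```python
import re

# scp-like syntax Git accepts for SSH remotes: [user@]host:path
_SCP_LIKE = re.compile(r"^(?:[^@/:]+@)?(?P<host>[^:/]+):(?P<path>.+)$")


def ssh_to_https(url: str) -> str:
    """Rewrite an SSH Git URL as HTTPS; return any other URL unchanged.

    Hypothetical helper for illustration only.
    """
    if url.startswith("ssh://"):
        # Form: ssh://[user@]host[:port]/path
        host_part, _, path = url[len("ssh://"):].partition("/")
        host = host_part.rsplit("@", 1)[-1].split(":", 1)[0]
        return f"https://{host}/{path}"
    if "://" in url:
        # Already has an explicit scheme (https://, file://, ...).
        return url
    match = _SCP_LIKE.match(url)
    if match:
        return f"https://{match['host']}/{match['path'].lstrip('/')}"
    return url


print(ssh_to_https("git@github.com:iterative/example-gto.git"))
```

The result could then be passed as `gto show --repo <https-url>`. This only sidesteps the SSH authentication path; it does not fix it.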

shcheklein commented 10 months ago

@zactivate could you please try running dvc get -v ... for the same repo, with any file? Both products use the same scmrepo layer for Git operations, and DVC makes it easier to get more information about the failure. Thanks.

zactivate commented 10 months ago

@shcheklein -- Here you go:

dvc get -v git@github.com:iterative/example-gto.git .gitignore
2023-12-04 14:39:52,504 DEBUG: v3.14.0 (pip), CPython 3.11.4 on macOS-13.5.2-arm64-arm-64bit
2023-12-04 14:39:52,504 DEBUG: command: /Users/zduey/Documents/act-data-research/shared/.venv/bin/dvc get -v git@github.com:iterative/example-gto.git .gitignore
2023-12-04 14:39:53,178 DEBUG: Creating external repo git@github.com:iterative/example-gto.git@None
2023-12-04 14:39:53,178 DEBUG: erepo: git clone 'git@github.com:iterative/example-gto.git' to a temporary dir
Cloning example-gto.git| |0.00/? [00:00, ?obj/s]
Enter passphrase for key '/Users/zduey/.ssh/id_ed25519': 
Enter passphrase for key '/Users/zduey/.ssh/id_ed25519': 
Enter passphrase for key '/Users/zduey/.ssh/id_ed25519': 
2023-12-04 14:39:58,897 ERROR: failed to get '.gitignore' - SCM error: Failed to clone repo 'git@github.com:iterative/example-gto.git' to '/var/folders/4_/ytlgxgx17yv30h1twf443drc0000gp/T/tmp6_jxgbpsdvc-clone': Authentication failed for: 'git@github.com:22': Permission denied
Traceback (most recent call last):
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/scmrepo/git/backend/dulwich/asyncssh_vendor.py", line 295, in _run_command
    conn = await asyncssh.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/asyncssh/connection.py", line 8093, in connect
    return await asyncio.wait_for(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/tasks.py", line 442, in wait_for
    return await fut
           ^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/asyncssh/connection.py", line 440, in _connect
    await options.waiter
asyncssh.misc.PermissionDenied: Permission denied

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 227, in clone
    repo = clone_from()
           ^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dulwich/porcelain.py", line 542, in clone
    return client.clone(
           ^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dulwich/client.py", line 738, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dulwich/client.py", line 816, in fetch
    result = self.fetch_pack(
             ^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dulwich/client.py", line 1127, in fetch_pack
    proto, can_read, stderr = self._connect(b"upload-pack", path)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dulwich/client.py", line 1772, in _connect
    con = self.ssh_vendor.run_command(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/fsspec/asyn.py", line 121, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/fsspec/asyn.py", line 106, in sync
    raise return_result
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/fsspec/asyn.py", line 61, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/scmrepo/git/backend/dulwich/asyncssh_vendor.py", line 308, in _run_command
    raise AuthError(f"{username}@{host}:{port or 22}") from exc
scmrepo.exceptions.AuthError: Authentication failed for: 'git@github.com:22'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/scm.py", line 160, in clone
    git = Git.clone(url, to_path, progress=pbar.update_git, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/scmrepo/git/__init__.py", line 148, in clone
    backend.clone(url, to_path, bare=bare, mirror=mirror, **kwargs)
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 235, in clone
    raise CloneError(url, to_path) from exc
scmrepo.exceptions.CloneError: Failed to clone repo 'git@github.com:iterative/example-gto.git' to '/var/folders/4_/ytlgxgx17yv30h1twf443drc0000gp/T/tmp6_jxgbpsdvc-clone'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/commands/get.py", line 33, in _get_file_from_repo
    Repo.get(
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/get.py", line 45, in get
    with Repo.open(
         ^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/__init__.py", line 308, in open
    return open_repo(url, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/open_repo.py", line 64, in open_repo
    return _external_repo(url, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/open_repo.py", line 27, in _external_repo
    path = _cached_clone(url, rev)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/open_repo.py", line 138, in _cached_clone
    clone_path, shallow = _clone_default_branch(url, rev)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/funcy/decorators.py", line 47, in wrapper
    return deco(call, *dargs, **dkwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/funcy/flow.py", line 246, in wrap_with
    return call()
           ^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/funcy/decorators.py", line 68, in __call__
    return self._func(*self._args, **self._kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/repo/open_repo.py", line 202, in _clone_default_branch
    git = clone(url, clone_path)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zduey/Documents/act-data-research/shared/.venv/lib/python3.11/site-packages/dvc/scm.py", line 165, in clone
    raise CloneError("SCM error") from exc
dvc.scm.CloneError: SCM error

2023-12-04 14:39:58,956 DEBUG: Analytics is enabled.
2023-12-04 14:39:59,020 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/4_/ytlgxgx17yv30h1twf443drc0000gp/T/tmphnicpplv']'
2023-12-04 14:39:59,021 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/4_/ytlgxgx17yv30h1twf443drc0000gp/T/tmphnicpplv']'

I validated that I can clone the same repo using git directly.

git clone git@github.com:iterative/example-gto.git
Cloning into 'example-gto'...
Enter passphrase for key '/Users/zduey/.ssh/id_ed25519': 
remote: Enumerating objects: 60, done.
remote: Counting objects: 100% (41/41), done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 60 (delta 17), reused 33 (delta 14), pack-reused 19
Receiving objects: 100% (60/60), 8.59 KiB | 2.15 MiB/s, done.
Resolving deltas: 100% (17/17), done.

shcheklein commented 10 months ago

@zactivate good. At least it's the same issue, so it's not GTO-specific. Do you use an SSH agent / keychain, and is your SSH key passphrase-encrypted?

Could you please try this? https://github.com/iterative/dvc/issues/7702#issuecomment-1304895509

zactivate commented 10 months ago

@shcheklein -- My SSH keys are encrypted. The solution you linked to worked for me. Thanks for the link! Apologies that I didn't find it myself.
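
For readers debugging the same error, the question of whether a key is passphrase-encrypted can be checked without attempting a connection. The sketch below assumes the key is stored in the OpenSSH private key format (the `-----BEGIN OPENSSH PRIVATE KEY-----` container), which records the cipher name near the start of the decoded payload, with `none` meaning unencrypted. The function names are hypothetical, and older PEM-style keys are not handled.

```python
import base64
import struct

_MAGIC = b"openssh-key-v1\x00"
_BEGIN = "-----BEGIN OPENSSH PRIVATE KEY-----"
_END = "-----END OPENSSH PRIVATE KEY-----"


def openssh_key_cipher(key_text: str) -> str:
    """Return the cipher name stored in an OpenSSH-format private key."""
    lines = [line.strip() for line in key_text.strip().splitlines()]
    if len(lines) < 3 or lines[0] != _BEGIN or lines[-1] != _END:
        raise ValueError("not an OpenSSH-format private key")
    blob = base64.b64decode("".join(lines[1:-1]))
    if not blob.startswith(_MAGIC):
        raise ValueError("missing openssh-key-v1 magic")
    # After the magic comes a length-prefixed string: the cipher name.
    offset = len(_MAGIC)
    (length,) = struct.unpack_from(">I", blob, offset)
    start = offset + 4
    return blob[start:start + length].decode("ascii")


def is_passphrase_protected(key_text: str) -> bool:
    """True when the key records any cipher other than 'none'."""
    return openssh_key_cipher(key_text) != "none"
```

To use it, read the text of the key file named in the passphrase prompt (for example `~/.ssh/id_ed25519`) and pass it to `is_passphrase_protected`. Only the unencrypted header is inspected; the private key material itself is never decoded.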