iterative / dvc

🦉 Data Versioning and ML Experiments
https://dvc.org
Apache License 2.0

exp push: HangupException when attempting to push an experiment to a codecommit repo #9416

Open mattlbeck opened 1 year ago

mattlbeck commented 1 year ago

Bug Report

Description

I have a codecommit repo and the remote url follows the form https://git-codecommit.<region>.amazonaws.com/v1/repos/<reponame>

Running dvc exp push https://git-codecommit.<region>.amazonaws.com/v1/repos/<reponame> balky-chin -vv results in the error and traceback below.

I can work around this by using git to push the fully resolved experiment ref shown in the debug output:

git push origin refs/exps/21/b9f0518f32301a128f5c3927fed1c308401b08/balky-chin

After that, commands like dvc exp list and dvc exp pull work fine against the remote.
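
The workaround above can be scripted when several experiments need pushing. This is a minimal sketch, assuming the ref names are copied from the `git push experiment [...]` debug line; `build_push_command` and `push_experiments` are hypothetical helper names and are not part of DVC. It only shells out to plain `git push` with `<ref>:<ref>` refspecs, the same form DVC logs.

```python
import subprocess
from typing import List


def build_push_command(remote: str, exp_refs: List[str]) -> List[str]:
    """Build a `git push` argv that pushes each experiment ref to the
    same ref name on the remote (refspec form `<ref>:<ref>`)."""
    for ref in exp_refs:
        # Guard against accidentally pushing branches or tags.
        if not ref.startswith("refs/exps/"):
            raise ValueError(f"not an experiment ref: {ref}")
    refspecs = [f"{ref}:{ref}" for ref in exp_refs]
    return ["git", "push", remote, *refspecs]


def push_experiments(remote: str, exp_refs: List[str]) -> None:
    """Run the push with the git CLI; raises CalledProcessError on failure."""
    subprocess.run(build_push_command(remote, exp_refs), check=True)
```

For the experiment in this report, `push_experiments("origin", ["refs/exps/21/b9f0518f32301a128f5c3927fed1c308401b08/balky-chin"])` would run the same command as the manual workaround, written with an explicit refspec.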

Not sure what's changed since I helped resolve #9007, but I think it's exactly the same scenario.

2023-05-07 20:12:35,563 DEBUG: v2.56.0 (pip), CPython 3.9.16 on Linux-5.10.76-linuxkit-x86_64-with-glibc2.31
2023-05-07 20:12:35,563 DEBUG: command: /opt/conda/bin/dvc exp push https://git-codecommit.eu-west-2.amazonaws.com/v1/repos/example-devbox-project balky-chin -vv
2023-05-07 20:12:35,564 TRACE: Namespace(cprofile=False, yappi=False, yappi_separate_threads=False, viztracer=False, viztracer_depth=None, viztracer_async=False, cprofile_dump=None, pdb=False, instrument=False, instrument_open=False, show_stack=False, quiet=0, verbose=2, cd='.', cmd='push', all_commits=False, rev=None, num=1, force=False, push_cache=True, dvc_remote=None, jobs=None, run_cache=False, git_remote='https://git-codecommit.eu-west-2.amazonaws.com/v1/repos/example-devbox-project', experiment=['balky-chin'], func=<class 'dvc.commands.experiments.push.CmdExperimentsPush'>, parser=DvcParser(prog='dvc', usage=None, description='Data Version Control', formatter_class=<class 'argparse.RawTextHelpFormatter'>, conflict_handler='error', add_help=False))
2023-05-07 20:12:35,851 DEBUG: git push experiment ['refs/exps/21/b9f0518f32301a128f5c3927fed1c308401b08/balky-chin:refs/exps/21/b9f0518f32301a128f5c3927fed1c308401b08/balky-chin'] -> 'https://git-codecommit.eu-west-2.amazonaws.com/v1/repos/example-devbox-project'
2023-05-07 20:12:38,325 ERROR: unexpected error - The remote server unexpectedly closed the connection.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/dvc/cli/__init__.py", line 210, in main
    ret = cmd.do_run()
  File "/opt/conda/lib/python3.9/site-packages/dvc/cli/command.py", line 26, in do_run
    return self.run()
  File "/opt/conda/lib/python3.9/site-packages/dvc/commands/experiments/push.py", line 65, in run
    result = self.repo.experiments.push(
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/__init__.py", line 494, in push
    return push(self.repo, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/__init__.py", line 65, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/scm_context.py", line 151, in run
    return method(repo, *args, **kw)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/push.py", line 117, in push
    push_result = _push(repo, git_remote, exp_ref_set, force)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/push.py", line 159, in _push
    results: Mapping[str, SyncStatus] = repo.scm.push_refspecs(
  File "/opt/conda/lib/python3.9/site-packages/scmrepo/git/__init__.py", line 286, in _backend_func
    result = func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 580, in push_refspecs
    result = client.send_pack(
  File "/opt/conda/lib/python3.9/site-packages/dulwich/client.py", line 2054, in send_pack
    ref_status = self._handle_receive_pack_tail(
  File "/opt/conda/lib/python3.9/site-packages/dulwich/client.py", line 881, in _handle_receive_pack_tail
    for chan, data in _read_side_band64k_data(proto.read_pkt_seq()):
  File "/opt/conda/lib/python3.9/site-packages/dulwich/client.py", line 458, in _read_side_band64k_data
    for pkt in pkt_seq:
  File "/opt/conda/lib/python3.9/site-packages/dulwich/protocol.py", line 271, in read_pkt_seq
    pkt = self.read_pkt_line()
  File "/opt/conda/lib/python3.9/site-packages/dulwich/protocol.py", line 214, in read_pkt_line
    raise HangupException()
dulwich.errors.HangupException: The remote server unexpectedly closed the connection.

Environment information

DVC version: 2.56.0 (pip)
-------------------------
Platform: Python 3.9.16 on Linux-5.10.76-linuxkit-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 0.47.1
        dvc_objects = 0.21.1
        dvc_render = 0.3.1
        dvc_task = 0.2.1
        scmrepo = 1.0.1
Supports:
        http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.4.0, boto3 = 1.26.76)
Config:
        Global: /home/mambauser/.config/dvc
        System: /etc/xdg/dvc
Cache types: symlink
Cache directory: fuse.grpcfuse on grpcfuse
Caches: local
Remotes: s3
Workspace directory: fuse.grpcfuse on grpcfuse
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/d5d9f9155688caf9fa42340dcf3f255e

pmrowla commented 1 year ago

Can you try updating scmrepo to the latest release and see if the issue is still present?

pip install scmrepo==1.0.3
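
After reinstalling, it can help to confirm which versions the active environment actually resolves, since the push goes through dvc, scmrepo and dulwich (all three appear in the traceback). A minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_versions(
    packages: Sequence[str] = ("dvc", "scmrepo", "dulwich"),
) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that runs dvc (here /opt/conda/bin/python would be the likely one, which is an assumption based on the paths in the traceback).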

mattlbeck commented 1 year ago

@pmrowla That gives me a different error

Never mind - my authentication was slightly off. The error is the same 🙃

mattlbeck commented 1 year ago

The error has since changed to:

dvc exp push origin tipsy-more -v
2023-09-06 19:29:41,621 DEBUG: v3.18.0 (pip), CPython 3.9.16 on Linux-5.10.167-147.601.amzn2.x86_64-x86_64-with-glibc2.31
2023-09-06 19:29:41,621 DEBUG: command: /opt/conda/bin/dvc exp push origin tipsy-more -v
2023-09-06 19:29:42,055 DEBUG: git push experiment ['refs/exps/0c/577e867827d051c7328bcd2c90a997a9002065/tipsy-more:refs/exps/0c/577e867827d051c7328bcd2c90a997a9002065/tipsy-more'] -> 'origin'
2023-09-06 19:29:42,293 ERROR: unexpected error - [Errno -2] Name or service not known
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/dvc/cli/__init__.py", line 209, in main
    ret = cmd.do_run()
  File "/opt/conda/lib/python3.9/site-packages/dvc/cli/command.py", line 26, in do_run
    return self.run()
  File "/opt/conda/lib/python3.9/site-packages/dvc/commands/experiments/push.py", line 55, in run
    result = self.repo.experiments.push(
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/__init__.py", line 381, in push
    return push(self.repo, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/__init__.py", line 61, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/scm_context.py", line 151, in run
    return method(repo, *args, **kw)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/push.py", line 120, in push
    push_result = _push(repo, git_remote, exp_ref_set, force)
  File "/opt/conda/lib/python3.9/site-packages/dvc/repo/experiments/push.py", line 162, in _push
    results: Mapping[str, SyncStatus] = repo.scm.push_refspecs(
  File "/opt/conda/lib/python3.9/site-packages/scmrepo/git/__init__.py", line 292, in _backend_func
    result = func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 630, in push_refspecs
    result = client.send_pack(
  File "/opt/conda/lib/python3.9/site-packages/dulwich/client.py", line 1032, in send_pack
    proto, unused_can_read, stderr = self._connect(b"receive-pack", path)
  File "/opt/conda/lib/python3.9/site-packages/dulwich/client.py", line 1772, in _connect
    con = self.ssh_vendor.run_command(
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 115, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync
    raise return_result
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner
    result[0] = await coro
  File "/opt/conda/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/asyncssh_vendor.py", line 295, in _run_command
    conn = await asyncssh.connect(
  File "/opt/conda/lib/python3.9/site-packages/asyncssh/connection.py", line 8042, in connect
    return await asyncio.wait_for(
  File "/opt/conda/lib/python3.9/asyncio/tasks.py", line 442, in wait_for
    return await fut
  File "/opt/conda/lib/python3.9/site-packages/asyncssh/connection.py", line 430, in _connect
    _, session = await loop.create_connection(
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1026, in create_connection
    infos = await self._ensure_resolved(
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1405, in _ensure_resolved
    return await loop.getaddrinfo(host, port, family=family, type=type,
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 861, in getaddrinfo
    return await self.run_in_executor(
  File "/opt/conda/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/conda/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

2023-09-06 19:29:42,350 DEBUG: Version info for developers:
DVC version: 3.18.0 (pip)
-------------------------
Platform: Python 3.9.16 on Linux-5.10.167-147.601.amzn2.x86_64-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 2.15.4
        dvc_objects = 1.0.1
        dvc_render = 0.3.1
        dvc_task = 0.3.0
        scmrepo = 1.3.1
Supports:
        http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.4.0, boto3 = 1.26.76)
Config:
        Global: /home/mambauser/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: xfs on /dev/nvme0n1p1
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/7610bf67dd6fdb8d054b84b2bd7a7b14

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-09-06 19:29:42,354 DEBUG: Analytics is enabled.
2023-09-06 19:29:42,415 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpizgnjgns']'
2023-09-06 19:29:42,417 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpizgnjgns']'
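
The traceback above ends in socket.getaddrinfo, reached through the SSH vendor's asyncssh.connect call, so whatever hostname was derived from the 'origin' URL could not be resolved. The sketch below is one way to check which host a remote URL yields and whether it resolves. It is an illustration only: `remote_host` and `resolves` are hypothetical helpers, the URL parsing is deliberately simplified, and it does not reproduce how dulwich or scmrepo parse remotes.

```python
import socket
from typing import Optional
from urllib.parse import urlsplit


def remote_host(url: str) -> Optional[str]:
    """Return the hostname from a git remote URL, or None for local paths."""
    if "://" in url:
        # Covers https://host/path and ssh://[user@]host/path.
        return urlsplit(url).hostname
    # scp-like syntax: [user@]host:path
    head, sep, _ = url.partition(":")
    if sep and "/" not in head:
        return head.rsplit("@", 1)[-1]
    return None


def resolves(host: str, port: int = 22) -> bool:
    """True if the system resolver can look up the host."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False
```

Feeding it the output of `git remote get-url origin` shows the host to test. If `resolves` returns False for it, the failure is in name resolution rather than in the push itself.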

liamquantrill commented 7 months ago

Hi, was this ever solved? I am getting the same HangupException as in the original issue raised in this thread, and I am also using AWS CodeCommit as my remote. Please see my traceback below:

2024-04-11 14:37:43,982 DEBUG: v3.49.0 (pip), CPython 3.10.12 on Linux-6.5.0-1016-aws-x86_64-with-glibc2.35
2024-04-11 14:37:43,982 DEBUG: command: /home/ubuntu/Documents/dvc_test_env/bin/dvc exp push origin -A -v
2024-04-11 14:37:44,269 DEBUG: git push experiment ['refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/ashen-leas:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/ashen-leas', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/treed-dabs:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/treed-dabs', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/blear-doxy:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/blear-doxy', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tabby-soft:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tabby-soft', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/slimy-repp:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/slimy-repp', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/whity-wage:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/whity-wage', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/lathy-work:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/lathy-work', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/cedar-aura:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/cedar-aura', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tough-jato:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tough-jato', 'refs/exps/98/558612ded7fb57ebf0bc1837b3adbe76c32fc9/boozy-aqua:refs/exps/98/558612ded7fb57ebf0bc1837b3adbe76c32fc9/boozy-aqua', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/spiny-sash:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/spiny-sash', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tacit-walk:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/tacit-walk', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/regal-food:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/regal-food', 'refs/exps/98/558612ded7fb57ebf0bc1837b3adbe76c32fc9/boozy-aqua:refs/exps/98/558612ded7fb57ebf0bc1837b3adbe76c32fc9/boozy-aqua', 'refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/amber-sinh:refs/exps/f3/14cbc83b34202b889177262d0f248f58a0f6a0/amber-sinh'] -> 'origin'
2024-04-11 14:37:47,081 ERROR: unexpected error - The remote server unexpectedly closed the connection.
Traceback (most recent call last):
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/cli/__init__.py", line 211, in main
    ret = cmd.do_run()
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/cli/command.py", line 27, in do_run
    return self.run()
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/commands/experiments/push.py", line 54, in run
    result = self.repo.experiments.push(
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/repo/experiments/__init__.py", line 364, in push
    return push(self.repo, *args, **kwargs)
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/repo/__init__.py", line 58, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/repo/scm_context.py", line 143, in run
    return method(repo, *args, **kw)
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/repo/experiments/push.py", line 111, in push
    push_result = _push(repo, git_remote, exp_ref_set, force)
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dvc/repo/experiments/push.py", line 153, in _push
    results: Mapping[str, SyncStatus] = repo.scm.push_refspecs(
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/scmrepo/git/__init__.py", line 286, in _backend_func
    result = func(*args, **kwargs)
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 580, in push_refspecs
    result = client.send_pack(
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dulwich/client.py", line 2125, in send_pack
    ref_status = self._handle_receive_pack_tail(
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dulwich/client.py", line 934, in _handle_receive_pack_tail
    for chan, data in _read_side_band64k_data(proto.read_pkt_seq()):
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dulwich/client.py", line 501, in _read_side_band64k_data
    for pkt in pkt_seq:
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dulwich/protocol.py", line 272, in read_pkt_seq
    pkt = self.read_pkt_line()
  File "/home/ubuntu/Documents/dvc_test_env/lib/python3.10/site-packages/dulwich/protocol.py", line 215, in read_pkt_line
    raise HangupException
dulwich.errors.HangupException: The remote server unexpectedly closed the connection.

2024-04-11 14:37:47,114 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)
2024-04-11 14:37:47,114 DEBUG: Removing '/home/ubuntu/Documents/.1z_O88b4VkNCTog5XFt7rA.tmp'
2024-04-11 14:37:47,114 DEBUG: Removing '/home/ubuntu/Documents/.1z_O88b4VkNCTog5XFt7rA.tmp'
2024-04-11 14:37:47,114 DEBUG: Removing '/home/ubuntu/Documents/.1z_O88b4VkNCTog5XFt7rA.tmp'
2024-04-11 14:37:47,114 DEBUG: Removing '/home/ubuntu/Documents/raids-u109-castle-repo/.dvc/cache/files/md5/.DRAAHvmXPA3njbJbHbpYYQ.tmp'
2024-04-11 14:37:47,125 DEBUG: Version info for developers:
DVC version: 3.49.0 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-6.5.0-1016-aws-x86_64-with-glibc2.35
Subprojects:
    dvc_data = 3.15.1
    dvc_objects = 5.1.0
    dvc_render = 1.0.1
    dvc_task = 0.4.0
    scmrepo = 1.0.3
Supports:
    http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
    s3 (s3fs = 2024.3.1, boto3 = 1.34.51)
Config:
    Global: /home/ubuntu/.config/dvc
    System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p1
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme0n1p1
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/c615bcd70e1f35acafcebbca7063e140