iterative / dvc-ssh

SSH/SFTP plugin for dvc
Apache License 2.0

fetch: does not work with ssh even though scp/rsync/sftp do. #43

Open Jalmenara opened 1 year ago

Jalmenara commented 1 year ago

Bug Report

Description

The dvc fetch and pull commands do not work with an ssh remote, even though the same path works with scp, rsync and sftp + get.

Reproduce

I do not see a clear way to make this problem reproducible. In fact, I tried to post this as a Discord question, but my description was too long for a message. Hopefully it can be understood properly. The situation is as follows:

My team and I are facing a problem when using dvc ssh/sftp tools. We are setting up a dvc remote shared by two companies to work on a project: clientA & contractorB (I belong to the latter). The remote is hosted by the clientA at a location accessible by us through ssh. There is an intermediate proxy server, but we have sorted that out using the ProxyJump option in ~/.ssh/config, like this:

Host proxy
    HostName proxy.clientA.com
    User pepito

Host destination
    HostName dest.clientA.com
    User pepito
    ProxyJump proxy

The dvc remote is stored at destination. We have also configured the ssh keys on the servers of clientA, so that no password prompts are needed.
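As a reading aid, the alias lookup that an SSH client performs on this file can be sketched with the standard library only. The helper `parse_ssh_config` below is hypothetical and deliberately simplified (no wildcards, `Match`, `Include` or `Key=Value` syntax); it is not how OpenSSH, asyncssh or dvc parse the file. It only illustrates that `destination` is an alias, and that the name which has to resolve on the network is its `HostName` value.

```python
from typing import Dict, List

# The configuration quoted in the report above.
SSH_CONFIG = """\
Host proxy
    HostName proxy.clientA.com
    User pepito

Host destination
    HostName dest.clientA.com
    User pepito
    ProxyJump proxy
"""


def parse_ssh_config(text: str) -> Dict[str, Dict[str, str]]:
    """Map each literal Host alias to its options (keys lower-cased).

    Simplified on purpose: literal aliases only, whitespace-separated
    "Key Value" lines only, first value for a key wins.
    """
    hosts: Dict[str, Dict[str, str]] = {}
    current: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "host":
            current = value.split()
            for alias in current:
                hosts.setdefault(alias, {})
        else:
            for alias in current:
                hosts[alias].setdefault(key, value)
    return hosts


config = parse_ssh_config(SSH_CONFIG)
print(config["destination"]["hostname"])   # the name that must resolve via DNS
print(config["destination"]["proxyjump"])  # the jump host alias
```

If a client skips this lookup and hands the bare alias to the resolver, a name-resolution failure is the expected outcome. Whether that is what happens inside dvc here is not established at this point in the thread.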

On the dvc side, we have configured the remote in our repo with the following .dvc/config:

[core]
    remote = bin-remote
['remote "bin-remote"']
    url = ssh://destination:/work/projects/models-bin/

However, the dvc fetch/pull commands fail with the following error:

ERROR: unexpected error - [Errno -2] Name or service not known
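For context, this message comes from the system resolver (`getaddrinfo`), as the traceback further down confirms with `socket.gaierror`. A minimal sketch of the same check, using only the standard library, is shown here; the helper name `can_resolve` is hypothetical and port 22 is an assumption (the default SSH port).

```python
import socket


def can_resolve(host: str, port: int = 22) -> bool:
    """Return True if the system resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # Same exception family as "[Errno -2] Name or service not known".
        return False
    return True
```

Calling `can_resolve("destination")` on the contractorB machine would be expected to return False if `destination` exists only as an alias in ~/.ssh/config, while `can_resolve("dest.clientA.com")` would show whether the real host name is resolvable from there at all.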

The first thing I did was to make sure that the paths were written correctly. For instance, I removed the : in the url of the .dvc/config, between "destination" and "/work":

    url = ssh://destination/work/projects/models-bin/

This did not solve the problem.
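One way to see why the colon is unlikely to matter is to split both URL forms with Python's standard `urllib.parse`. This is only a sketch: dvc's own URL handling may differ, and the helper name `host_and_path` is hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def host_and_path(url: str) -> Tuple[Optional[str], str]:
    """Split an ssh:// URL into (hostname, path) using the standard library."""
    parts = urlsplit(url)
    return parts.hostname, parts.path


with_colon = host_and_path("ssh://destination:/work/projects/models-bin/")
without_colon = host_and_path("ssh://destination/work/projects/models-bin/")

print(with_colon)
print(without_colon)
```

Under this parser both forms yield the host `destination` and the path `/work/projects/models-bin/`, so the host name handed to the connection layer is the same either way.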

Next, to rule out issues with the ssh/sftp connections, I decided to manually copy the /work/projects/models-bin/ folder from clientA to my contractorB computer using three different methods: scp, rsync and sftp.

# Method 1 (works fine)
scp -r pepito@destination:/work/projects/models-bin/ .

# Method 2 (works fine)
mkdir models-bin
rsync -avul pepito@destination:/work/projects/models-bin/ models-bin/

# Method 3 (works fine)
sftp destination
cd /work/projects/models-bin/
lmkdir models-bin/
lcd models-bin
get -r *

The three methods work properly: the "dvc-like" files appear at contractorB side (e.g., models-bin/b0/26324c6904b2a9cb4b88d6d61c81d1). This is what led me to think that the issue might be on the dvc side, and not so much on the connection or the paths. Note that the path used in the three manual methods is exactly the same (copy-pasted).

Expected

Be able to locate the files.

Environment information

Output of dvc doctor:

DVC version: 2.57.2 (pip)
-------------------------
Platform: Python 3.8.15 on Linux-5.10.0-0.bpo.7-amd64-x86_64-with-glibc2.10
Subprojects:
        dvc_data = 0.51.0
        dvc_objects = 0.22.0
        dvc_render = 0.4.0
        dvc_task = 0.2.1
        scmrepo = 1.0.3
Supports:
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        ssh (sshfs = 2023.4.1),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8)
Config:
        Global: /home/pepito/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: ssh
Workspace directory: xfs on /dev/etherd/e1.2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/f4ebe3b9398bbf3512d2cac9e029947f

Additional Information:

Output of dvc fetch -v:

(env_JAR) [pepito@contractorB repo]$ dvc fetch -v
2023-05-18 17:02:23,975 DEBUG: v2.57.2 (pip), CPython 3.8.15 on Linux-5.10.0-0.bpo.7-amd64-x86_64-with-glibc2.10
2023-05-18 17:02:23,975 DEBUG: command: /home/pepito/.conda/envs/env_JAR/bin/dvc fetch -v
2023-05-18 17:02:24,349 DEBUG: Preparing to transfer data from '/work/projects/models-bin/' to '/contractorB/pepito/repo/.dvc/cache'
2023-05-18 17:02:24,349 DEBUG: Preparing to collect status from '/contractorB/pepito/repo/.dvc/cache'
2023-05-18 17:02:24,349 DEBUG: Collecting status from '/contractorB/pepito/repo/.dvc/cache'
2023-05-18 17:02:24,350 DEBUG: Preparing to collect status from '/work/projects/models-bin/'                                                                                                                                                                                                                 
2023-05-18 17:02:24,350 DEBUG: Collecting status from '/work/projects/models-bin/'
2023-05-18 17:02:24,350 DEBUG: Querying 1 oids via object_exists                                                                                                                                                                                                                                                       
2023-05-18 17:02:28,418 DEBUG: Preparing to transfer data from '/work/projects/models-bin/' to '/contractorB/pepito/repo/.dvc/cache'                                                                                                       
2023-05-18 17:02:28,418 DEBUG: Preparing to collect status from '/contractorB/pepito/repo/.dvc/cache'
2023-05-18 17:02:28,418 DEBUG: Collecting status from '/contractorB/pepito/repo/.dvc/cache'
2023-05-18 17:02:28,420 DEBUG: Preparing to collect status from '/work/projects/models-bin/'                                                                                                                                                                                                                 
2023-05-18 17:02:28,420 DEBUG: Collecting status from '/work/projects/models-bin/'
2023-05-18 17:02:32,027 ERROR: unexpected error - [Errno -2] Name or service not known                                                                                                                                                                                                                                 
Traceback (most recent call last):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/cli/__init__.py", line 210, in main
    ret = cmd.do_run()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/cli/command.py", line 26, in do_run
    return self.run()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/commands/data_sync.py", line 84, in run
    processed_files_count = self.repo.fetch(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/__init__.py", line 65, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/fetch.py", line 86, in fetch
    d, f = _fetch(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/fetch.py", line 166, in _fetch
    d, f = repo.cloud.pull(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/data_cloud.py", line 181, in pull
    return self.transfer(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/data_cloud.py", line 135, in transfer
    return transfer(src_odb, dest_odb, objs, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/transfer.py", line 203, in transfer
    status = compare_status(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 189, in compare_status
    src_exists, src_missing = status(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 149, in status
    odb.oids_exist(hashes, jobs=jobs, progress=pbar.callback)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 411, in oids_exist
    remote_size, remote_oids = self._estimate_remote_size(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 293, in _estimate_remote_size
    remote_oids = set(iter_with_pbar(oids))
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 283, in iter_with_pbar
    for oid in oids:
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 249, in _oids_with_limit
    for oid in self._list_oids(prefixes=prefixes, jobs=jobs):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 236, in _list_oids
    for path in self._list_prefixes(prefixes=prefixes, jobs=jobs):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 216, in _list_prefixes
    yield from self.fs.find(paths, batch_size=jobs, prefix=prefix)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/fs/base.py", line 429, in find
    yield from self.fs.find(path)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/funcy/objects.py", line 50, in __get__
    return prop.__get__(instance, type)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_ssh/__init__.py", line 119, in fs
    return _SSHFileSystem(**self.fs_args)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/spec.py", line 76, in __call__
    obj = super().__call__(*args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/spec.py", line 66, in __init__
    self._client, self._pool = self.connect(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/tasks.py", line 494, in wait_for
    return fut.result()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/utils.py", line 27, in wrapper
    return await func(*args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/spec.py", line 83, in _connect
    client = await self._stack.enter_async_context(_raw_client)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/contextlib.py", line 568, in enter_async_context
    result = await _cm_type.__aenter__(cm)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/misc.py", line 274, in __aenter__
    self._coro_result = await self._coro
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 8042, in connect
    return await asyncio.wait_for(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 430, in _connect
    _, session = await loop.create_connection(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 986, in create_connection
    infos = await self._ensure_resolved(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 1365, in _ensure_resolved
    return await loop.getaddrinfo(host, port, family=family, type=type,
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 825, in getaddrinfo
    return await self.run_in_executor(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

2023-05-18 17:02:32,075 DEBUG: Version info for developers:
DVC version: 2.57.2 (pip)
-------------------------
Platform: Python 3.8.15 on Linux-5.10.0-0.bpo.7-amd64-x86_64-with-glibc2.10
Subprojects:
        dvc_data = 0.51.0
        dvc_objects = 0.22.0
        dvc_render = 0.4.0
        dvc_task = 0.2.1
        scmrepo = 1.0.3
Supports:
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        ssh (sshfs = 2023.4.1),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8)
Config:
        Global: /home/pepito/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: ssh
Workspace directory: xfs on /dev/etherd/e1.2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/f4ebe3b9398bbf3512d2cac9e029947f

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-05-18 17:02:32,076 DEBUG: Analytics is enabled.
2023-05-18 17:02:32,111 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpg1yozo9a']'
2023-05-18 17:02:32,112 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpg1yozo9a']'

efiop commented 1 year ago

Hi @Jalmenara, thank you for the detailed report!

Would you be willing to try using https://github.com/ronf/asyncssh directly to see if you could narrow it down to either asyncssh or dvc up the stack?

Jalmenara commented 1 year ago

Hello @efiop . Thank you for the quick answer. Indeed, I think that I have found something with asyncssh, although we must confirm. These are the details:

  1. First, I ran pip install asyncssh in the conda environment. It changed nothing, since every requirement was already satisfied (this is the output; you can see the versions):
Requirement already satisfied: asyncssh in /home/.conda/envs/env_JAR/lib/python3.8/site-packages (2.13.1)
Requirement already satisfied: typing-extensions>=3.6 in /home/.conda/envs/env_JAR/lib/python3.8/site-packages (from asyncssh) (4.4.0)
Requirement already satisfied: cryptography>=3.1 in /home/.conda/envs/env_JAR/lib/python3.8/site-packages (from asyncssh) (38.0.4)
Requirement already satisfied: cffi>=1.12 in /home/.conda/envs/env_JAR/lib/python3.8/site-packages (from cryptography>=3.1->asyncssh) (1.15.1)
Requirement already satisfied: pycparser in /home/.conda/envs/env_JAR/lib/python3.8/site-packages (from cffi>=1.12->cryptography>=3.1->asyncssh) (2.21)

Then, I adapted some basic examples of the docs and tested them: https://asyncssh.readthedocs.io/en/stable/#client-examples

  2. Basic connection: It runs fine (prints to the console the b0 folder I mentioned in the opening post)

import asyncio, asyncssh, sys

async def run_client() -> None:
    async with asyncssh.connect('destination') as conn:
        result = await conn.run('ls /work/projects/models-bin/', check=True)
        print(result.stdout, end='')

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))


3. Copy through SFTP: It ~~fails. We have found something~~ runs (see update).
The following runs fine (direct copy of the hash file with `sftp.get()`):

import asyncio, asyncssh, sys

async def run_client() -> None:
    async with asyncssh.connect('destination') as conn:
        async with conn.start_sftp_client() as sftp:
            await sftp.get('/work/projects/models-bin/b0/26324c6904b2a9cb4b88d6d61c81d1')

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SFTP operation failed: ' + str(exc))

~~However, changing the `get` line to the whole folder (recursively) throws an error:~~ (see update below)

await sftp.get('/work/projects/models-bin/', preserve=True, recurse=True) # Causes an error

Exception has occurred: SystemExit
SFTP operation failed: [Errno 2] No such file or directory: b''

~~So it seems that there is a problem with the browsing of the folders (it looks for some weird b'' directory, without the 0 at the end). I think this is weird, since the situation is the basic one for dvc.~~

OK, small update. I tested sftp.get without the slash at the end and it worked. Also, the preserve option makes no difference.

await sftp.get('/work/projects/models-bin', recurse=True) 

This led me to try to remove the end slash at the .dvc/config file:

[core]
    remote = bin-remote
['remote "bin-remote"']
    url = ssh://destination/work/projects/models-bin

However, this did not solve the problem.

How exactly does dvc use asyncssh? What is the exact copying method used under the hood?

Jalmenara commented 1 year ago

Additional info:

  4. Copy through scp: it works

import asyncio, asyncssh, sys

async def run_client() -> None:
    await asyncssh.scp('destination:/work/projects/models-bin', '.', recurse=True)

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SFTP operation failed: ' + str(exc))

efiop commented 1 year ago

@Jalmenara Thanks for trying it out! We use asyncssh through https://github.com/fsspec/sshfs, and to get a file we use this: https://github.com/fsspec/sshfs/blob/a62fd30cfcf55ef74345a0cc398f5779a1577ffa/sshfs/spec.py#L171. But your original traceback seems to show that it was the connection step itself that failed.

Jalmenara commented 1 year ago

@efiop, thanks for your time! OK, so the thing is... how is it possible that connecting is failing, if both the scp and sftp methods of asyncssh work fine? What exactly do you mean by connecting?

efiop commented 1 year ago

@Jalmenara I mean your initial log seems to point us to connection failing:

  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 8042, in connect
    return await asyncio.wait_for(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 430, in _connect
    _, session = await loop.create_connection(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 986, in create_connection
    infos = await self._ensure_resolved(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 1365, in _ensure_resolved
    return await loop.getaddrinfo(host, port, family=family, type=type,
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 825, in getaddrinfo
    return await self.run_in_executor(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

Are you sure you are still getting the same kind of error?

Jalmenara commented 1 year ago

Ah, alright. Sorry for my bad understanding... @efiop

Yes, I am still getting the same error, even after the tests with scp, sftp, etc. This is another log from dvc fetch -v (I just got it). How could I debug the call to asyncio.create_connection() in the context of the dvc fetch command?

2023-05-22 18:29:08,530 DEBUG: v2.57.2 (pip), CPython 3.8.15 on Linux-5.10.0-0.bpo.7-amd64-x86_64-with-glibc2.10
2023-05-22 18:29:08,530 DEBUG: command: /home/pepito/.conda/envs/env_JAR/bin/dvc fetch -v
2023-05-22 18:29:09,384 DEBUG: Preparing to transfer data from '/work/projects/models-bin/mag_vv_cry-sdyn/' to '/projects/pepito/mag_vv_cry-sdyn/.dvc/cache'
2023-05-22 18:29:09,384 DEBUG: Preparing to collect status from '/projects/pepito/mag_vv_cry-sdyn/.dvc/cache'
2023-05-22 18:29:09,384 DEBUG: Collecting status from '/projects/pepito/mag_vv_cry-sdyn/.dvc/cache'
2023-05-22 18:29:09,390 DEBUG: Preparing to collect status from '/work/projects/models-bin/mag_vv_cry-sdyn/'                                                                                            
2023-05-22 18:29:09,390 DEBUG: Collecting status from '/work/projects/models-bin/mag_vv_cry-sdyn/'
2023-05-22 18:29:09,925 ERROR: unexpected error - [Errno -2] Name or service not known                                                                                                                            
Traceback (most recent call last):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/cli/__init__.py", line 210, in main
    ret = cmd.do_run()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/cli/command.py", line 26, in do_run
    return self.run()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/commands/data_sync.py", line 84, in run
    processed_files_count = self.repo.fetch(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/__init__.py", line 65, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/fetch.py", line 86, in fetch
    d, f = _fetch(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/repo/fetch.py", line 166, in _fetch
    d, f = repo.cloud.pull(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/data_cloud.py", line 181, in pull
    return self.transfer(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc/data_cloud.py", line 135, in transfer
    return transfer(src_odb, dest_odb, objs, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/transfer.py", line 203, in transfer
    status = compare_status(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 189, in compare_status
    src_exists, src_missing = status(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 149, in status
    odb.oids_exist(hashes, jobs=jobs, progress=pbar.callback)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 411, in oids_exist
    remote_size, remote_oids = self._estimate_remote_size(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 293, in _estimate_remote_size
    remote_oids = set(iter_with_pbar(oids))
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 283, in iter_with_pbar
    for oid in oids:
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 249, in _oids_with_limit
    for oid in self._list_oids(prefixes=prefixes, jobs=jobs):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 236, in _list_oids
    for path in self._list_prefixes(prefixes=prefixes, jobs=jobs):
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/db.py", line 216, in _list_prefixes
    yield from self.fs.find(paths, batch_size=jobs, prefix=prefix)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_objects/fs/base.py", line 429, in find
    yield from self.fs.find(path)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/funcy/objects.py", line 50, in __get__
    return prop.__get__(instance, type)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/dvc_ssh/__init__.py", line 119, in fs
    return _SSHFileSystem(**self.fs_args)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/spec.py", line 76, in __call__
    obj = super().__call__(*args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/spec.py", line 66, in __init__
    self._client, self._pool = self.connect(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/tasks.py", line 494, in wait_for
    return fut.result()
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/utils.py", line 27, in wrapper
    return await func(*args, **kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/sshfs/spec.py", line 83, in _connect
    client = await self._stack.enter_async_context(_raw_client)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/contextlib.py", line 568, in enter_async_context
    result = await _cm_type.__aenter__(cm)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/misc.py", line 274, in __aenter__
    self._coro_result = await self._coro
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 8042, in connect
    return await asyncio.wait_for(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/site-packages/asyncssh/connection.py", line 430, in _connect
    _, session = await loop.create_connection(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 986, in create_connection
    infos = await self._ensure_resolved(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 1365, in _ensure_resolved
    return await loop.getaddrinfo(host, port, family=family, type=type,
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/asyncio/base_events.py", line 825, in getaddrinfo
    return await self.run_in_executor(
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/pepito/.conda/envs/env_JAR/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

2023-05-22 18:29:09,991 DEBUG: Version info for developers:
DVC version: 2.57.2 (pip)
-------------------------
Platform: Python 3.8.15 on Linux-5.10.0-0.bpo.7-amd64-x86_64-with-glibc2.10
Subprojects:
        dvc_data = 0.51.0
        dvc_objects = 0.22.0
        dvc_render = 0.4.0
        dvc_task = 0.2.1
        scmrepo = 1.0.3
Supports:
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        ssh (sshfs = 2023.4.1),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8)
Config:
        Global: /home/pepito/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: ssh
Workspace directory: xfs on /dev/etherd/e1.2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/d2366a2b2ee6d6ec2fdbd02e25c33a47

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-05-22 18:29:09,992 DEBUG: Analytics is enabled.
2023-05-22 18:29:10,030 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpqj37j1ff']'
2023-05-22 18:29:10,032 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpqj37j1ff']'

pmrowla commented 1 year ago

If you run dvc with --pdb (like dvc fetch --pdb ...), it will drop you into a PDB shell when the exception is raised.

The getaddrinfo call is failing, you should check that host and port are what you would expect given your remote and SSH config.

i.e. assuming the PDB session shows you the traceback for

    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):

you can just do

(Pdb) host

to see the value of host (and same for the other variables at that point in the traceback)
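As a non-interactive alternative to the PDB session, the arguments reaching `getaddrinfo` can also be recorded by temporarily wrapping `socket.getaddrinfo`. This is a sketch under assumptions: the helper name `trace_getaddrinfo` is hypothetical, and it only observes calls made in the same Python process, so dvc would have to be run in-process rather than as a separate command.

```python
import contextlib
import socket
from typing import Any, Iterator, List, Tuple


@contextlib.contextmanager
def trace_getaddrinfo() -> Iterator[List[Tuple[Any, Any]]]:
    """Record the (host, port) of every socket.getaddrinfo call in the block."""
    calls: List[Tuple[Any, Any]] = []
    original = socket.getaddrinfo

    def wrapper(host, port, *args, **kwargs):
        calls.append((host, port))
        return original(host, port, *args, **kwargs)

    socket.getaddrinfo = wrapper
    try:
        yield calls
    finally:
        # Always restore the real function, even if the lookup raised.
        socket.getaddrinfo = original


# Demonstration with a numeric address, which needs no DNS server.
with trace_getaddrinfo() as calls:
    socket.getaddrinfo("127.0.0.1", 22)

print(calls)
```

If the recorded host turns out to be the bare alias `destination` rather than `dest.clientA.com`, that would point at the SSH config not being applied before the connection is attempted.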