iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

dvc fetch: hangs forever on "Fetching" #10410

Closed willcray closed 1 month ago

willcray commented 2 months ago

Bug Report

Issue name

dvc fetch: hangs forever on "Fetching"

Description

Running dvc fetch hangs forever. I've left it running for several hours on a Mac and two Ubuntu 22.04 machines. This didn't use to happen. I'm using a GCP versioned bucket as the remote. I can download the files with gsutil using the same credential file that the DVC project uses.

Reproduce

dvc pull -v

Expected

I would expect the data to start being fetched from the remote to the local cache.

Environment information

Ubuntu 22.04

Output of dvc doctor:

$ dvc doctor
DVC version: 3.50.0 (pip)
-------------------------
Platform: Python 3.10.9 on Linux-5.15.0-69-generic-x86_64-with-glibc2.35
Subprojects:
    dvc_data = 3.15.1
    dvc_objects = 5.1.0
    dvc_render = 1.0.2
    dvc_task = 0.4.0
    scmrepo = 3.3.1
Supports:
    gs (gcsfs = 2024.3.1),
    http (aiohttp = 3.9.5, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.9.5, aiohttp-retry = 2.8.3)
Config:
    Global: /home/dexterity/.config/dvc
    System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p2
Caches: local
Remotes: gs
Workspace directory: ext4 on /dev/nvme0n1p2
Repo: dvc (subdir), git
Repo.site_cache_dir: /var/tmp/dvc/repo/794e4964c06325c359d2b9a7d7e63f2c

Additional Information (if any):

dberenbaum commented 2 months ago

Are you using a version_aware remote?

willcray commented 1 month ago

> Are you using a version_aware remote?

Yes:

[core]
    remote = gcp-remote
    autostage = true
['remote "gcp-remote"']
    url = gs://datasets/project
    version_aware = true
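
For anyone checking whether their own project is affected, the following is a minimal sketch that scans DVC config text for remotes with version_aware enabled. The helper name `find_version_aware_remotes` and the sample string are illustrative and not part of DVC. The parser only handles the simple layout shown above, and a real project may also set this option in .dvc/config.local.

```python
import re

# Sample config copied from this thread; in a real project the text would be
# read from .dvc/config (and possibly .dvc/config.local) instead.
SAMPLE_CONFIG = """\
[core]
    remote = gcp-remote
    autostage = true
['remote "gcp-remote"']
    url = gs://datasets/project
    version_aware = true
"""

# Matches section headers such as ['remote "gcp-remote"'] or [remote "gcp-remote"].
SECTION_RE = re.compile(r"""^\[\s*'?remote\s+"(?P<name>[^"]+)"'?\s*\]$""")


def find_version_aware_remotes(config_text):
    """Return the names of remotes whose section sets version_aware = true."""
    current = None  # name of the remote section being read, if any
    found = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("["):
            match = SECTION_RE.match(line)
            current = match.group("name") if match else None
            continue
        if current is None or "=" not in line:
            continue  # option outside a remote section, or not key = value
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "version_aware" and value.lower() == "true":
            found.append(current)
    return found


print(find_version_aware_remotes(SAMPLE_CONFIG))
```

With the config from this thread, the function reports gcp-remote as the only version_aware remote.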

dberenbaum commented 1 month ago

See the comment in https://github.com/iterative/dvc/issues/10306#issuecomment-2079631194:

Took a look and seems this is caused by this commit:

iterative/dvc-data@f398036#diff-89f845ba2a0911623cfc247bbdb34218d79bbbacec33b3af2621d64a24d28557

Unfortunately, I don't see a quick fix, and we are moving towards dropping support for version-aware remotes due to lots of small issues and inconsistencies like this one, so I am going to close this and suggest using traditional remotes to avoid these problems.

Also related: https://github.com/iterative/dvc/issues/9968

dberenbaum commented 1 month ago

Sorry for the inconvenience @willcray but I suggest moving to a traditional remote. We are not continuing to support version_aware remotes, so I'm closing this issue.

willcray commented 1 month ago

@dberenbaum that is unfortunate. We will likely move away from DVC now that version_aware remotes are no longer supported.