iterative / dvc

🦉 Data Versioning and ML Experiments
https://dvc.org
Apache License 2.0

Latest release of DVC breaks object versioned configs #10440

Closed: SkafteNicki closed this issue 6 months ago

SkafteNicki commented 6 months ago

Bug Report

Description

The latest release of DVC, v3.51.0, breaks workflows where data is stored on a remote bucket with version_aware = true. This appears to be caused by the following PR, which is included in that release: https://github.com/iterative/dvc/pull/10433

Here is a CI run from a week ago with version v3.50.3 of dvc, which succeeds: https://github.com/SkafteNicki/example_mlops/actions/runs/9112178019/job/25050892637

Here is a CI run from today with version v3.51.0 of dvc, which fails with error:

ERROR: failed to pull data from the cloud - config file error: 'fetch --run-cache' is unsupported for cloud versioned remotes: config file error: 'fetch --run-cache' is unsupported for cloud versioned remotes

https://github.com/SkafteNicki/example_mlops/actions/runs/9222033174/job/25372344887

Nothing has changed in the data, the DVC config, or anything else; the only difference is the DVC version used by the CI.

Changing dvc pull to dvc pull --no-run-cache fixes the issue: https://github.com/SkafteNicki/example_mlops/actions/runs/9222304205/job/25373194648

Reproduce

Expected

I have already found a workaround for this problem; however, I would not have expected such a breaking change in a minor release of DVC. I recommend that the maintainers document that the --no-run-cache argument must be added when version_aware = true is set in the DVC config. Alternatively, could this be detected from the config and set automatically?
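
Until something like that exists in DVC itself, the detection could be approximated on the user side in CI. This is only a minimal sketch, not DVC behavior: pull_command is a hypothetical helper that is not part of DVC, and it only maps a version_aware value to the pull command that worked for me.

```shell
#!/bin/sh
# Hypothetical helper (not part of DVC): print the pull command to use
# for a given version_aware value read from the DVC config.
pull_command() {
    case "$1" in
        # Version-aware remote: skip the run cache, as in the workaround above.
        true|True|TRUE) echo "dvc pull --no-run-cache" ;;
        # Anything else (false, empty, unset): plain pull.
        *) echo "dvc pull" ;;
    esac
}
```

In a CI step this could be used as $(pull_command "$(dvc config remote.remote_storage.version_aware)"), assuming the remote is named remote_storage and that dvc config prints the stored value for that option. If the option is missing, the value is empty and the sketch falls back to a plain dvc pull.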

On a side note, it seems that all references in the documentation to making a remote storage version-aware are gone. The relevant page for this information, https://dvc.org/doc/user-guide/data-management/cloud-versioning#cloud-versioning, does not actually show the command for it:

dvc remote modify remote_storage version_aware true

Environment information

Output of dvc doctor:

$ dvc doctor

Additional Information (if any):

shcheklein commented 6 months ago

Seems DVC builds are also broken here https://github.com/iterative/dvc-s3-repo