iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

repro --pull implies --allow-missing #10434

Closed dberenbaum closed 1 month ago

dberenbaum commented 1 month ago

See https://github.com/iterative/dvc/issues/10412#issuecomment-2103410493. When we introduced --allow-missing, we also updated the behavior of --pull, but kept them as separate flags for a couple of reasons (see discussion here).

Since then, every user I have encountered expects the behavior to be what --pull --allow-missing does. I don't see any valid case where someone using --pull would not want --allow-missing (we are basically already implying this by treating missing data as needing to be pulled).

dberenbaum commented 1 month ago

This raised a couple other issues when I started looking deeper:

  1. With this PR, --pull may skip unchanged stages without pulling them, since --allow-missing is intended to skip these stages entirely. This saves time pulling that data, but it may also be unexpected and more breaking than if we pull unchanged stages.
  2. Regardless of this PR, --allow-missing will skip pull and checkout for unchanged stages, but it will still check out unchanged stages from the local cache. This may also be expensive, and arguably we should skip checkout for these.

I'm working on an alternate PR that would address these questions.
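
To make the two concerns above concrete, here is a toy Python model of what `dvc repro --pull` might do for a single unchanged stage whose outputs are missing from the workspace. It is only a sketch of the behavior as described in this thread, not DVC's actual implementation: the function `plan_unchanged_stage`, its parameters, and the action names are all hypothetical.

```python
from typing import List


def plan_unchanged_stage(
    in_local_cache: bool,
    allow_missing: bool,
    pull_implies_allow_missing: bool = False,
) -> List[str]:
    """Toy model (hypothetical, not DVC code).

    Returns the actions taken under `dvc repro --pull` for an UNCHANGED
    stage whose outputs are missing from the workspace.
    """
    if allow_missing or pull_implies_allow_missing:
        # --allow-missing is meant to skip unchanged stages entirely,
        # so nothing is pulled from the remote (concern 1).
        # Data already in the local cache is still checked out (concern 2).
        return ["checkout"] if in_local_cache else []
    # Plain --pull: missing data is treated as needing to be pulled.
    return ["checkout"] if in_local_cache else ["pull", "checkout"]


# Plain --pull today, data only on the remote.
print(plan_unchanged_stage(in_local_cache=False, allow_missing=False))
# Same invocation if --pull implied --allow-missing (concern 1).
print(plan_unchanged_stage(False, False, pull_implies_allow_missing=True))
# --pull --allow-missing with data in the local cache (concern 2).
print(plan_unchanged_stage(in_local_cache=True, allow_missing=True))
```

In this model, the first two calls differ only in whether --pull implies --allow-missing, which is why the change could be breaking for users who rely on --pull alone to fetch everything.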
