Closed dberenbaum closed 1 month ago
This raised a couple other issues when I started looking deeper:
1. With this PR, `--pull` may skip unchanged stages without pulling them, since `--allow-missing` is intended to skip these stages entirely. This saves time pulling that data, but it may also be unexpected and more of a breaking change than if we pull unchanged stages.
2. Regardless of this PR, `--allow-missing` will skip pull and checkout for unchanged stages, but it will still check out unchanged stages from the local cache. This may also be expensive, and arguably we should skip checkout for these.
I'm working on an alternate PR that would address these questions.
See https://github.com/iterative/dvc/issues/10412#issuecomment-2103410493. When we introduced `--allow-missing`, we also updated the behavior of `--pull`, but kept them as separate flags for a couple reasons (see discussion here):

- [...] `--pull` too much

Since then, every user I have encountered expects the behavior to be what `--pull --allow-missing` does. I don't see any valid case where someone using `--pull` would not want `--allow-missing` (we are basically already implying this by treating missing data as needing to be pulled).
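
To make the proposal concrete, here is a minimal sketch of the rule "`--pull` implies `--allow-missing`". It is a toy model, not DVC's implementation: `ReproFlags` and `normalize` are hypothetical names invented for illustration, and the sketch says nothing about how DVC actually parses or applies these flags.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReproFlags:
    """Hypothetical container for the two flags discussed in this PR."""

    pull: bool = False
    allow_missing: bool = False


def normalize(flags: ReproFlags) -> ReproFlags:
    """Apply the proposed rule: --pull implies --allow-missing."""
    if flags.pull and not flags.allow_missing:
        # Treat a bare --pull as if --allow-missing had also been passed.
        return replace(flags, allow_missing=True)
    # --allow-missing on its own is left untouched; it does not imply --pull.
    return flags
```

Under this model, `normalize(ReproFlags(pull=True))` and `normalize(ReproFlags(pull=True, allow_missing=True))` yield the same flags, which is the behavior users are said to expect above.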