Open dberenbaum opened 1 year ago
We are very interested in this feature. We run long on-commit dvc pipelines in CI by means of dvc repro, and when they fail we currently have to rerun everything from scratch. It would be great if intermediate results were downloadable from the remote dvc cache.
Furthermore, we experimented a bit with cloud parallelisation of pipeline stages, i.e. a stage that looks like a normal stage to dvc but actually starts various cloud jobs. It would be great if there was a way for those jobs to call dvc pull and get the intermediate results of the previous stages. Leaving aside for a moment the question of how to transfer the dvc.lock file to the remote workers and how to funnel back the results of the stages, it feels like intermediate pushes would open up many workarounds for these cases. Of course it might seem like a far-fetched scenario, but maybe it's another case in point in favour of this feature.
+1 for this feature
Auto-pushing checkpoints was introduced to make it easier to recover long-running model training jobs in CI. For long-running processing jobs over multiple pipeline stages, the same behavior should be available at the end of each stage in the pipeline.
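Until something like this exists, the per-stage behavior can be roughly approximated from a CI script. The sketch below is only an illustration, not how DVC implements (or would implement) the feature: the helper names `run_checked` and `repro_with_intermediate_pushes` are made up, the stage names are placeholders, and it assumes that `dvc repro <stage>` followed by `dvc push --run-cache <stage>` is enough to make a finished stage's outputs and run cache available on the remote.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical type alias: anything that executes a command given as a list of args.
Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def repro_with_intermediate_pushes(
    stages: Sequence[str], run: Runner = run_checked
) -> List[str]:
    """Reproduce stages one by one, pushing after each successful stage.

    `stages` is assumed to be listed in dependency order. If a stage fails,
    the exception propagates, but everything pushed for earlier stages
    stays on the remote.
    """
    completed: List[str] = []
    for stage in stages:
        # Reproduce this stage (dvc skips upstream stages that are unchanged).
        run(["dvc", "repro", stage])
        # Assumption: pushing the stage's outputs plus the run cache is what
        # lets a later CI run skip this stage instead of recomputing it.
        run(["dvc", "push", "--run-cache", stage])
        completed.append(stage)
    return completed
```

In CI this would be called with the real stage names, for example `repro_with_intermediate_pushes(["prepare", "train"])`. A rerun after a failure would presumably start with `dvc pull --run-cache` so that already completed stages can be restored rather than recomputed, though whether that works cleanly depends on the pipeline and the DVC version.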