iterative / dvc

🦉 Data Versioning and ML Experiments
https://dvc.org
Apache License 2.0

add: performance and reliability issues #6227

Closed. skshetry closed this issue 2 months ago.

skshetry commented 3 years ago

https://github.com/iterative/dvc/blob/4e792ae61c5927ab2e5f6a6914d985d43aa705b4/dvc/repo/add.py#L266

pared commented 3 years ago

DVC uses move-then-checkout logic. It moves the file from the workspace to the cache and then checks it out again, rather than just using copy.

Wasn't this intended to enforce the cache link type? I guess it would make sense in the case of copy, but what about the other link types?
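
To make the difference between the two strategies concrete, here is a minimal sketch. It is not DVC's actual code: the helper names `add_move_then_checkout` and `add_copy`, the flat cache layout keyed by MD5, and the use of `os.link` as the checkout step are all assumptions made for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def file_md5(path):
    """Return the hex MD5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as fobj:
        for chunk in iter(lambda: fobj.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_move_then_checkout(workspace_file, cache_dir, link=os.link):
    """Hypothetical helper: move the file into the cache, then link it back."""
    cache_path = os.path.join(cache_dir, file_md5(workspace_file))
    # Step 1: the file leaves the workspace. If the process is interrupted
    # after this line, the workspace file is missing until it is checked out.
    shutil.move(workspace_file, cache_path)
    # Step 2: the "checkout" recreates the workspace file with the chosen
    # link type (a hardlink here, which needs the same filesystem).
    link(cache_path, workspace_file)
    return cache_path


def add_copy(workspace_file, cache_dir):
    """Hypothetical helper: copy into the cache; the workspace file stays put."""
    cache_path = os.path.join(cache_dir, file_md5(workspace_file))
    shutil.copyfile(workspace_file, cache_path)
    return cache_path


# Small usage example in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cache_dir = os.path.join(tmp, "cache")
    os.mkdir(cache_dir)
    data = os.path.join(tmp, "data.bin")
    with open(data, "wb") as fobj:
        fobj.write(b"example")
    cached = add_move_then_checkout(data, cache_dir)
    print(os.path.samefile(data, cached))
```

In this sketch the move-based variant leaves the workspace and the cache sharing one copy of the data, while the copy-based variant never removes the workspace file but stores the data twice.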

skshetry commented 3 years ago

For other link types, what I suggested was changing the copy behaviour to a move + link that works atomically. @efiop also suggested using hardlinks instead.
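
One possible reading of the hardlink suggestion, sketched under assumptions: the cache entry is first created as a hardlink under a temporary name and then renamed into place, so the workspace file is never removed. The helper name `add_via_hardlink` and the temporary naming scheme are hypothetical and not taken from DVC.

```python
import os
import tempfile
import uuid


def add_via_hardlink(workspace_file, cache_path):
    """Hypothetical helper: fill cache_path without removing workspace_file."""
    tmp_path = f"{cache_path}.{uuid.uuid4().hex}.tmp"
    # A hardlink gives the cache a second name for the same data, so the
    # workspace file stays in place throughout. Both paths must be on the
    # same filesystem, otherwise os.link raises OSError.
    os.link(workspace_file, tmp_path)
    # os.replace renames the temporary name onto the final cache path;
    # on POSIX a successful rename is atomic.
    os.replace(tmp_path, cache_path)
    return cache_path


# Small usage example in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "data.bin")
    with open(data, "wb") as fobj:
        fobj.write(b"example")
    cached = add_via_hardlink(data, os.path.join(tmp, "cache_entry"))
    print(os.path.samefile(data, cached))
```

Because the workspace path exists at every step, an interruption in this sketch can at worst leave a stray temporary link in the cache directory rather than a missing workspace file.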

dberenbaum commented 2 years ago

@skshetry Do you think we should include this as part of the data epic?

skshetry commented 2 months ago

Closed by

and released in https://github.com/iterative/dvc/releases/tag/3.54.0.