Open dberenbaum opened 2 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 90.46%. Comparing base (
dd2b515
) to head (d510ad6
). Report is 2 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Ping @skshetry
I'll need to investigate. But we set cache.dir
to point to the local cache, which should have made the DVCFileSystem use that cache.
I am a bit hesitant of this PR as it goes against https://github.com/iterative/dvc/pull/9246, and adds another (complex) codepath.
Yes, I looked into it, and the local cache dir is set, but it's not enough because it's doing download/copy and not checkout.
download()
is defined here:
It calls generic.copy()
(https://github.com/iterative/dvc-objects/blob/ae8b95d7ea3bbcf46fc5a09eddc7c4f614023d7b/src/dvc_objects/fs/generic.py#L69), which runs fs.get_file()
operations on paths and knows nothing about the cache.
I don't think there is a much simpler way to do it than this PR, and I tried to make it as narrow as possible. It uses #9246 to rely on the local cache, building the object and passing it to dvc-data's checkout. We don't have imported_file.dvc
, so there is no way to do a higher-level checkout operation AFAICT.
which runs
fs.get_file()
operations on paths and knows nothing about the cache.
fs
is a DVCFileSystem
, and knows about the cache.
fs
is aDVCFileSystem
, and knows about the cache.
Yes, but it is doing a copy from that fs to localfs. It is not linking, and even if I try to enable linking, it fails because it is working across filesystems.
Hello, is there any update with this merge request? It would help us a ton if it gets merged.
@skshetry Is there a good reason to keep this blocked? I'm happy to revert if we find time to resolve in a cleaner way later.
Fixes https://github.com/iterative/dvc/issues/10255. This is a partial reversion of https://github.com/iterative/dvc/pull/9246. I'm not sure this is the cleanest or best way to do it, but it solves the problem.
It builds the hashfile object and tries to checkout from that, falling back to download if it fails.
Note that the build step is still slow for 2.x imports.