[ ] 📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
I've significantly optimized LFS prefetching performance. With this change, `include` in `scmrepo.git.lfs.fetch()` is now a single-item list containing either a Unix filename pattern like `<subdir>/**` or a file path. `scmrepo.git.lfs.fetch._collect_objects()` still enumerates all files in the repo with `fs.find("/")` (that's quite fast even with many files), but `scmrepo.git.lfs.fetch._filter_paths()` only matches those files against a single pattern.
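To illustrate the idea, here is a minimal sketch of matching an enumerated file list against a single-item `include` list. It is not the scmrepo implementation: `filter_paths_sketch` and the example paths are hypothetical, and it assumes the pattern follows the semantics of Python's `fnmatch` module, where `*` also matches `/`.

```python
from fnmatch import fnmatchcase


def filter_paths_sketch(paths, include):
    """Hypothetical stand-in for the filtering step (not scmrepo's code).

    `include` is assumed to be a single-item list holding either a
    Unix filename pattern such as "data/**" or a plain file path.
    """
    (pattern,) = include  # raises ValueError unless exactly one item
    # One pattern check per file, instead of one check per (file, pattern) pair.
    return [p for p in paths if fnmatchcase(p, pattern)]


# Made-up listing standing in for the result of enumerating the repo.
all_files = ["data/a.bin", "data/sub/b.bin", "models/m.pt", "README.md"]

subdir_matches = filter_paths_sketch(all_files, ["data/**"])
file_matches = filter_paths_sketch(all_files, ["models/m.pt"])
```

With a single pattern, the filtering cost grows with the number of files in the repo only, not with the number of requested paths.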
[x] ❗ I have followed the Contributing to DVC checklist.
Partially fixes https://github.com/iterative/scmrepo/issues/338.
/cc @shcheklein