Closed capnfabs closed 4 years ago
Ok, picking this up again. Rather than rely upon the hardlink behaviour, I'm going to use git clone --shared:
When the repository to clone is on the local machine, instead of using hard links, automatically setup .git/objects/info/alternates to share the objects with the source repository. The resulting repository starts out without any object of its own.
This works regardless of whether the temporary directory is on the same mount or not, which is a major advantage over relying upon hardlinks ✨
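For anyone following along, here's a minimal sketch of what --shared sets up. It assumes a throwaway source repo built in a temp directory; the paths, the file name, and the commit identity are all made up for illustration and aren't from the actual project.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch locations, purely for illustration.
work="$(mktemp -d)"
src="$work/src"
dst="$work/dst"

# Build a tiny source repository with a single commit.
git init --quiet "$src"
echo "hello" > "$src/file.txt"
git -C "$src" add file.txt
git -C "$src" -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "initial commit"

# Clone with --shared: objects are borrowed from the source repository
# through an alternates file rather than being copied or hardlinked.
git clone --quiet --shared "$src" "$dst"

# The alternates file points back at the source's object directory.
alternates="$(cat "$dst/.git/objects/info/alternates")"
echo "alternates: $alternates"

# Show how many objects the new clone holds itself, plus its alternates.
git -C "$dst" count-objects -v
```

One thing to keep in mind with this approach: the clone depends on the source repository's objects staying around, so it suits short-lived working copies like a temporary directory.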
Wow, this is working really well and going much faster than I'd expected. I think having five months of noodling around in Rust code has taught me some pretty good habits, and a lot of the skills were transferable.
Still todo:
Just some logging improvements left!
Merged.
Ok, this time I think I've finally got something pretty performant with minimal hacks.
Here's a series of bash commands I used to do this on a repo with nested submodules (https://github.com/capnfabs/grouse/blob/master/test-fixtures/nested-submodules-missing-submodules.zip)
So basically, the idea is:
git submodule update.
Something that I don't love about this is that you can't use git submodule update --depth 1 for nested submodules, because under some circumstances they point at specific commits rather than branch tips (and you generally can't fetch arbitrary commits directly from remotes). This is just a git problem in general, though.
I tested this all on git 2.7.4, which is the current version running on Ubuntu LTS Xenial. I think it would be good to try on git 2.1.4 (Debian Jessie) as well -- that's probably the oldest stable version we need to support, by a long way.
Quick note on performance -- apparently, if the repo being cloned is on the same disk, git uses hardlinks for the object files, so this is mega fast (see Local Protocols in Git On the Server)
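A rough way to check the hardlink behaviour, assuming both repos live under the same temp directory (and so on the same filesystem). Everything here is a made-up scratch setup, not the project's code. If hardlinking isn't possible, git falls back to copying and the count will be zero.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch locations, purely for illustration.
work="$(mktemp -d)"
src="$work/src"
dst="$work/dst"

# Build a tiny source repository with a single commit.
git init --quiet "$src"
echo "hello" > "$src/file.txt"
git -C "$src" add file.txt
git -C "$src" -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "initial commit"

# Plain clone from a local path (no --shared, no file:// URL).
git clone --quiet "$src" "$dst"

# Count object files in the clone that have more than one link,
# i.e. files that share an inode with the source repository.
linked="$(find "$dst/.git/objects" -type f -links +1 | wc -l | tr -d ' ')"
echo "hardlinked object files: $linked"
```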