Closed HaleTom closed 1 year ago
BTW, the argument that repos are small, so it doesn't make much difference, is true... except when it isn't, and those exceptions can be very large.
My average repo size is 31MB:
% pwd
/home/ravi/.cache/aur
% for d in $(fd FETCH_HEAD); do du -sk "$(dirname "$d")/objects"; done
22044 tmux-git/tmux/objects
9488 libva-intel-driver-hybrid/intel-vaapi-driver/objects
4380 intel-hybrid-codec-driver-git/intel-hybrid-driver/objects
32112 zsync2-git/googletest/objects
2956 zsync2-git/args/objects
2312 zsync2-git/cpr/objects
532 zsync2-git/zsync2/objects
472 nvimpager-git/nvimpager/objects
96716 github-cli-git/cli/objects
44 libxxf86misc/libXxf86misc/objects
204 python-qroundprogressbar/QRoundProgressBar/objects
84 asus-wmi-screenpad-dkms-git/asus-wmi-screenpad-dkms-git/objects
1304 bees-git/bees/objects
313160 vcpkg-git/vcpkg-git/objects
1448 s3backer-git/s3backer-git/objects
% for d in $(fd FETCH_HEAD); do du -sk "$(dirname "$d")"; done | awk '{ sum += $1 } END { if (NR > 0) print sum / NR }'
31531.5
%
A hacky work-around:
[ -e "$HOME/bin/makepkg" ] && alias paru='paru --makepkg "$HOME"/bin/makepkg'
#!/bin/bash
# Override git clone for a non-full clone
# Source idea: https://github.com/Jguer/yay/issues/972#issuecomment-602309440
printf "%s: Local makepkg wrapper\n" "$0" >&2
# Override git to modify `clone` behaviour
git() {
    if [[ $# -gt 1 && $1 == 'clone' ]]; then
        echo "${0}: Awaiting Morganamilo/paru/issues/1104"
        printf "%s: Git clone with initial args: %s\n" "$0" "$(shell-quote "$@")"
        shift
        if [[ $1 == '-s' || $1 == '--shared' ]]; then
            # No space saving advantage with blobless clones and hardlinks
            /bin/git clone "$@"
        else
            # The format has a single %s, so pass only "$0" (an extra argument would repeat the line)
            printf "%s: Cloning blobless.\n" "$0"
            /bin/git clone --filter=blob:none "$@"
        fi
    else
        /bin/git "$@"
    fi
}
source /bin/makepkg "$@"
Thanks for this! I didn't know about partial clones, only about shallow clones. Partial clones mean versions will always be the same.
I updated my personal script to use the --makepkg flag and use a custom makepkg.sh. This reduced my biggest folder, ~/.cache/paru/clone/goldendict-ng-git/, from 252.4 MB to 27.6 MB.
The only issue is that partial clones appear to be very slow. Reinstalling this package took 15:15s, most of which was the cloning time. I suppose when it updates next time, it'll be faster.
Actually this doesn't always work when there are submodules. Installing brunsli with my wrapper I get:
...
error: unable to read sha1 file of test_exports.sh (2dbad7ab17bfaf8e0ed83364dccca7d676cbf072)
error: invalid object 100644 d878d20bf439a86eff55655e8a42e7801fdd6ff5 for 'c/highwayhash.c'
fatal: Unable to checkout '0aaf66bb8a1634ceee4b778df51a652bdf4e1f17' in submodule path 'third_party/highwayhash'
...
There is a first implementation in makepkg (part of pacman). It works when the source is on the default branch and when the src directory is cleaned on every build, so for paru it should be OK. See also the Arch Wiki: https://wiki.archlinux.org/title/makepkg.
merged already:
The call is then:
GITFLAGS="--filter=tree:0" paru
Would you like to try it and make improvement suggestions or pull requests there?
I tried the makepkg patch. It's not quite working for me:
$ GITFLAGS="--filter=tree:0" paru --sync brunsli
...
==> Starting build()...
CMake Error: The source directory "/home/steven/.cache/paru/clone/brunsli/src/brunsli" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
...
$ GITFLAGS="--filter=blob:none" makepkg
...
==> Extracting sources...
-> Creating working copy of brunsli git repo...
Cloning into 'brunsli'...
done.
error: unable to read sha1 file of .gitmodules (01572aa6092e04efa109cae28ea5086ee4ee69ad)
error: unable to read sha1 file of .travis.yml (0a22b9d86c2bf276eb25dc3a5dcf82ebc780d1b7)
error: unable to read sha1 file of BUILD (ccb1cc24b5735cbb6d88aa0e97bc4043605e83c8)
...
Oh, I see. This package contains submodules. I updated the patch by adding a --recurse-submodules flag. Can you please give it another try, maybe also without GITFLAGS? Testing an update of an existing repository followed by another makepkg run would be especially helpful, as that takes a different path in makepkg's git.sh.
That appears to work, but it's not doing a partial clone.
I wonder if there's a way to do a partial clone of brunsli that fetches the required commits and builds, while being fast as well.
@stevenxxiu, which repository is not cloned partially? A submodule? How fast do you expect it to be? I am using it for https://aur.archlinux.org/packages/swift-language, which is huge.
This is a makepkg feature, so it is out of scope for paru.
@soloturn there's an issue with your approach, detailed in:
warning: --filter is ignored in local clones; use file:// instead. #1104
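The warning itself hints at the fix: pass a file:// URL instead of a plain path. A minimal sketch of that rewrite follows. The helper name to_clone_url is hypothetical (it is not part of makepkg or paru), and the sketch does not percent-encode unusual characters in the path.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: turn a local path into a file://
# URL so that git uses its normal transport and honours --filter.
to_clone_url() {
    _src=$1
    _head=${_src%%:*}    # text before the first colon (whole string if none)
    case $_src in
        *://*) printf '%s\n' "$_src"; return ;;         # already a URL
        /*)    printf 'file://%s\n' "$_src"; return ;;  # absolute local path
    esac
    if [ "$_head" != "$_src" ]; then
        case $_head in
            */*) ;;                                     # e.g. ./a:b is a local path
            *)   printf '%s\n' "$_src"; return ;;       # scp-like user@host:path
        esac
    fi
    printf 'file://%s/%s\n' "$PWD" "$_src"              # relative local path
}

# Usage (paths are placeholders):
#   git clone --filter=blob:none "$(to_clone_url /path/to/mirror)" dest
```

Remote URLs and scp-style addresses pass through unchanged, so the helper can be applied to every source without special-casing.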
@HaleTom can you please register at pacman and comment there?
@soloturn I'm not sure what you mean exactly - which URL would I use for registering at pacman?
@stevenxxiu , what repository is not cloned partially? a submodule? how fast you expect it to be? i am using it for https://aur.archlinux.org/packages/swift-language which is huge.
I don't have this installed anymore, but https://aur.archlinux.org/packages/brunsli.
I've updated my work-around script above.
There's been an Arch MR opened for --depth 1, with a hint to replace it with the blobless option.
The Arch Wiki now gives an example of how to use GITFLAGS="--filter=tree:0" makepkg.
Git blobless clones fetch the full commit and tree history but only the blobs (file contents) that are currently required; older blobs are fetched only if they are needed later.
Upsides:
The older and more active the project, the greater the savings.
Background reading: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/
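To illustrate the on-demand behaviour of blobless clones, here is a small sketch that builds a throwaway repository, clones it with --filter=blob:none, and counts which objects are present. The function name demo_blobless and the temporary layout are made up for this example; it assumes a git new enough to support partial clone, and it skips itself if git is not installed.

```shell
#!/bin/sh
# Sketch: a blobless clone has the full commit history but no file contents
# until something (e.g. a checkout) asks for them.
demo_blobless() {
    work=$(mktemp -d) || return 1
    git init -q "$work/src" || return 1
    # The serving side must allow filters, otherwise git falls back to a full clone.
    git -C "$work/src" config uploadpack.allowFilter true
    for i in 1 2 3; do
        printf 'revision %s\n' "$i" > "$work/src/file.txt"
        git -C "$work/src" add file.txt
        git -C "$work/src" -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m "revision $i"
    done
    # file:// rather than a plain path: --filter is ignored in local clones.
    git clone -q --no-checkout --filter=blob:none \
        "file://$work/src" "$work/blobless"
    commits=$(git -C "$work/blobless" rev-list --count HEAD)
    # --missing=print lists absent objects with a leading '?' instead of fetching them.
    missing=$(git -C "$work/blobless" rev-list --objects --missing=print HEAD \
        | grep -c '^?')
    rm -rf "$work"
    printf 'commits present: %s, objects not yet fetched: %s\n' "$commits" "$missing"
}

if command -v git >/dev/null 2>&1; then
    demo_blobless
else
    echo 'git not found; skipping demo' >&2
fi
```

The clone should report every commit as present while the file contents are listed as not yet fetched. Dropping --no-checkout would make git fetch the blobs for HEAD during checkout, which from a local source also needs the source repository to allow on-demand object requests.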
TL;DR:
Solution options:
Perhaps this could be best achieved by adding a --gitcloneflags argument. Some may want to pass --filter=tree:0 for a treeless clone or --depth=1 to get an even smaller shallow clone.
Non-options:
Using --gitflags won't work as this is applied globally to git, but --filter= is only valid for clone.
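A minimal sketch of the difference between clone-only flags and global flags: the extra flags are inserted after the clone subcommand and nowhere else. GIT_CLONE_FLAGS, GIT_BIN and git_wrapper are hypothetical names used only for this illustration; none of them is an existing paru or makepkg setting.

```shell
#!/bin/sh
# Hypothetical variable: a space-separated list of flags meant for `git clone` only.
GIT_CLONE_FLAGS=${GIT_CLONE_FLAGS:-"--filter=blob:none"}

# Run git, adding GIT_CLONE_FLAGS only when the subcommand is `clone`.
# GIT_BIN (also hypothetical) lets the binary be swapped out for a dry run.
git_wrapper() {
    if [ "${1-}" = clone ]; then
        shift
        # Unquoted on purpose: the variable holds a list of flags to be split.
        # shellcheck disable=SC2086
        set -- clone $GIT_CLONE_FLAGS "$@"
    fi
    "${GIT_BIN:-git}" "$@"
}

# Dry run: print the arguments instead of invoking git.
GIT_BIN=echo git_wrapper clone https://example.com/repo.git
GIT_BIN=echo git_wrapper fetch --all
```

With this shape, a fetch or checkout never sees --filter=, which is the problem with applying the flags globally.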
Have you checked the readme and man page for this feature? Yes
Have you checked previous issues for this feature? Yes