Local cache for git repositories to speed up working with large repositories and multiple clones.
The basic idea of gitcache is to use a local bare mirror that is updated when needed and used as the source repository for multiple local repositories.
git command for easy integration.
gitcache command.
gitcache is designed to be used as a wrapper to git, so the following shows how gitcache translates the git commands for the individual operations.
When the user issues a
git clone https://github.com/seeraven/gitcache.git
for the first time, the repository https://github.com/seeraven/gitcache.git is
cloned into a bare mirror $GITCACHE_DIR/mirrors/github.com/seeraven/gitcache/git
and then the git command is rewritten to
git clone $GITCACHE_DIR/mirrors/github.com/seeraven/gitcache/git gitcache
to create the clone. In addition, the push URL of the clone is adjusted to the upstream URL.
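The translation above can be sketched in a few lines of Python. This is only an illustration derived from the single example in the text: the helper names `mirror_path` and `rewrite_clone` are hypothetical, and the rule "drop the scheme, strip a trailing `.git`, append `git`" is an assumption that happens to reproduce the example, not a description of the actual implementation.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse


def mirror_path(url: str, cache_dir: str) -> str:
    """Map an upstream URL to a mirror directory (illustrative only)."""
    parsed = urlparse(url)
    # Assumption: the scheme is dropped and a trailing ".git" is stripped,
    # which matches the one example given in the text.
    repo = parsed.path.lstrip("/")
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return str(PurePosixPath(cache_dir, "mirrors", parsed.netloc, repo, "git"))


def rewrite_clone(url: str, cache_dir: str) -> List[str]:
    """Build the rewritten clone command that uses the mirror as source."""
    mirror = mirror_path(url, cache_dir)
    # The clone target keeps the repository name, not the trailing "git".
    target = PurePosixPath(mirror).parent.name
    return ["git", "clone", mirror, target]
```

For the URL used above and a cache directory of `/tmp/gc`, `rewrite_clone` yields `git clone /tmp/gc/mirrors/github.com/seeraven/gitcache/git gitcache`.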
Whenever the user issues another git clone command of that repository, the mirror is updated (if the update strategy permits it) and the local clone is created as before.
Whenever the user performs a git pull or git fetch on that local clone, gitcache checks whether the repository is handled by gitcache (that is, the pull URL points to the mirror and the push URL points to the upstream URL). If it is, gitcache updates the mirror first (according to the update strategy) and executes the original command afterwards.
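The check described above could look roughly like the following sketch. It is an assumption-laden simplification: `is_managed_clone` is a hypothetical helper, and "push URL points to the upstream URL" is approximated as "push URL differs from the pull URL", because the text does not say how the upstream URL is recovered.

```python
from pathlib import PurePosixPath


def is_managed_clone(fetch_url: str, push_url: str, cache_dir: str) -> bool:
    """Return True if a clone looks like it is handled by gitcache."""
    mirrors = PurePosixPath(cache_dir, "mirrors")
    # The pull (fetch) URL must point somewhere below the mirrors directory.
    points_to_mirror = mirrors in PurePosixPath(fetch_url).parents
    # Assumption: the push URL was redirected away from the mirror.
    return points_to_mirror and push_url != fetch_url
```

The two URLs of an existing clone can be read with `git remote get-url origin` and `git remote get-url --push origin`.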
In addition to the git repositories, gitcache supports git-lfs as well and updates of the mirror include updates of the git-lfs part. You can configure gitcache to either use a global git-lfs storage directory or to use per mirror storage directories (the default).
All update operations on a mirror use a lock to ensure that only one process modifies the mirror at a time. This is crucial because simultaneous clones would otherwise easily lead to inconsistent behaviour and race conditions.
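To illustrate the idea of serialising mirror updates, here is a minimal sketch of a lock built on an exclusively created lock file. The text does not say which locking mechanism gitcache actually uses, so the lock-file approach, the helper name `mirror_lock` and its parameters are assumptions made for illustration only.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def mirror_lock(lock_path: str, check_interval: float = 2.0, timeout: float = 3600.0):
    """Hold an exclusive lock file while the body of the with-block runs."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another process holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {lock_path}")
            time.sleep(check_interval)  # check again later
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A mirror update would then run inside `with mirror_lock(path): ...`, so that a second process waits (or gives up after the timeout) instead of modifying the mirror concurrently.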
The mirror update strategy is controlled using the so called update interval. It gives the time between two updates of a mirror in seconds and allows you to save network bandwidth by avoiding multiple updates at almost the same time.
In addition, updates from the git pull and git fetch commands can be completely disabled by setting the update interval to a negative value. This means that updates of the mirrors are only performed if explicitly requested by a git update-mirrors command. This can be useful on CI servers to control network usage even further.
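The update strategy for git pull and git fetch can be summarised as a small decision function. This is a sketch based only on the description above; `mirror_needs_update` and its parameters are hypothetical names, and the behaviour of git clone with a negative interval is not covered because the text does not specify it.

```python
def mirror_needs_update(seconds_since_last_update: float,
                        update_interval: float,
                        explicit_request: bool = False) -> bool:
    """Decide whether a pull/fetch should refresh the mirror first."""
    if explicit_request:
        # git update-mirrors ignores the update interval.
        return True
    if update_interval < 0:
        # A negative interval disables implicit updates completely.
        return False
    # Otherwise update only if the last update is old enough.
    return seconds_since_last_update >= update_interval
```

With an interval of 0 every pull or fetch refreshes the mirror; with an interval of 600 a second pull one minute later reuses the mirror as it is.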
On Linux, gitcache is distributed as a single executable packaged using PyInstaller. All you have to do is download the latest executable and copy it to a location of your choice, for example ~/bin:
wget https://github.com/seeraven/gitcache/releases/download/v1.0.18/gitcache_v1.0.18_Ubuntu22.04_amd64
mv gitcache_v1.0.18_Ubuntu22.04_amd64 ~/bin/gitcache
chmod +x ~/bin/gitcache
gitcache can be used as a stand-alone command, but it is much easier to use it as a wrapper to git. All you have to do is create a symlink and adjust the PATH variable so that the wrapper is found before the real git command:
ln -s gitcache ~/bin/git
export PATH=$HOME/bin:$PATH
The export statement should be added to your ~/.bashrc file to set it permanently.
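To verify that the wrapper is really found before the real git command, one could resolve `git` on the search path and check where it points. The following is a small sketch under the assumption that the wrapper executable is named `gitcache`, as in the commands above; `git_is_wrapped` is a hypothetical helper.

```python
import os
import shutil
from typing import Optional


def git_is_wrapped(path_env: Optional[str] = None) -> bool:
    """Return True if the first 'git' on the search path resolves to gitcache."""
    found = shutil.which("git", path=path_env)
    if found is None:
        return False
    # Follow the symlink and compare the name of the final executable.
    return os.path.basename(os.path.realpath(found)) == "gitcache"
```

Called without arguments, the function uses the current PATH, so it reflects exactly what a shell in the same environment would execute.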
Download the latest executable for Windows from the release page https://github.com/seeraven/gitcache/releases. Rename the executable to gitcache.exe and put it into a directory in your PATH, e.g., into C:\Windows. Then create a symlink named git.exe that points to gitcache.exe by opening a console and executing:
cd C:\Windows
mklink git.exe gitcache.exe
Please note that the directory you are putting the symlink into should be stated before the real git command directory in your PATH variable!
A single PyInstaller executable has a huge startup delay on macOS, therefore gitcache is distributed as a tarball (*.tgz file) for this platform. Download the archive and extract it at your desired target location (the archive contains a subfolder):
cd /my/target/destination
tar xfz gitcache_v1.0.18_Darwin_arm64.tgz
ls gitcache_v1.0.18_Darwin_arm64
To use the gitcache command, the final installation directory should be put into your PATH variable. To use it as a wrapper to the git command, you have to create the symlink and adjust the PATH variable so that the wrapper is found before the real git command, as described in the section on the installation on Linux.
gitcache stores all files in the directory ~/.gitcache. This base directory can be changed by setting the GITCACHE_DIR environment variable.
When the GITCACHE_DIR directory is created, the default configuration file $GITCACHE_DIR/config is created and populated with the default values.
The current configuration can be shown by calling
gitcache
For every item, you'll see a corresponding environment variable that can be used to overwrite the setting of the configuration file.
The configuration options are:
Category | Config Item | Default Value | Environment Variable |
---|---|---|---|
System | realgit | /usr/bin/git | GITCACHE_REAL_GIT |
MirrorHandling | updateinterval | 0 s | GITCACHE_UPDATE_INTERVAL |
MirrorHandling | cleanupafter | 14 days | GITCACHE_CLEANUP_AFTER |
Command | checkinterval | 2 s | GITCACHE_COMMAND_CHECK_INTERVAL |
Command | locktimeout | 1 h | GITCACHE_COMMAND_LOCK_TIMEOUT |
Command | warniflockedfor | 10 s | GITCACHE_COMMAND_WARN_IF_LOCKED_FOR |
GC | commandtimeout | 1 h | GITCACHE_GC_COMMAND_TIMEOUT |
GC | outputtimeout | 5 m | GITCACHE_GC_OUTPUT_TIMEOUT |
GC | retries | 3 | GITCACHE_GC_RETRIES |
LFS | commandtimeout | 1 h | GITCACHE_LFS_COMMAND_TIMEOUT |
LFS | outputtimeout | 5 m | GITCACHE_LFS_OUTPUT_TIMEOUT |
LFS | permirrorstorage | True | GITCACHE_LFS_PER_MIRROR_STORAGE |
LFS | retries | 3 | GITCACHE_LFS_RETRIES |
Clone | commandtimeout | 1 h | GITCACHE_CLONE_COMMAND_TIMEOUT |
Clone | outputtimeout | 5 m | GITCACHE_CLONE_OUTPUT_TIMEOUT |
Clone | retries | 3 | GITCACHE_CLONE_RETRIES |
Update | commandtimeout | 1 h | GITCACHE_UPDATE_COMMAND_TIMEOUT |
Update | outputtimeout | 5 m | GITCACHE_UPDATE_OUTPUT_TIMEOUT |
Update | retries | 3 | GITCACHE_UPDATE_RETRIES |
UrlPatterns | includeregex | .* | GITCACHE_URLPATTERNS_INCLUDE_REGEX |
UrlPatterns | excluderegex | (empty) | GITCACHE_URLPATTERNS_EXCLUDE_REGEX |
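The precedence between the configuration file and the environment variables can be sketched as follows. Note the assumptions: the text does not state the file format, so an INI-style file with one section per category is assumed here, and `get_setting` is a hypothetical helper rather than part of gitcache.

```python
import configparser
import os
from typing import Mapping, Optional


def get_setting(config_text: str, section: str, item: str, env_var: str,
                environ: Optional[Mapping[str, str]] = None) -> str:
    """Return a setting, letting the environment variable win over the file."""
    environ = os.environ if environ is None else environ
    if env_var in environ:
        return environ[env_var]
    # Assumption: the config file is INI-style, e.g. "[MirrorHandling]".
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get(section, item)
```

For example, with `updateinterval = 0 s` in the file and `GITCACHE_UPDATE_INTERVAL=10 minutes` in the environment, the function returns `10 minutes`.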
Configuration items that expect a time support the following values:
w, wks or weeks to give the time in weeks.
d, dys or days to give the time in days.
h, hrs or hours to give the time in hours.
m, mins or minutes to give the time in minutes.
s, secs or seconds to give the time in seconds.
Example: 1.5 weeks
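A parser for these time values might look like the sketch below. The unit names are taken from the list above; everything else is an assumption made for illustration: `parse_duration` is a hypothetical helper, a number without a unit is treated as seconds, and a leading minus sign is accepted so that negative intervals can be expressed.

```python
import re

_UNIT_SECONDS = {
    "w": 604800, "wks": 604800, "weeks": 604800,
    "d": 86400, "dys": 86400, "days": 86400,
    "h": 3600, "hrs": 3600, "hours": 3600,
    "m": 60, "mins": 60, "minutes": 60,
    "s": 1, "secs": 1, "seconds": 1,
}

# A number (optionally negative or fractional) followed by an optional unit.
_PATTERN = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*([a-z]*)\s*")


def parse_duration(text: str) -> float:
    """Convert a value such as '1.5 weeks' into seconds."""
    match = _PATTERN.fullmatch(text.lower())
    if match is None:
        raise ValueError(f"not a valid time value: {text!r}")
    number, unit = match.groups()
    if unit == "":
        # Assumption: a bare number is interpreted as seconds.
        return float(number)
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return float(number) * _UNIT_SECONDS[unit]
```

With this sketch, the default `14 days` of MirrorHandling/cleanupafter corresponds to 1209600 seconds and `1.5 weeks` to 907200 seconds.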
The configuration options are described below:
System/realgit (GITCACHE_REAL_GIT) specifies the real git command. This is usually /usr/bin/git but can be changed as you like.
MirrorHandling/updateinterval (GITCACHE_UPDATE_INTERVAL) gives the minimum time between two mirror updates. If this is set to 0, the mirror is always updated when needed. If you set this to something like 10 minutes, the mirror is updated only if the last update was at least 10 minutes ago.
MirrorHandling/cleanupafter (GITCACHE_CLEANUP_AFTER) specifies the age after which a mirror is considered outdated. This is relevant for the gitcache -c or git cleanup command, which removes all outdated mirrors. The age is measured from the last update of the mirror.
The Command/checkinterval (GITCACHE_COMMAND_CHECK_INTERVAL) option specifies at what time interval a locked mirror is checked again. The option Command/locktimeout (GITCACHE_COMMAND_LOCK_TIMEOUT) specifies the total timeout after which to give up. Finally, Command/warniflockedfor (GITCACHE_COMMAND_WARN_IF_LOCKED_FOR) gives the time after which the user is warned when the mirror is locked.
git commands initiated by gitcache that might take a long time are monitored to detect stalled executions. The monitoring is implemented by looking at the stdout/stderr output, and the command is assumed to be stalled when no output was received within a certain time. This timeout is given by the configuration options GC/outputtimeout (GITCACHE_GC_OUTPUT_TIMEOUT), LFS/outputtimeout (GITCACHE_LFS_OUTPUT_TIMEOUT), Clone/outputtimeout (GITCACHE_CLONE_OUTPUT_TIMEOUT) and Update/outputtimeout (GITCACHE_UPDATE_OUTPUT_TIMEOUT) for the corresponding git operations garbage collection, lfs file retrieval, clone and update.
In addition, a total timeout for each of these groups is given by the options GC/commandtimeout (GITCACHE_GC_COMMAND_TIMEOUT), LFS/commandtimeout (GITCACHE_LFS_COMMAND_TIMEOUT), Clone/commandtimeout (GITCACHE_CLONE_COMMAND_TIMEOUT) and Update/commandtimeout (GITCACHE_UPDATE_COMMAND_TIMEOUT).
If an operation fails, it is retried before finally giving up. This is configured by the GC/retries (GITCACHE_GC_RETRIES), LFS/retries (GITCACHE_LFS_RETRIES), Clone/retries (GITCACHE_CLONE_RETRIES) and Update/retries (GITCACHE_UPDATE_RETRIES) options.
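The interplay of output timeout, command timeout and retries can be illustrated with the following sketch. It is not gitcache's implementation: `run_monitored` and `run_with_retries` are hypothetical helpers, stderr is merged into stdout for simplicity, and the retries value is interpreted as the total number of attempts, which the text does not confirm.

```python
import queue
import subprocess
import sys
import threading
import time


def run_monitored(command, output_timeout: float, command_timeout: float) -> bool:
    """Run a command; kill it if it is silent too long or runs too long."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    chunks: "queue.Queue[bytes]" = queue.Queue()

    def reader() -> None:
        while True:
            chunk = proc.stdout.read1(4096)  # b"" signals end of output
            chunks.put(chunk)
            if not chunk:
                break

    threading.Thread(target=reader, daemon=True).start()
    start = time.monotonic()
    while True:
        remaining = command_timeout - (time.monotonic() - start)
        try:
            if remaining <= 0:
                raise queue.Empty  # total command timeout exceeded
            chunk = chunks.get(timeout=min(output_timeout, remaining))
        except queue.Empty:
            proc.kill()  # stalled or too slow
            proc.wait()
            return False
        if not chunk:
            return proc.wait() == 0


def run_with_retries(command, retries: int = 3, output_timeout: float = 300.0,
                     command_timeout: float = 3600.0) -> bool:
    """Try the command up to `retries` times (assumed: total attempts)."""
    return any(run_monitored(command, output_timeout, command_timeout)
               for _ in range(retries))
```

The defaults mirror the table above (3 retries, 5 minutes output timeout, 1 hour command timeout). A call such as `run_with_retries([sys.executable, "-c", "print('ok')"])` succeeds on the first attempt.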
LFS/permirrorstorage (GITCACHE_LFS_PER_MIRROR_STORAGE) is a boolean flag that determines whether each mirror will have its own lfs storage directory (True) or whether a shared directory is used (False).
UrlPatterns/includeregex (GITCACHE_URLPATTERNS_INCLUDE_REGEX) and UrlPatterns/excluderegex (GITCACHE_URLPATTERNS_EXCLUDE_REGEX) are used to identify repositories to mirror. The patterns are checked against the remote URL of a repository, and it is only mirrored if the include pattern matches and the exclude pattern does not. If the exclude pattern is empty, it is internally converted into a regex that matches nothing (an empty regex would match every URL and would therefore exclude all URLs).
The gitcache command provides the following options:
-h, --help to show the command help.
-c, --cleanup to remove all outdated mirrors.
-u, --update-all to update all mirrors ignoring the update interval.
-d MIRROR, --delete MIRROR to delete a mirror identified by its upstream URL or its path in the cache. This option can be specified multiple times.
-s, --show-statistics to show the statistics of gitcache.
-z, --zero-statistics to clear the statistics.
Without any options the gitcache command shows the current configuration. When called as gitcache git ... it wraps the given git command as described in the next section.
The following git commands are handled specially. All other commands are forwarded to the real git command.
git cleanup to remove all outdated mirrors.
git update-mirrors to update all mirrors ignoring the update interval.
git delete-mirror to delete a mirror identified by its upstream URL or its path in the cache.
git ls-remote to update the mirror and use it as the remote source of the ls-remote command.
git checkout to perform an lfs fetch for specified refs.
git clone to create or update the mirror and clone from the mirror.
git lfs fetch to fetch the lfs handled files for the mirror.
git lfs pull to fetch the lfs handled files for the mirror.
git pull to update the mirror before updating the clone.
git fetch to update the mirror before updating the clone.
git submodule init to allow correct initialization of the submodules.
git submodule update to call gitcache for every submodule.
For debugging, set the environment variable GITCACHE_LOGLEVEL to Debug:
GITCACHE_LOGLEVEL=Debug gitcache
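The split between specially handled git commands and commands that are simply forwarded can be pictured with the following sketch. The command names come from the list of specially handled commands above; the function `is_handled_specially` is hypothetical, and its argument parsing is deliberately naive (it just skips arguments starting with a dash, so global options that take a separate value are not handled).

```python
SPECIAL_COMMANDS = (
    ("cleanup",), ("update-mirrors",), ("delete-mirror",), ("ls-remote",),
    ("checkout",), ("clone",), ("lfs", "fetch"), ("lfs", "pull"),
    ("pull",), ("fetch",), ("submodule", "init"), ("submodule", "update"),
)


def is_handled_specially(git_args) -> bool:
    """Return True if the arguments start with a specially handled command."""
    # Naive assumption: everything starting with "-" is an option flag.
    words = tuple(arg for arg in git_args if not arg.startswith("-"))
    return any(words[:len(cmd)] == cmd for cmd in SPECIAL_COMMANDS)
```

In this picture, `git lfs pull` is intercepted while `git lfs install` or `git status` would be passed to the real git command unchanged.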
The main idea behind gitcache is to perform the caching of the git repositories only for the current user. This means that you should not share the mirrored git repositories with other users, as you do not know if another user would have the permission to access the remote repository.
Releases are now automatically built if a new tag v<major>.<minor>.<revision>
is pushed to the repository. This changes the release process a little bit:
Ensure the upcoming release is fully tested. A look at the commits on GitHub should be enough.
Modify the CHANGELOG.md file and insert the new version number.
Commit the modified CHANGELOG.md file and tag the commit with the new version number.
As soon as the new tag is pushed to GitHub, the release is built. When it is finished, it is available as a draft on the releases page.
As GitHub does not (yet) support Ubuntu 24.04, that release must be built manually by calling:
make releases/gitcache_v1.0.18_Ubuntu24.04_x86_64.venv.ubuntu24.04
Now edit the release draft, insert the changes from the CHANGELOG.md file and upload the Ubuntu 24.04 binary. Then the release can be saved as a regular release.
Now prepare the next version. Edit the files Makefile, pyproject.toml, src/git_cache/git_cache_command.py and doc/source/installation.rst and replace the current version number with the next one. In the command below, <major>.<minor>.<revision> stands for the next version number:
sed -i 's/1\.0\.18/<major>.<minor>.<revision>/g' Makefile pyproject.toml src/git_cache/git_cache_command.py doc/source/installation.rst