So in our case the config parameter would be:
OnlyCacheWhenURLContains "github.com"
or similar.
Hi!
Good idea! I think I'll add two configuration options, one for including in the cache as you described and one for excluding. The logic would be that to actually cache a repository during the initial clone, the include pattern must match whereas the exclude pattern must not match. Checks for operations other than `clone` and `submodule update` are not required, since gitcache checks the remote of the checked-out repository and updates the cache only if the remote is a folder within the `GITCACHE_DIR`.
Default configuration would be to include all and exclude nothing, so something like
[UrlPatterns]
include = .*
exclude =
So in your case it would be something like
[UrlPatterns]
include = .*github\.com\/.*
exclude =
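A rough sketch of the matching logic described above, in Python. This is not the actual gitcache code: `should_cache` and `load_patterns` are made-up names, anchoring the patterns with `re.match` is an assumption, and so is treating an empty exclude value as "exclude nothing".

```python
import configparser
import re

# Hypothetical config text, using the section and option names proposed above.
EXAMPLE_CONFIG = r"""
[UrlPatterns]
include = .*github\.com\/.*
exclude =
"""


def load_patterns(text):
    """Read the include/exclude patterns from the [UrlPatterns] section."""
    # interpolation=None keeps characters like '%' in patterns untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["UrlPatterns"]
    return section.get("include", ".*"), section.get("exclude", "")


def should_cache(url, include, exclude):
    """Cache only if include matches and exclude does not match."""
    if not re.match(include, url):
        return False
    # Assumption: an empty exclude pattern means "exclude nothing".
    if exclude and re.match(exclude, url):
        return False
    return True


include, exclude = load_patterns(EXAMPLE_CONFIG)
print(should_cache("https://github.com/seeraven/gitcache", include, exclude))   # True
print(should_cache("https://gitlab.internal/team/repo.git", include, exclude))  # False
```

With the default values (`include = .*` and an empty `exclude`), every URL passes the check, so everything would be cached as before.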
The question is of course whether a full regex should be used or whether simpler shell-wildcard support suffices. Regex would allow out-of-the-box matching of multiple different patterns, but is much harder to get right the first time. So an alternative would be to use the Python fnmatch module, which supports '*' to match everything and '?' to match a single character. The pattern `.*github\.com\/.*` would become simply `*github.com/*`. To support multiple patterns, a `:` is probably not a good choice as it is usually part of the URL, but `;` should be quite uncommon in URLs, so multiple patterns could be specified as something like `*github.com/*;*external.com/*;https://a.single.server.com/and/this/repo`. What do you think?
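To make the wildcard alternative concrete, here is a minimal sketch of matching a URL against several `;`-separated shell-style patterns with the standard `fnmatch` module. The helper name `matches_any` is hypothetical, and using `fnmatchcase` (always case-sensitive, independent of the operating system) rather than `fnmatch` is a choice made only for this example.

```python
from fnmatch import fnmatchcase

# Example pattern list: wildcards for two hosts plus one exact repository URL.
INCLUDE = "*github.com/*;*external.com/*;https://a.single.server.com/and/this/repo"


def matches_any(url, patterns):
    """Return True if the URL matches at least one ';'-separated pattern."""
    return any(
        fnmatchcase(url, pattern.strip())
        for pattern in patterns.split(";")
        if pattern.strip()  # skip empty entries, e.g. from a trailing ';'
    )


print(matches_any("https://github.com/seeraven/gitcache", INCLUDE))
print(matches_any("https://a.single.server.com/and/this/repo", INCLUDE))
print(matches_any("https://gitlab.internal/team/repo.git", INCLUDE))
```

Note that in `fnmatch` the `*` also matches `/`, so `*github.com/*` covers every repository path on that host, and an empty pattern string matches nothing in this sketch.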
I don't see a problem with regex as long as one gives some examples of how to match github and maybe a certain project on github as example 2.
Another option would be an array, like:
[UrlPatterns]
include = [".*github\.com\/.*", ".*bitbucket\.com\/.*"]
exclude = [".*"]
But you could go with 4 parameters, 2 easy and 2 regex:
[UrlPatterns]
include = ["github.com"]
exclude = [".*"]
include_regex = []
exclude_regex = []
If arrays are empty, don't do anything?
I've found a little time today to work on this feature (commit https://github.com/seeraven/gitcache/commit/33d4e4a359c0cbf54747e08cf3b1957067be4009). It is not finished yet. At the moment, only the `clone` command is handled, and the functional tests must be extended too. But you should be able to see the gist. ;-)
Hi! I've just merged the changes and created a new release. It would be cool if you can test it in your specific scenario and give feedback on how it is going.
Yes, will do, I'll do #8 first and then look at this.
I've confirmed that this works well too, closing! Awesome!
We have a setup with multiple submodules (recursive) on an internal gitlab server. Then those in turn load in submodules from github, usually libraries. Those don't have to be updated very often while the local ones have to be updated on every build.
So, if you could add a string or regex as a config parameter that must match the URL for the cache to actually do its caching, and pass everything else through, that would be awesome.
By default this parameter could be "", aka don't care.