Closed DocCyblade closed 9 years ago
Thanks for testing this and discovering the issue Ken. I marked it as both a 'bug' and a 'feature request' as I think it's sort of both! :smile:
I just referenced this at https://github.com/l-arnold/tkl-nomadic-odoo/issues/11 but since we have moved to the Tracker I thought I'd put the comment here as well.
It seems we could go to /turnkey/fab, mkdir git-cache, and run "git clone" there to make a full set of Git repos available to TKLDEV. The app build could then call "git-cache/folder" just as "common" is called in TKLDEV processes.
Would save a lot of time for sure. Would need to "git pull" once in a while on each of the cloned folders (could be a lot) to stay current, is all. Would also need some logic in how TKLDEV addresses that folder set.
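For illustration, the periodic "git pull" over every folder in such a cache could look roughly like the sketch below. The function name refresh_git_cache and the GIT_CACHE variable are made up for this example, and the default path simply follows the suggestion above; none of it is an existing TKLDev convention.

```shell
#!/bin/bash
# Hypothetical helper: refresh every clone under a shared cache directory.
# The default path follows the suggestion above; it is an assumption.
GIT_CACHE="${GIT_CACHE:-/turnkey/fab/git-cache}"

refresh_git_cache() {
    local repo
    for repo in "$GIT_CACHE"/*/; do
        # Skip anything that is not a git working tree.
        [ -d "${repo}.git" ] || continue
        echo "Updating ${repo}"
        # Fast-forward only, so a diverged cache is reported rather than merged.
        git -C "$repo" pull --ff-only --quiet \
            || echo "WARNING: could not update ${repo}" >&2
    done
}
```

Calling refresh_git_cache before a build (or from cron) would keep the clones current without touching the build itself.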
@l-arnold This is more about fixing the way the TKLDev tool chain currently handles GIT in its builds. The current proxy does not seem to cache git traffic like it does when downloading zips or packages. This idea would allow the TKLDev tool chain to do that with GIT repos. It really has nothing to do with Odoo specifically; it just happened that I stumbled upon it while working on the Odoo build.
Thinking about this a little more (and noticed that your gist is still a bare-bones template - nice template BTW :smile:)... So did you mean that Polipo (the caching proxy used in TKLDev) even when configured right won't cache git clone operations?
Or did you mean that setting the system/env vars just wasn't working? IIRC you said that git will honour the http/https proxy settings?
Anyway just had a thought this morning (which might be blown out of the water depending on your answer to above):
git clone accepts the -c switch, which allows you to specify config (e.g. git clone -c <key>=<value>) for an individual git clone operation (see docs). So perhaps an easy way could be to just set up a bash function (like many apps already do with dl to download using the proxy) within the conf script. Something like:
proxy-clone() {
    # pass the caller's arguments (repo URL, target dir) through to git
    /usr/bin/git clone -c http.proxy="$FAB_PROXY" -c https.proxy="$FAB_PROXY" --depth 1 "$@"
}
# clone repo using proxy
proxy-clone https://github.com/user/repo.git
:question:
Also testing/troubleshooting this might be easier if we turn up the verbosity of git with these env vars?:
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1
Good call, I am going to run a few tests.
These should shed some light to see if it is caching anything at all.
I will re-run these tests with the proxy script (hope to work on that tomorrow night).
Sounds awesome! :+1:
@JedMeister OK, so very strange: I can't seem to get anything from GitHub to cache, not even proxy requests via curl. I started looking into the manual pages of the proxy server TKLDev uses and learned way more than I wanted to...
Anyway googling away I stumbled upon this (http://comments.gmane.org/gmane.comp.web.polipo.user/3375)
Then it hit me... Duh! You can't cache https. You can proxy it, but since the data is encrypted end to end, there is no way the proxy can cache it. Even if it did, it would be like a man-in-the-middle attack.
So with many, many downloads switching to HTTPS, this scriptable caching idea might also work for any https download. It could work the same way: first check the cache for the file; if it does not exist, download it into the cache; then copy it from the cache directory to its destination. If the requested URL is already cached, it just does the copy. Could come up with some time stamp file to "refresh" the cache.
So I think my idea is still a good one, and maybe an https download wrapper to cache https content as well.
FWIW I added a skeleton fab-https-dl.sh as well. Maybe one day....
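A rough sketch of what such a download wrapper might look like follows. It is not the contents of fab-https-dl.sh; the function name cached_dl, the cache path, and the one-day refresh age are all assumptions picked for the example, and the cached file's own modification time stands in for the "time stamp file".

```shell
#!/bin/bash
# Hypothetical caching download wrapper; names and defaults are assumptions.
DL_CACHE="${DL_CACHE:-/var/cache/tkldev/dl}"
DL_MAX_AGE_MIN="${DL_MAX_AGE_MIN:-1440}"   # re-download cached files older than this

cached_dl() {
    local url="$1" dest="$2" key cached
    # Derive a stable cache file name from the URL.
    key=$(printf '%s' "$url" | sha256sum | cut -d' ' -f1)
    cached="$DL_CACHE/$key"
    mkdir -p "$DL_CACHE"
    # Download when the file is missing or older than the allowed age.
    if [ ! -f "$cached" ] || [ -n "$(find "$cached" -mmin +"$DL_MAX_AGE_MIN")" ]; then
        # Write to a temp name first so a failed download never poisons the cache.
        curl -fsSL -o "$cached.tmp" "$url" || return 1
        mv "$cached.tmp" "$cached" || return 1
    fi
    # Serve the request from the cache.
    cp "$cached" "$dest"
}
```

Usage would be something like cached_dl https://example.com/file.tar.gz /tmp/file.tar.gz; the second call for the same URL is served from disk. Like the conf-script approach, this only helps if the cache directory is reachable from wherever the script runs.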
Ah-ha! Great point! That didn't even occur to me!
Https caching is still quite do-able; however the proxy would need to do the https connection (to the remote server) and the connection between fab and the proxy would probably need to be http only...
Another way to go would be to essentially create a MITM proxy... Have a look at this for an idea...
Using squid instead of polipo may be another option? (A vaguely relevant looking squid tutorial here)
I return to the "manual" approach. Couldn't there be a /turnkey/fab/git-cache/ folder where Git clones are pulled via normal "git clone" statements?
The build would start with a check to see if there is such a folder with data; if so it is used, if not the repo is git-cloned and then used.
One can go via the shell and do that. The trick then is for TKLDEV to use that data in the build as it does with Common.
Just sayin. About Proxies etc I am not thinking. Only about the Data pull and its availability.
This seems as scriptable as what we are doing except that it is going outside of the "app - FAB" architecture. It is however in the same architecture as the Common files which are also used.
@l-arnold that's actually not a bad idea. A bash function that overrides git so as to make git pull https://github.com/user/repo.git do something like this instead:
CACHE=/var/cache/tkldev/$repo
if [ -d "$CACHE" ]; then
    # already cached: just update it
    git -C "$CACHE" pull origin master
else
    git clone https://github.com/user/repo.git "$CACHE"
fi
# copy the cached repo to where the build wants it
cp -a "$CACHE" "$new_location"
(Note it's not a proper script; just the essence of a script to show my thinking).
That's not a bad idea. I like it. Simple just that little bit would work for now!
A transparent proxy cache would be the goal, however, but that could be a good stopgap.
Happy to test this if it speeds the current process.
My goal is to have an acceptable system for Odoo (I personally think it is generally working right now - or at least was last week).
If we can speed the builds with an implementation like this, lets try it. Like I said, I can TEST!
I just added some code to the Odoo conf script to cache. Works a treat. The first build takes a long time, as it is downloading the whole repo, but after that it will do a remote update and clone from local.
https://github.com/DocCyblade/tkl-odoo/blob/a4861036b2b29ab935cd277502a547b0d8aeb545/conf.d/main#L65
nice. Instructions may be needed.
Well, after a few builds... I realized that the build scripts are running in a CHROOT! So this will not work. You need a proxy, something outside of the chroot. Well, back to the drawing board.
There is the tkl gitlab appliance that could conceivably work. Might even be married to tkldev. But I have not used it.
How about writing a folder next to common with a git clone command then figuring out how common parts are likewise brought in.
I'd say lets get the script working first.
Github should also let mobile edit a post: Ie many things are not perfect.
@l-arnold the conf scripts run in a CHROOT, so you can't reach outside the CHROOT to cache the GIT repo.
What we'd need is some sort of service or proxy on another port, like the web proxy but specific to GIT repos.
Using part of the GitLab code for delivery of the GIT repos could work; the issue is that with large repos like Odoo it could time out before the whole repo is downloaded.
A proxy that supports SSL may be the best option. Just thinking.
Bugger. TBH that never occurred to me, but now you mention it it's obvious... So we'll need to go a different direction. I think that MITM https proxy might be the go...
@JedMeister - I'm trying to figure out a way to clone a git repo specifically; however, caching all https traffic via the proxy port would be ideal. Is there a way to have a proxy server get content over https and then deliver it over http internally, basically translating from one protocol to the other? You could then double proxy: the first proxy passes the requested URL to the second proxy, the second proxy requests the content over HTTPS, decrypts it, and serves it back over HTTP, which the first proxy can then cache.
The other idea, really just for GIT repos, would be to have a fab-gitclone that would create a cache (like the code I created before), since we can't do it via a conf script because that runs in a CHROOT. This would need to be put in another file, config.git.d. It would require retooling the fab process/tools and looks like a large undertaking. However, since a lot of software these days seems to use GitHub for installation, it may be of value.
Another idea: add an additional Makefile that would use code to cache the repo. Not sure how I can make sure this Makefile gets executed in the correct order, so that it runs before the conf scripts. I did some poking around the fab/make files and it looks like this would be as hard as the fab-gitclone idea.
Bottom line: whatever is implemented will need to be baked into the tool chain at some point, and would need the blessing of the core devs. I'd be happy to submit some code; I just want to know which direction this would be implemented in. Maybe your next chat with lirazsiri and/or alonswartz could produce some direction and I could start creating/testing some code.
IIRC I posted a link somewhere on using squid as a MITM https proxy. I didn't read very far but the general gist of it was that traffic was https between proxy and program and proxy and website.
So for a slightly more detailed answer: All DNS queries redirect to localhost where the proxy is waiting. The proxy (falsely) identifies itself as the requested website using a(n invalid) self-signed local cert. It then forwards the request to the remote site (using https) and downloads (and caches) the content. As it can decrypt the data both ways (as it has the keys) caching is no longer a problem...
Something somewhat in line with your other suggestion would be doable for git. To work around the chroot factor, you could mount a cache dir within the container and always use that dir. You would also want a function (or temporary alias for git) which would do a git pull if the dir already exists (rather than failing as git does now). You would then need to copy (rsync?) the git repo to where you want it afterwards...
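As a rough sketch of that approach: assume the host has bind-mounted a cache directory into the chroot before the conf scripts run (the mount command in the comment and both paths are hypothetical, and fab does not currently do this). A function like the following, with the made-up name cached_clone, could then replace plain git clone in a conf script. It uses cp -a rather than rsync only to avoid an extra dependency.

```shell
#!/bin/bash
# Assumption: the host has bind-mounted a cache dir into the chroot, e.g.
#   mount --bind /var/cache/tkldev/git "$CHROOT/var/cache/git"
# Both paths are hypothetical.
GIT_CACHE="${GIT_CACHE:-/var/cache/git}"

cached_clone() {
    local url="$1" dest="$2" name cache
    name=$(basename "$url" .git)
    cache="$GIT_CACHE/$name"
    if [ -d "$cache/.git" ]; then
        # Cache hit: update instead of failing on an existing directory.
        git -C "$cache" pull --ff-only --quiet || return 1
    else
        mkdir -p "$GIT_CACHE"
        git clone --quiet "$url" "$cache" || return 1
    fi
    # Copy the working tree and history to the requested location.
    mkdir -p "$dest"
    cp -a "$cache/." "$dest/"
}
```

For example, cached_clone https://github.com/user/repo.git /opt/repo would only hit the network for new commits after the first build.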
If you think that's the way we should look, I'll dive in and see if I can tweak the tkldev build to use that. I expect this would replace the current one.
Despite the fact that part of me doesn't like it (because it could so easily be abused); I think that the MITM proxy would be the best way and most similar to how it's currently done...
Understandable, it does solve https downloads at the same time not breaking anything already written. I'll give it a spin. Will be a branch off my fork of tkldev build.
This issue should probably be closed as a duplicate, as #467 really would solve this and https traffic as well.
Yep; let's close.
From a conversation found (https://github.com/l-arnold/tkl-nomadic-odoo/issues/11#issuecomment-142125653) there should be a wrapper that would "cache" or mirror GIT repos. The current system uses the built-in proxy, but due to the way GIT uses HTTP as a transfer protocol it does not behave, or cache, as one would expect.
So it would be nice to have a script to proxy/cache the git clone command, so that it downloads the repo only once and pulls locally thereafter, keeping the downloaded repo up to date via some kind of time-stamped file so it's not fetching every time.
I have started an attempt (not even working at the moment just a skeleton template bash script right now) but will hack away at this on the side when I have nothing to do! :open_mouth: Check it out in my https://gist.github.com/DocCyblade
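To make the request concrete, here is a minimal sketch of the mirror-plus-timestamp behaviour described above. It is not the gist's script; the function name mirror_clone, the directory layout, the .last-fetch stamp file, and the one-hour default are all assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical sketch; directory layout, names and defaults are assumptions.
MIRROR_DIR="${MIRROR_DIR:-/var/cache/tkldev/mirrors}"
MIRROR_MAX_AGE_MIN="${MIRROR_MAX_AGE_MIN:-60}"   # minimum minutes between fetches

mirror_clone() {
    local url="$1" dest="$2" mirror stamp
    mirror="$MIRROR_DIR/$(basename "$url" .git).git"
    stamp="$mirror/.last-fetch"
    if [ ! -d "$mirror" ]; then
        # First use: download the whole repository once.
        mkdir -p "$MIRROR_DIR"
        git clone --quiet --mirror "$url" "$mirror" || return 1
        touch "$stamp"
    elif [ ! -f "$stamp" ] || [ -n "$(find "$stamp" -mmin +"$MIRROR_MAX_AGE_MIN")" ]; then
        # Stamp missing or too old: refresh the mirror from the remote.
        git -C "$mirror" fetch --quiet --prune origin || return 1
        touch "$stamp"
    fi
    # Always clone from the local mirror, never from the network.
    git clone --quiet "$mirror" "$dest"
}
```

A conf script would call mirror_clone https://github.com/user/repo.git /opt/repo in place of git clone. Builds run within the same hour reuse the mirror without contacting the remote at all.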