fgrehm / vagrant-cachier

Caffeine reducer
http://fgrehm.viewdocs.io/vagrant-cachier
MIT License

Support for caching python modules #80

Closed scottmuc closed 9 years ago

scottmuc commented 10 years ago

I'm a bit of a newb when it comes to python packaging so I don't know the details. Like rubygems, pip is used to install packages from remote repositories.

fgrehm commented 10 years ago

There has been an issue related to this in the past that stayed open for a long time without resolution.

If someone is able to show us how this could possibly work I'd be up for having a look at it.

@scottmuc are you up for taking a stab at it?

scottmuc commented 10 years ago

@fgrehm, thanks for the background. I would love to take a stab at it. It involves two of my favourite things: Vagrant and making things faster! What kind of time box are we looking at? I've been wanting to get my hands dirty with Vagrant internals.

tmatilai commented 10 years ago

@scottmuc I :heart: your attitude! :+1:

scottmuc commented 10 years ago

There's just way too many good vibes in this project! :-)

fgrehm commented 10 years ago

@scottmuc That's the kind of attitude we need :heart: :smiley:

When you say "time box", do you mean how long do I think it should take to implement?

scottmuc commented 10 years ago

Hehe, I meant time box as: the amount of time you're willing to leave this issue open.

fgrehm commented 10 years ago

Oh, for as long as you are working on it and you pong me back when I ping you, if I end up doing some more house cleaning in the future :-)

If you need any help just let me know!

scottmuc commented 10 years ago

Sounds good, looking forward to diving in.

Cheers!

egasimus commented 10 years ago

Subscribed. I'm looking forward to seeing this feature.

Some background if anyone's interested in a use case. In my current setup for developing Django apps, I've got most of my system dependencies (any packages installed via OS package manager) baked into a reusable base image using Packer; but I install any Python packages in my project-specific provisioning scripts -- and being able to cache those would be really helpful. Not having to install the OS packages every time halved my VM setup time; not having to download Python packages could turn what originally used to take 10+ minutes into a quick one-minute job, and allow me to continue working when I'm on the road, or otherwise don't have a reliable Internet connection.

scottmuc commented 10 years ago

@egasimus, glad to hear you're interested. I'll definitely need your help to test whether or not this actually works. FYI, I won't be working on this until after Feb 24th. Heading to Tanzania to climb Kilimanjaro in a couple days :+1:

thedrow commented 10 years ago

@scottmuc Have fun! Try Mount Meru as well while you're there. It's a tougher climb but the sunrise is worth it.

patcon commented 10 years ago

...Kilimanjaro? haha oh, you ThoughtWorks characters... :smile_cat:

patcon commented 10 years ago

Seems like this is the ticket: https://github.com/pypa/pip/pull/1572

http://pip.readthedocs.org/en/latest/user_guide.html#config-file

# ~/.pip/pip.conf
[install]
wheel-cache = yes
wheel-cache-dir = /path/to/narnia

thedrow commented 10 years ago

@patcon Exactly what I've been waiting for

scottmuc commented 10 years ago

@thedrow Meru would have been awesome. Kili is definitely a crowded trek. Next time I guess! (here's me at the summit https://twitter.com/ScottMuc/status/436518524580679680)

@patcon thanks for the info, that'll definitely help with this. Hoping to get cracking on this soon.

patcon commented 10 years ago

Hm. We can actually get it working before that PR gets released:

# ~/.pip/pip.conf
[wheel]
wheel-dir = /var/cache/pip/wheels

[install]
find-links = /var/cache/pip/wheels

And then run this the first time:

mkdir -p /var/cache/pip/wheels
pip install wheel
pip wheel -r requirements.txt

And this like normal afterward:

# `--no-index` is optional but speeds install up a little bit,
# since it won't phone remote at all
pip install [--no-index] -r requirements.txt

patcon commented 10 years ago

I'll work on this later

fgrehm commented 10 years ago

It seems to me that no one is actively working on this and now that we have the generic bucket in place people can always fall back to that.

If someone is able to put up a PR that provides the automatic detection and setup for this I'll be more than happy to bring it in :-)

maestrofjp commented 10 years ago

We can accomplish this without a user having to create a pip.conf file. PIP lets you set configuration as environment variables using export.

You can open and assign this issue to me if you want (<- Python dev). I'm not a ruby dev, but as long as somebody can show me how to do an export command at the command line -- I think I can do a PR for you. (I'm not planning on supporting Windows guests right out of the gate)

Right now I'm doing all of this using the generic bucket and pointing pip at the right /var/cache/pip path on the command line.
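For illustration, the environment-variable approach could look roughly like the sketch below. It relies on pip's convention of reading `PIP_<OPTION>` variables as defaults for the matching `--<option>` flags, so `PIP_WHEEL_DIR` and `PIP_FIND_LINKS` stand in for the `wheel-dir` and `find-links` entries from the pip.conf posted earlier. The `wheel_cache` name and the exact path are assumptions for the example, not something vagrant-cachier sets up by itself.

```shell
# Hypothetical cache location; with vagrant-cachier this would sit inside
# a generic bucket such as /var/cache/pip.
wheel_cache="/var/cache/pip/wheels"

# pip reads PIP_<OPTION> variables as defaults for the matching
# --<option> flags, so these replace the pip.conf entries.
export PIP_WHEEL_DIR="$wheel_cache"    # where `pip wheel` writes built wheels
export PIP_FIND_LINKS="$wheel_cache"   # where `pip install` looks for them

echo "wheel dir:  $PIP_WHEEL_DIR"
echo "find links: $PIP_FIND_LINKS"
```

With these exported in the provisioning shell, `pip wheel -r requirements.txt` followed by `pip install -r requirements.txt` should behave like the pip.conf variant, without writing a config file into the guest.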

fgrehm commented 10 years ago

@maestrofjp thanks for reaching out. What's the env var that needs to be set? If an env var is all we need I might be able to do that today

fgrehm commented 10 years ago

@maestrofjp is that env var you mentioned PIP_DOWNLOAD_CACHE?

maestrofjp commented 10 years ago

I used the command line switches --download /path/to/cache/ and --find-links /path/to/cache/ however I saw that env vars are now supported.

Using --download downloads the packages but doesn't install them -- you do that with --find-links later. However, --download-cache appears to download and install them.

Let me swap that over in my config and see if that works as expected.

maestrofjp commented 10 years ago

Right now I'm running into issues with pip and using the --download-cache option:

https://github.com/pypa/pip/issues/1968

It seems that the file names are too long to handle via virtualization. It doesn't seem to break when using --download and then doing a find-links command like I was doing originally.

So for now that would be the workaround. Let me set those up as env vars and get you that information. I'd be more than willing to write the doco around it too.

maestrofjp commented 10 years ago

In the next release of pip, a new option cache-dir will be introduced. I have downloaded the latest version of PIP, and the new cache-dir appears to break when package names get too long.

fgrehm commented 10 years ago

Have you tried just setting PIP_DOWNLOAD_CACHE? Also, do you have a pointer to some open source python app I could use to experiment with this? Tks in advance!

maestrofjp commented 10 years ago

The issue is that pointing --cache-dir (in the bleeding-edge version of PIP, where it replaces download-cache) at the symlinked / shared dir /var/cache/pip breaks.

maestrofjp commented 10 years ago

@dstufft at the PIP project has asked for a repro case for this issue. No point in trying to implement download-cache and find-links since they will be replaced with cache-dir in the next release of PIP (1.6). I'll keep you updated.

maestrofjp commented 10 years ago

I have a test repo showing that pip hangs when using a shared dir on Vagrant (i.e. /var/cache/pip) through a vagrant-cachier generic bucket:

https://github.com/pypa/pip/issues/1969

maestrofjp commented 10 years ago

If pypa/pip#1969 gets resolved, then the correct env var to set for PIP 1.6+ (which is the next stable release) is:

PIP_CACHE_DIR

However right now, pypa/pip#1969 is a blocker.
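As a rough sketch of what setting that variable would look like once the blocker is gone: `PIP_CACHE_DIR` is the environment form of pip's `--cache-dir` option, and the `bucket_dir` name and path below are assumptions that simply reuse the generic bucket location mentioned in this thread.

```shell
# Assumed mount point of the vagrant-cachier generic bucket.
bucket_dir="/var/cache/pip"

# PIP_CACHE_DIR is the environment form of pip's --cache-dir option.
export PIP_CACHE_DIR="$bucket_dir"

echo "pip cache dir: $PIP_CACHE_DIR"
```

Any `pip install` run from the same shell would then read and write its cache under the shared directory.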

fgrehm commented 10 years ago

Oh well, at least we know how to implement this when it is supported by pip. I'll keep this issue open so we can come back to it in the future

maestrofjp commented 10 years ago

I'll circle around when I hear back from Donald at PIP.

achembarpu commented 9 years ago

+1 - I'm interested in this as well... pip-accel (https://pypi.python.org/pypi/pip-accel) might work, since it has pre-configured cache dirs. The user would have to install and use pip-accel instead of pip, but the caching would be worth it.

maestrofjp commented 9 years ago

My suggestion is to head over to the bug report on pip and +1 it. That's what we are waiting on.

peterfarrell commented 9 years ago

FYI @arvindch - the pip blocker is pypa/pip#1969

achembarpu commented 9 years ago

@peterfarrell , @maestrofjp - Thanks for the info!

white-hat commented 9 years ago

Hi! AFAIK pypa/pip#1969 is fixed in pip 6.1. I'd love to have a working pypi bucket :)

fgrehm commented 9 years ago

Awesome! I've been MIA for a while down here and would love to have more contributors around to keep the fire burning. I'd be more than happy to provide write access to this repo to whoever gets to implement this.

Cheers