heroku-python / conda-buildpack

[DEPRECATED] Buildpack for Conda.
MIT License

Slug size limit soon exceeded #16

Closed Marco-Santoni closed 9 years ago

Marco-Santoni commented 9 years ago

I already posted on the Heroku forum, but I am opening an issue here because my problem is quite specific to conda-buildpack.

I need to run numpy, scipy and sklearn on Python 3.4. However, the deployment fails because the slug (354MB) exceeds the 300MB size limit.

This is what conda-requirements.txt looks like:

conda=3.15.1=py34_0
conda-env=2.3.0=py34_0
python=3.4.3=0
pip=7.1.0=py34_0
setuptools=18.0.1=py34_0
openssl=1.0.1k=1
# what is really required
numpy=1.9.2=py34_0
scikit-learn=0.16.1=np19py34_0
scipy=0.16.0=np19py34_0

Since the above results in a slug over 300MB, is there anything I can remove? Is there a better way to use Python 3.4 with these packages?

talumbau commented 9 years ago

Are you certain you need conda and conda-env packages in your conda-requirements.txt? For my use case with numpy code on Heroku, I haven't found that necessary. Are you doing something special with conda environments?

Marco-Santoni commented 9 years ago

As I'm using Python 3, I added conda and conda-env as suggested by https://github.com/kennethreitz/conda-buildpack/issues/14#issue-69734522. If I do not add them to conda-requirements.txt, I get the following error:

Traceback (most recent call last):
  File "/app/.heroku/miniconda/bin/conda", line 3, in <module>
    from conda.cli import main
ImportError: No module named 'conda'

icoxfog417 commented 9 years ago

I was able to deploy to Heroku with the conda-requirements.txt below (slug size is 106.5MB).

number_recognizer/conda-requirements.txt

I hope it is useful to you.

Marco-Santoni commented 9 years ago

Thank you for the reference. I tried deploying with the exact same conda-requirements.txt as the one you suggested. Surprisingly, the final slug size is 361.1MB instead of 106.5MB, so I am still not able to deploy.

I don't know where the extra 254.6MB comes from. The cause is not requirements.txt: if I remove everything from requirements.txt, the slug is still large (359.0MB). The repo contains only source code, so its size cannot account for that much data.
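
One way to narrow down where the space goes is to list the largest directories inside the built app, for example from a one-off dyno started with `heroku run bash`. The following is only a sketch: the helper name `largest_entries` is made up for illustration, and the path in the usage line is taken from the traceback earlier in this thread, so it may differ for other setups.

```shell
# Sketch: print the N largest entries directly under a directory, sizes in KB.
# Note: the * glob skips hidden entries (names starting with a dot).
largest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    # du -sk: one summarized total per entry, in kilobytes.
    # sort -rn: largest first; head: keep only the top entries.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage (not run here): largest_entries /app/.heroku/miniconda 5
```

If most of the space turns out to be in cached or duplicated packages rather than in the installed libraries themselves, that points to the build cache rather than to the requirements.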

talumbau commented 9 years ago

I wonder if you could use the heroku repo plug-in to purge the cache:

https://github.com/heroku/heroku-repo
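
For reference, a minimal sketch of how the plugin is typically used, based on the plugin's README. The wrapper function `purge_build_cache` and the app name `my-app` are placeholders for illustration, not names from this thread.

```shell
# Sketch: clear the Heroku build cache for one app.
purge_build_cache() {
    app="$1"
    if [ -z "$app" ]; then
        echo "usage: purge_build_cache APP_NAME" >&2
        return 1
    fi
    # Install the heroku-repo plugin into the Heroku CLI.
    heroku plugins:install heroku-repo || return 1
    # Delete the cached build artifacts stored for this app.
    heroku repo:purge_cache -a "$app"
}

# Usage (not run here): purge_build_cache my-app
```

The cache is only rebuilt on the next deploy, so a new push is needed afterwards before the slug size changes.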

Also, here is a strange idea, but one that may work. Fork this buildpack and add some lines after the installation of all of your conda packages. For example, in /bin/steps/conda-compile, after this chunk:

if [ -f conda-requirements.txt ]; then
    puts-step "Installing dependencies using Conda"
    conda install --file conda-requirements.txt --yes | indent
fi

maybe add these lines:

conda remove conda==3.15.1
conda remove conda-env==2.3.0

or whatever the particular command is to match up the package version. Even more aggressive would be something like:

rm -rf "$BUILD_DIR"/miniconda3/pkgs/conda-env-*

and basically just rip out whatever you don't need after everything you do need is installed. This will very likely not work on the first go, but the general approach seems reasonable: install the stuff you need, and rip out everything you can before the buildpack finishes executing.
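
The "rip out what you don't need" step could be sketched as a small helper in a forked buildpack. This is an untested illustration, not part of the buildpack: the function name `prune_conda_pkgs` is hypothetical, and the `miniconda3/pkgs` path is the one assumed in the `rm -rf` line above.

```shell
# Sketch: delete cached package directories that match the given package names.
# The directory layout is an assumption; adjust it to the real one.
prune_conda_pkgs() {
    pkgs_dir="$1"
    shift
    [ -d "$pkgs_dir" ] || { echo "no such directory: $pkgs_dir" >&2; return 1; }
    for name in "$@"; do
        # Require a digit after the dash so that "conda" matches
        # conda-3.15.1-py34_0 but not conda-env-2.3.0-py34_0.
        for path in "$pkgs_dir"/"$name"-[0-9]*; do
            [ -e "$path" ] || continue   # the glob matched nothing
            echo "removing $path"
            rm -rf -- "$path"
        done
    done
}

# Usage (not run here): prune_conda_pkgs "$BUILD_DIR/miniconda3/pkgs" conda-env
```

Matching on the name plus a version digit keeps the deletion narrow, which matters here because a too-broad glob could remove packages the app still needs.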

Marco-Santoni commented 9 years ago

Thanks @talumbau. Using the repo:purge plugin did the job and the slug size went down to 205.5MB. This closes the issue.

cjauvin commented 9 years ago

I'm using this buildpack with this conda-requirements.txt content:

numpy
pandas
scikit-learn
matplotlib
scipy

and I'm very close to 400MB, even with the repo:purge plugin trick. Any suggestions?

jrkerns commented 8 years ago

You can drop ~100MB by not using MKL, which is now automatically installed for numpy.

conda install nomkl
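
With this buildpack, packages are installed from conda-requirements.txt (see the `conda install --file` step quoted earlier in this thread), so one way to apply this is probably to list `nomkl` in that file instead of running the command by hand. A minimal sketch, assuming the file lives in the current directory; the helper name `add_nomkl` is made up for illustration and this has not been verified against the buildpack.

```shell
# Sketch: add "nomkl" to a conda requirements file if it is not already listed.
add_nomkl() {
    req_file="${1:-conda-requirements.txt}"
    # -x matches the whole line, -q only sets the exit status.
    if grep -qx 'nomkl' "$req_file" 2>/dev/null; then
        return 0    # already listed
    fi
    # If the file exists and does not end in a newline, add one first
    # so "nomkl" does not get glued onto the last package name.
    if [ -s "$req_file" ] && [ -n "$(tail -c 1 "$req_file")" ]; then
        echo >> "$req_file"
    fi
    echo 'nomkl' >> "$req_file"
}

# Usage (not run here): add_nomkl conda-requirements.txt
```

Running it twice leaves a single `nomkl` line, so it is safe to call from a setup script.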