Open mszheng opened 8 years ago
I recommend this. Remove mkl from the default setting.
I agree with this. It took me a long while to work out how to reduce my slug size, as the lowest I could get it was 310MB. Adding nomkl instantly dropped it to 166MB. It would be great to either have this as the default or feature it in the readme to increase its visibility.
:+1: Adding nomkl reduced my slug from 420mb to 280mb. Using scikit-learn, scipy, numpy and opencv.
+1, this project is borderline unusable on Heroku without this. Thanks for the hard work as this buildpack is a pleasure other than the slug size issues!
Putting nomkl in conda-requirements.txt does not seem to be helping in my case. Having the following packages in conda-requirements.txt
nomkl
scipy=0.17.0
scikit-learn=0.17.1
pandas=0.18.0
nltk=3.2.1
sqlalchemy=1.0.12
joblib=0.9.4
still leads to mkl being downloaded:
The following packages will be downloaded:
remote:
remote: package | build
remote: ---------------------------|-----------------
remote: libgcc-5.2.0 | 0 1.1 MB
remote: libgfortran-3.0.0 | 1 281 KB
remote: mkl-11.3.1 | 0 121.2 MB
remote: nomkl-1.0 | 0 402 B
remote: system-5.8 | 2 170 KB
remote: openblas-0.2.14 | 0 6.6 MB
remote: joblib-0.9.4 | py27_0 121 KB
remote: nltk-3.2.1 | py27_0 1.7 MB
remote: numpy-1.10.4 | py27_nomkl_0 6.0 MB
remote: pytz-2016.3 | py27_0 178 KB
remote: six-1.10.0 | py27_0 16 KB
remote: sqlalchemy-1.0.12 | py27_0 1.3 MB
remote: python-dateutil-2.5.2 | py27_0 236 KB
remote: scipy-0.17.0 | np110py27_3 31.3 MB
remote: pandas-0.18.0 | np110py27_0 12.0 MB
remote: scikit-learn-0.17.1 |np110py27_nomkl_0 8.6 MB
remote: ------------------------------------------------------------
remote: Total: 190.6 MB
remote:
remote: The following NEW packages will be INSTALLED:
remote:
remote: joblib: 0.9.4-py27_0
remote: libgcc: 5.2.0-0
remote: libgfortran: 3.0.0-1
remote: mkl: 11.3.1-0
remote: nltk: 3.2.1-py27_0
remote: nomkl: 1.0-0
remote: numpy: 1.10.4-py27_nomkl_0 [nomkl]
remote: openblas: 0.2.14-0
remote: pandas: 0.18.0-np110py27_0
remote: python-dateutil: 2.5.2-py27_0
remote: pytz: 2016.3-py27_0
remote: scikit-learn: 0.17.1-np110py27_nomkl_0 [nomkl]
remote: scipy: 0.17.0-np110py27_3
remote: six: 1.10.0-py27_0
remote: sqlalchemy: 1.0.12-py27_0
remote: system: 5.8-2
Am I missing something?
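One way to check a build log like the one above without reading it line by line is to search it for an mkl row in the package table. This is only a sketch: `log_pulls_mkl` is a hypothetical helper name, and it assumes the build output has been saved to a file, with or without the `remote:` prefix that appears in the log above.

```shell
# Hypothetical helper: succeed (exit 0) if a saved conda install log
# shows the mkl package itself being downloaded.
# Usage: log_pulls_mkl build.log
log_pulls_mkl() {
    # Matches rows such as "remote: mkl-11.3.1 | 0 121.2 MB".
    # The match is anchored at the start of the line, so "nomkl-1.0"
    # and build strings like "py27_nomkl_0" do not count.
    grep -Eq '^(remote:)?[[:space:]]*mkl-[0-9]' "$1"
}
```

For example, `log_pulls_mkl build.log && echo "mkl is still being pulled in"` prints the warning only when an mkl row is present.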
@evdoks For some reason this broke for me recently as well, perhaps due to a regression in Conda or a change in the way dependencies are handled. The workaround I finally stumbled on was to pass the --no-deps flag to conda install and explicitly list out all your packages in conda-requirements.txt. See https://github.com/conda/conda/issues/2032#issuecomment-197163898
This means, unfortunately, that you'll need to fork/edit this buildpack (which I was already doing to make it work well with my multi-buildpack setup). The line you need to change is in bin/steps/conda_compile:
It previously was:
conda install --file conda-requirements.txt --yes | indent
and should be changed to:
conda install --no-deps --file conda-requirements.txt --yes | indent
@dtran320 Thanks! Having --no-deps has solved the problem.
That's a great workaround. The transition from mkl to nomkl has proven... difficult.
@dtran320, @evdoks
Instead of forking and adding --no-deps, I added nomkl and the highest-order package I needed, scikit-learn, but specified the nomkl build on that dependency.
My conda-requirements.txt:
nomkl
scikit-learn=0.18.1=np111py27_nomkl_0
That allowed me to use this conda buildpack and let the solver find the dependencies, circumventing having to specify lower level packages like numpy, scipy, etc.
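A quick check that a requirements file really pins nomkl builds, as in the file above, might look like the following. It is a sketch under stated assumptions: `unpinned_blas_packages` is a hypothetical name, the file is assumed to hold one `name=version=build` spec per line, and the package list (numpy, scipy, scikit-learn) is taken only from the packages that show `[nomkl]` builds in the logs in this thread.

```shell
# Hypothetical helper: print requirement lines for numpy, scipy, or
# scikit-learn whose spec does not mention a nomkl build string.
# Prints nothing when every listed package is pinned to a nomkl build.
unpinned_blas_packages() {
    # First grep keeps only the packages of interest; second grep drops
    # the ones already pinned to nomkl. "|| true" keeps the exit status
    # at 0 when there is nothing to report.
    grep -E '^(numpy|scipy|scikit-learn)(=|$)' "$1" | grep -v 'nomkl' || true
}
```

Running `unpinned_blas_packages conda-requirements.txt` against the two-line file above prints nothing, since scikit-learn is pinned to `np111py27_nomkl_0` and the lower-level packages are left to the solver.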
@jake17007 Ah, using --no-deps actually broke some things for us with the newest versions, so we've migrated away from that solution
Hmm, did everyone's workarounds just break? For some reason, my buildpack went back to mkl:
The following packages will be downloaded:
package | build
---------------------------|-----------------
mkl-11.3.3 | 0 122.1 MB
openblas-0.2.19 | 0 3.0 MB
numpy-1.11.2 | py27_0 6.2 MB
scipy-0.18.1 | np111py27_0 30.9 MB
scikit-learn-0.18.1 | np111py27_0 10.9 MB
------------------------------------------------------------
Total: 173.1 MB
The following NEW packages will be INSTALLED:
mkl: 11.3.3-0
The following packages will be UPDATED:
numpy: 1.11.2-py27_nomkl_0 [nomkl] --> 1.11.2-py27_0
openblas: 0.2.14-4 --> 0.2.19-0
scikit-learn: 0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0
scipy: 0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0
In my experience, we have to do the following steps.
1. conda install nomkl --yes | indent
   To avoid mkl, you have to install nomkl first.
2. Use --no-deps when installing the packages (conda install --no-deps --file conda-requirements.txt --yes | indent).
   You can use --no-deps because, if you create conda-requirements.txt from conda list --export, all dependencies are listed in the file.
3. Make sure the nomkl versions of the libraries are listed in conda-requirements.txt.
   conda does not necessarily install the nomkl version of packages even if you have installed nomkl.
   Not numpy-1.11.2-py35_0.tar.bz2, but numpy-1.11.2-py35_nomkl_0.tar.bz2.
   If you use the nomkl version, its dependencies change. (For example, the nomkl version of numpy depends on openblas.) So you have to check these if you don't use the nomkl version in your local environment.
Below is my repository that succeeded in deploying to Heroku recently.
icoxfog417/machine_learning_in_application
And my patched buildpack is below (it also supports Python 3).
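The two install commands from the steps above can be combined into one function for a forked buildpack. This is a sketch rather than the actual patch: `install_without_mkl` is a hypothetical name, and it assumes that `indent` is the buildpack's own output helper (as used in the commands quoted in this thread) and that the function replaces the single `conda install --file ...` line in bin/steps/conda_compile.

```shell
# Sketch of a patched install step (hypothetical function name).
# Usage: install_without_mkl [requirements-file]
install_without_mkl() {
    # Step 1: install nomkl on its own, before anything else.
    conda install nomkl --yes | indent
    # Step 2: install exactly what the file lists, without letting conda
    # resolve dependencies. The file must therefore list every package,
    # for example as produced by "conda list --export".
    conda install --no-deps --file "${1:-conda-requirements.txt}" --yes | indent
}
```

Because of --no-deps, anything missing from the requirements file is simply not installed, which is consistent with the earlier report in this thread that the flag broke things with newer package versions.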
Since the upgrade described here (Feb 5 2016): https://www.continuum.io/blog/developer-blog/anaconda-25-release-now-mkl-optimizations, conda is defaulting to the mkl optimized numpy and scipy, which require the ~120 MB mkl package. This can easily bump the slug size over 300 MB. It's simple to work around this by specifying "nomkl" in conda-requirements.txt, but perhaps that should be the default for this buildpack.
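For anyone who wants to script the basic workaround, a minimal sketch is shown here. `ensure_nomkl` is a hypothetical helper name, and it assumes the requirements file, if it already exists, ends with a newline.

```shell
# Hypothetical helper: add a "nomkl" line to a conda requirements file
# unless one is already present.
# Usage: ensure_nomkl conda-requirements.txt
ensure_nomkl() {
    # grep -x matches only a line that is exactly "nomkl", so pinned
    # builds such as "np111py27_nomkl_0" are not mistaken for it.
    grep -qx 'nomkl' "$1" 2>/dev/null || echo 'nomkl' >> "$1"
}
```

Calling it more than once is harmless, since the line is only appended when it is missing.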