ContinuumIO / anaconda-issues

Anaconda issue tracking

Numpy built with Accelerate framework doesn't work with Multiprocessing #133

Open jchoude opened 10 years ago

jchoude commented 10 years ago

When trying to use the Multiprocessing module in conjunction with Numpy, the script either crashes or loops endlessly.

My setup: OSX 10.9, XCode 5.1.1, Anaconda 2.0.0, conda 3.5.2

After investigation, this is due to the way multiprocessing is handled on OS X Python. Apple forbids using most frameworks after a fork(), which is exactly what happens when multiprocessing is used. This causes a problem with the Anaconda-provided Numpy, since it is built against the Accelerate framework.

The solution for now would be to provide a Numpy package built against ATLAS or OpenBLAS, which would avoid the current problem.

I know that you provide MKL-optimized builds of Numpy, which should avoid the problem, but the current free build is simply unusable with multiple projects.

If you look at issue numpy/numpy#4007, a procedure was posted for building Numpy against ATLAS. However, that package cannot be pip installed in a conda environment, because the Python platform reported by:

python -c "import distutils.util; print(distutils.util.get_platform())"

is macosx-10.5-x86_64, which prevents the user from installing the wheel that @matthew-brett prepared, since the wheel's minimum supported version is 10.6.

For additional information on the troubleshooting, please see this mailing list thread: http://mail.scipy.org/pipermail/nipy-devel/2014-June/010166.html

beltergd commented 10 years ago

If this is caused by threading in Accelerate, you may be able to work around it by disabling multi-threading in the BLAS/LAPACK portion of Accelerate. Try setting the VECLIB_MAXIMUM_THREADS environment variable to 1:

```shell
export VECLIB_MAXIMUM_THREADS=1
```

or, from C code:

```c
#include <stdlib.h>

setenv("VECLIB_MAXIMUM_THREADS", "1", true);
```
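For a Python script, a sketch of the same workaround would be to set the variable before NumPy is first imported (the assumption, based on the behavior described above, is that Accelerate reads it at load time, so the ordering matters):

```python
import os

# Assumption from this thread: VECLIB_MAXIMUM_THREADS must be set before
# NumPy (and thus Accelerate) is loaded, so do it before the numpy import.
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"

import numpy
from multiprocessing import Pool

def svd_vals(m):
    # Return singular values only, to keep the pickled result small.
    return numpy.linalg.svd(m, compute_uv=False)

numpy.random.seed(42)
matrix = numpy.random.rand(100, 100)
vals = svd_vals(matrix)  # reference result computed in the parent process

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        pooled = pool.apply(svd_vals, [matrix])
    # The worker result should match the parent's computation.
    print(numpy.allclose(vals, pooled))
```

This only helps if the hang really comes from Accelerate's threading; it does not lift Apple's general restriction on framework use after fork().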

jchoude commented 10 years ago

This seems to work around the issue in my specific use case. Any idea whether there is a way to make that the default for people installing this version of Numpy on a system set up similarly to mine?

asmeurer commented 10 years ago

Are there potential negative consequences to changing that default?

beltergd commented 10 years ago

From the Accelerate framework's perspective, the only drawback is the performance loss from the lack of multi-threading. But if the only other option is replacing Accelerate with a single-threaded ATLAS or OpenBLAS, then that is a moot point.

matthew-brett commented 10 years ago

The current numpy / scipy wheels are built against the posix threading version of the ATLAS libs: https://travis-ci.org/matthew-brett/numpy-atlas-binaries/jobs/28430260#L947

phantomas1234 commented 7 years ago

Unfortunately export VECLIB_MAXIMUM_THREADS=1 doesn't seem to work for me when I run the following (it hangs on the last line):

```python
import numpy
from multiprocessing import Pool

matrix = numpy.load('matrix.npy')

print(numpy.linalg.svd(matrix))  # works fine in the parent process
pool = Pool(processes=4)
print(pool.apply(numpy.linalg.svd, [matrix]))  # hangs here
```

using the following matrix.npy. I am on OS X 10.12.4 and installed numpy 1.12.1 from the wheel on PyPI. Running numpy.__config__.show() gives the following config:

```
atlas_blas_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
    extra_compile_args = ['-msse3']
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
atlas_threads_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_opt_info:
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
```

Using a smaller example matrix didn't cause the program to hang, though.
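A further workaround, not mentioned in the thread above: on Python 3.4+, the multiprocessing module can use the "spawn" start method, which launches fresh interpreter processes instead of calling fork(), so Accelerate is initialized independently in each worker. A minimal sketch, assuming an SVD workload like the one above (the matrix here is randomly generated rather than the attached matrix.npy):

```python
import multiprocessing

import numpy

def svd_vals(m):
    # Singular values only; enough to verify the child computed the SVD.
    return numpy.linalg.svd(m, compute_uv=False)

numpy.random.seed(0)
matrix = numpy.random.rand(50, 50)
s_parent = svd_vals(matrix)  # reference result from the parent process

if __name__ == "__main__":
    # "spawn" starts a fresh Python process rather than fork()ing,
    # sidestepping Apple's restriction on using frameworks after fork().
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        s_child = pool.apply(svd_vals, [matrix])
    print(numpy.allclose(s_parent, s_child))
```

The trade-off is that spawn re-imports the main module in each worker, so startup is slower and all module-level code must be safe to re-execute (hence the __main__ guard).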