kaschau / PEREGRINE

3D Multiblock multiphysics finite volume reacting flow solver. Implemented in Python, Kokkos, and MPI for inter- and intra-node performant parallelism.
https://kaschau.github.io/PEREGRINE/
BSD 3-Clause "New" or "Revised" License

Cannot import numpy on bridges-2 #113

Open kaschau opened 2 years ago

kaschau commented 2 years ago

On Bridges-2, PEREGRINE cannot import numpy when run on many processors (5832). It imports fine on 1000 cores.

kaschau commented 2 years ago

There may be an issue with OpenBLAS and multi-threading. The output of numpy.show_config() is:

  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_armpl_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
    not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL

See the posts here:

https://stackoverflow.com/questions/15639779/why-does-multiprocessing-use-only-a-single-core-after-i-import-numpy
https://stackoverflow.com/questions/38659217/numpy-suddenly-uses-all-cpus
https://shahhj.wordpress.com/2013/10/27/numpy-and-blas-no-problemo/
https://github.com/numpy/numpy/issues/8120
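
If the OpenBLAS threading suspicion is right, one thing to try is capping the BLAS/OpenMP thread count in every rank before numpy is imported for the first time. A minimal sketch, not something PEREGRINE does today as far as this thread shows: `limit_blas_threads` is a hypothetical helper name, and the environment variables are the ones OpenBLAS, OpenMP runtimes, and MKL read at load time.

```python
import os

# Environment variables read by common BLAS/OpenMP backends at load time.
_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n=1, env=None):
    """Cap BLAS/OpenMP threads without overriding values the user already set.

    Hypothetical helper for illustration. Returns the values now in effect.
    """
    env = os.environ if env is None else env
    for var in _THREAD_VARS:
        # setdefault leaves an explicit user or job-script setting untouched.
        env.setdefault(var, str(n))
    return {var: env[var] for var in _THREAD_VARS}


# This has to run before the first `import numpy` in each MPI rank,
# because the libraries read these variables when they are loaded.
limit_blas_threads(1)
```

The same effect can be had by exporting the variables in the job script, which avoids any ordering concerns inside Python.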

kaschau commented 2 years ago

This may actually stem from scipy.

kaschau commented 2 years ago

Getting this crap

  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
  File "bit_generator.pyx", line 43, in init numpy.random.bit_generator
BrokenPipeError: [Errno 108] Cannot send after transport endpoint shutdown: '/jet/home/kschau/software/python3/lib/python3.9/random.py'
    exec(code, run_globals)
  File "runPeregrine.py", line 7, in <module>
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
    from . import _pickle
  File "/jet/home/kschau/software/python3/lib/python3.9/site-packages/numpy/random/_pickle.py", line 1, in <module>
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 982, in get_code
    import numpy as np
  File "/jet/home/kschau/software/python3/lib/python3.9/site-packages/numpy/__init__.py", line 155, in <module>
  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
    from .mtrand import RandomState
  File "mtrand.pyx", line 1, in init numpy.random.mtrand
BrokenPipeError: [Errno 108] Cannot send after transport endpoint shutdown: '/jet/home/kschau/software/python3/lib/python3.9/random.py'
  File "bit_generator.pyx", line 43, in init numpy.random.bit_generator
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
    from . import random
  File "/jet/home/kschau/software/python3/lib/python3.9/site-packages/numpy/random/__init__.py", line 180, in <module>
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 982, in get_code
  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
BrokenPipeError: [Errno 108] Cannot send after transport endpoint shutdown: '/jet/home/kschau/software/python3/lib/python3.9/random.py'
    from . import _pickle
  File "/jet/home/kschau/software/python3/lib/python3.9/site-packages/numpy/random/_pickle.py", line 1, in <module>
    from .mtrand import RandomState
  File "mtrand.pyx", line 1, in init numpy.random.mtrand
  File "bit_generator.pyx", line 43, in init numpy.random.bit_generator
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 982, in get_code
  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
BrokenPipeError: [Errno 108] Cannot send after transport endpoint shutdown: '/jet/home/kschau/software/python3/lib/python3.9/random.py'
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 139 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
    from . import _pickle
  File "/jet/home/kschau/software/python3/lib/python3.9/site-packages/numpy/random/_pickle.py", line 1, in <module>
    from .mtrand import RandomState
  File "mtrand.pyx", line 1, in init numpy.random.mtrand
  File "bit_generator.pyx", line 40, in init numpy.random.bit_generator
  File "/jet/home/kschau/software/python3/lib/python3.9/secrets.py", line 19, in <module>
    from random import SystemRandom
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 982, in get_code
  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
BrokenPipeError: [Errno 108] Cannot send after transport endpoint shutdown: '/jet/home/kschau/software/python3/lib/python3.9/random.py'
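
Errno 108 on Linux is ESHUTDOWN, and the path in each error is a `.py` file under the home directory, so the failure may be in reading the Python installation from the shared filesystem at this rank count rather than in BLAS. That is a guess, not a diagnosis. If it holds, one mitigation to try is staggering the imports so that not all 5832 ranks read the same files at once. A rough sketch under those assumptions: `import_wave_delay` and `staggered_import` are hypothetical names, the wave size and delay are arbitrary starting values, and the rank is read from `OMPI_COMM_WORLD_RANK`, which Open MPI sets for each launched process.

```python
import importlib
import os
import time


def import_wave_delay(rank, ranks_per_wave=256, seconds_per_wave=2.0):
    """Seconds this rank should wait so ranks import in successive waves."""
    if ranks_per_wave < 1:
        raise ValueError("ranks_per_wave must be at least 1")
    return (rank // ranks_per_wave) * seconds_per_wave


def staggered_import(name, rank, sleep=time.sleep, **wave_options):
    """Wait for this rank's wave, then import the module by name."""
    sleep(import_wave_delay(rank, **wave_options))
    return importlib.import_module(name)


if __name__ == "__main__":
    # Open MPI exports the rank before Python starts, so no MPI import
    # is needed to find it. Falls back to 0 outside of mpirun.
    rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    np = staggered_import("numpy", rank)
```

With these defaults the last of 5832 ranks waits 44 seconds, so the values would need tuning against the job's startup budget. Installing Python on node-local storage is another option worth asking the Bridges-2 staff about.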