ContinuumIO / anaconda-issues

Anaconda issue tracking

Numpy (Anaconda built only) freezes/reboot on a i9-7980XE 18cores machine #10832

Open ll-portes opened 5 years ago

ll-portes commented 5 years ago

Actual Behavior

Hi! I reported this issue to the Numpy developers (link), and they asked me to report here as well.

Several simple operations (e.g., SVD, np.dot) from the NumPy package built by Anaconda make one of our computers freeze completely for 5-10 seconds and then reboot. This happens only on a machine with an Intel i9-7980XE 18-core CPU. The same Conda/Ubuntu environment on a machine with an i7-7700 4-core CPU has no problems. The crash happens both on the Python command line and in a Jupyter notebook.

Remarks: 1) This issue doesn't happen if we use Python + NumPy from pip (a test suggested by the NumPy team). 2) After reading a report of a similar issue with a VM (link), the solution that worked for us was:

conda install nomkl numpy scipy scikit-learn numexpr
conda remove mkl mkl-service

And then using the following code before importing Numpy:

import os
os.environ['OPENBLAS_CORETYPE']='Haswell'

3) Please note that ours is not a VM. 4) Last week we tried to run a computation with igraph on this machine (for the first time on this specific machine), and we got the freeze/reboot again (probably because we didn't install a nomkl version of igraph).

5) I can run any test you suggest on the aforementioned machine. However, I am only able to copy and paste commands, because I don't have the deeper knowledge that you have (before this problem I had no idea about the difference between BLAS, OpenBLAS, MKL, etc.).
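For anyone applying the workaround from remark 2: as stated there, the variable has to be set before NumPy is imported, since the BLAS library reads it when it is loaded. A minimal sketch of a guard for that ordering follows; `set_before_numpy` is a hypothetical helper name, not part of NumPy or conda.

```python
import os
import sys


def set_before_numpy(name, value):
    """Set an environment variable, refusing if NumPy is already imported.

    Hypothetical helper: the workaround only applies if the variable is
    in place before the first `import numpy` in the process.
    """
    if "numpy" in sys.modules:
        raise RuntimeError(
            f"numpy is already imported; set {name} earlier "
            "(in a notebook, restart the kernel first)"
        )
    os.environ[name] = value


# Same setting as in remark 2, applied before any NumPy import.
set_before_numpy("OPENBLAS_CORETYPE", "Haswell")
```

In a Jupyter notebook this would go in the very first cell, before any cell that imports NumPy, SciPy, or scikit-learn.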

Expected Behavior

The computation completes, with no freeze/reboot, on the desktop with the Intel i9-7980XE 18-core CPU.

Steps to Reproduce

Only on our machine with Intel i9-7980XE 18 cores cpu:

import numpy as np
A = np.matrix([[1.], [3.]])
B = np.matrix([[2., 3.]])
np.dot(A, B)

Remark: the same problem happens with SVD.

Anaconda or Miniconda version:

Anaconda3-2019.03-Linux-x86_64.sh

In the first week of the issue we also tried other versions from the previous year (2018), including ones for Python 2, but none of them worked as expected.

Operating System:

Ubuntu 18.04.2 LTS

conda info
```
     active environment : base
    active env location : /home/leo/anaconda3
            shell level : 1
       user config file : /home/leo/.condarc
 populated config files :
          conda version : 4.6.11
    conda-build version : 3.17.8
         python version : 3.7.3.final.0
       base environment : /home/leo/anaconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/linux-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/leo/anaconda3/pkgs
                          /home/leo/.conda/pkgs
       envs directories : /home/leo/anaconda3/envs
                          /home/leo/.conda/envs
               platform : linux-64
             user-agent : conda/4.6.11 requests/2.21.0 CPython/3.7.3 Linux/4.15.0-47-generic ubuntu/18.04.2 glibc/2.27
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
```
conda list --show-channel-urls
``` WARNING: The conda.compat module is deprecated and will be removed in a future release. # packages in environment at /home/leo/anaconda3: # # Name Version Build Channel _ipyw_jlab_nb_ext_conf 0.1.0 py37_0 defaults alabaster 0.7.12 py37_0 defaults anaconda-client 1.7.2 py37_0 defaults anaconda-navigator 1.9.7 py37_0 defaults anaconda-project 0.8.2 py37_0 defaults asn1crypto 0.24.0 py37_0 defaults astroid 2.2.5 py37_0 defaults astropy 3.1.2 py37h7b6447c_0 defaults atomicwrites 1.3.0 py37_1 defaults attrs 19.1.0 py37_1 defaults babel 2.6.0 py37_0 defaults backcall 0.1.0 py37_0 defaults backports 1.0 py37_1 defaults backports.os 0.1.1 py37_0 defaults backports.shutil_get_terminal_size 1.0.0 py37_2 defaults beautifulsoup4 4.7.1 py37_1 defaults bitarray 0.8.3 py37h14c3975_0 defaults bkcharts 0.2 py37_0 defaults blas 1.0 openblas defaults bleach 3.1.0 py37_0 defaults blosc 1.15.0 hd408876_0 defaults bokeh 1.0.4 py37_0 defaults boto 2.49.0 py37_0 defaults bottleneck 1.2.1 py37h035aef0_1 defaults bzip2 1.0.6 h14c3975_5 defaults ca-certificates 2019.1.23 0 defaults cairo 1.14.12 h8948797_3 defaults certifi 2019.3.9 py37_0 defaults cffi 1.12.2 py37h2e261b9_1 defaults chardet 3.0.4 py37_1 defaults click 7.0 py37_0 defaults cloudpickle 0.8.0 py37_0 defaults clyent 1.2.2 py37_1 defaults colorama 0.4.1 py37_0 defaults conda 4.6.11 py37_0 defaults conda-build 3.17.8 py37_0 defaults conda-env 2.6.0 1 defaults conda-verify 3.1.1 py37_0 defaults contextlib2 0.5.5 py37_0 defaults cryptography 2.6.1 py37h1ba5d50_0 defaults curl 7.64.0 hbc83047_2 defaults cycler 0.10.0 py37_0 defaults cython 0.29.6 py37he6710b0_0 defaults cytoolz 0.9.0.1 py37h14c3975_1 defaults dask 1.1.4 py37_1 defaults dask-core 1.1.4 py37_1 defaults dbus 1.13.6 h746ee38_0 defaults decorator 4.4.0 py37_1 defaults defusedxml 0.5.0 py37_1 defaults distributed 1.26.0 py37_1 defaults docutils 0.14 py37_0 defaults entrypoints 0.3 py37_0 defaults et_xmlfile 1.0.1 py37_0 defaults expat 2.2.6 he6710b0_0 defaults fastcache 
1.0.2 py37h14c3975_2 defaults filelock 3.0.10 py37_0 defaults flask 1.0.2 py37_1 defaults fontconfig 2.13.0 h9420a91_0 defaults freetype 2.9.1 h8a8886c_1 defaults fribidi 1.0.5 h7b6447c_0 defaults future 0.17.1 py37_0 defaults get_terminal_size 1.0.0 haa9412d_0 defaults gevent 1.4.0 py37h7b6447c_0 defaults glib 2.56.2 hd408876_0 defaults glob2 0.6 py37_1 defaults gmp 6.1.2 h6c8ec71_1 defaults gmpy2 2.0.8 py37h10f8cd9_2 defaults graphite2 1.3.13 h23475e2_0 defaults greenlet 0.4.15 py37h7b6447c_0 defaults gst-plugins-base 1.14.0 hbbd80ab_1 defaults gstreamer 1.14.0 hb453b48_1 defaults h5py 2.9.0 py37h7918eee_0 defaults harfbuzz 1.8.8 hffaf4a1_0 defaults hdf5 1.10.4 hb1b8bf9_0 defaults heapdict 1.0.0 py37_2 defaults holoviews 1.11.3 py_0 anaconda html5lib 1.0.1 py37_0 defaults hvplot 0.4.0 py_0 pyviz icu 58.2 h9c2bf20_1 defaults idna 2.8 py37_0 defaults igraph 0.7.1 h2166141_1005 conda-forge imageio 2.5.0 py37_0 defaults imagesize 1.1.0 py37_0 defaults importlib_metadata 0.8 py37_0 defaults intel-openmp 2019.3 199 defaults ipykernel 5.1.0 py37h39e3cac_0 defaults ipython 7.1.1 py37h39e3cac_0 anaconda ipython_genutils 0.2.0 py37_0 defaults ipywidgets 7.4.2 py37_0 defaults isort 4.3.16 py37_0 defaults itsdangerous 1.1.0 py37_0 defaults jbig 2.1 hdba287a_0 defaults jdcal 1.4 py37_0 defaults jedi 0.13.3 py37_0 defaults jeepney 0.4 py37_0 defaults jinja2 2.10 py37_0 defaults joblib 0.13.2 py37_0 anaconda jpeg 9b h024ee3a_2 defaults jsonschema 3.0.1 py37_0 defaults jupyter 1.0.0 py37_7 defaults jupyter_client 5.2.4 py37_0 defaults jupyter_console 6.0.0 py37_0 defaults jupyter_contrib_core 0.3.3 py_2 conda-forge jupyter_core 4.4.0 py37_0 defaults jupyter_highlight_selected_word 0.2.0 py37_1000 conda-forge jupyter_latex_envs 1.4.4 py37_1000 conda-forge jupyterlab 0.35.4 py37hf63ae98_0 defaults jupyterlab_server 0.2.0 py37_0 defaults keyring 18.0.0 py37_0 defaults kiwisolver 1.0.1 py37hf484d3e_0 defaults krb5 1.16.1 h173b8e3_7 defaults lazy-object-proxy 1.3.1 py37h14c3975_2 
defaults libarchive 3.3.3 h5d8350f_5 defaults libcurl 7.64.0 h20c2e04_2 defaults libedit 3.1.20181209 hc058e9b_0 defaults libffi 3.2.1 hd88cf55_4 defaults libgcc-ng 8.2.0 hdf63c60_1 defaults libgfortran-ng 7.3.0 hdf63c60_0 defaults libiconv 1.15 h516909a_1005 conda-forge liblief 0.9.0 h7725739_2 defaults libopenblas 0.3.3 h5a2b251_3 defaults libpng 1.6.36 hbc83047_0 defaults libsodium 1.0.16 h1bed415_0 defaults libssh2 1.8.0 h1ba5d50_4 defaults libstdcxx-ng 8.2.0 hdf63c60_1 defaults libtiff 4.0.10 h2733197_2 defaults libtool 2.4.6 h7b6447c_5 defaults libuuid 1.0.3 h1bed415_2 defaults libxcb 1.13 h1bed415_1 defaults libxml2 2.9.9 he19cac6_0 defaults libxslt 1.1.33 h7d1a2b0_0 defaults llvmlite 0.28.0 py37hd408876_0 defaults locket 0.2.0 py37_1 defaults lxml 4.3.2 py37hefd8a0e_0 defaults lz4-c 1.8.1.2 h14c3975_0 defaults lzo 2.10 h49e0be7_2 defaults markupsafe 1.1.1 py37h7b6447c_0 defaults matplotlib 3.0.3 py37h5429711_0 defaults mccabe 0.6.1 py37_1 defaults mistune 0.8.4 py37h7b6447c_0 defaults more-itertools 6.0.0 py37_0 defaults mpc 1.1.0 h10f8cd9_1 defaults mpfr 4.0.1 hdf1c602_3 defaults mpmath 1.1.0 py37_0 defaults msgpack-python 0.6.1 py37hfd86e86_1 defaults multipledispatch 0.6.0 py37_0 defaults navigator-updater 0.2.1 py37_0 defaults nbconvert 5.4.1 py37_3 defaults nbformat 4.4.0 py37_0 defaults ncurses 6.1 he6710b0_1 defaults networkx 2.2 py37_1 defaults nltk 3.4 py37_1 defaults nomkl 3.0 0 defaults nose 1.3.7 py37_2 defaults notebook 5.7.8 py37_0 defaults numba 0.43.1 py37h962f231_0 defaults numexpr 2.6.9 py37h2ffa06c_0 defaults numpy 1.16.2 py37h99e49ec_0 defaults numpy-base 1.16.2 py37h2f8d375_0 defaults numpydoc 0.8.0 py37_0 defaults olefile 0.46 py37_0 defaults openpyxl 2.6.1 py37_1 defaults openssl 1.1.1b h7b6447c_1 defaults packaging 19.0 py37_0 defaults pandas 0.24.2 py37he6710b0_0 defaults pandoc 2.2.3.2 0 defaults pandocfilters 1.4.2 py37_1 defaults pango 1.42.4 h049681c_0 defaults param 1.8.2 py_0 anaconda parso 0.3.4 py37_0 defaults partd 0.3.10 
py37_1 defaults patchelf 0.9 he6710b0_3 defaults path.py 11.5.0 py37_0 defaults pathlib2 2.3.3 py37_0 defaults patsy 0.5.1 py37_0 defaults pcre 8.43 he6710b0_0 defaults pep8 1.7.1 py37_0 defaults pexpect 4.6.0 py37_0 defaults pickleshare 0.7.5 py37_0 defaults pillow 5.4.1 py37h34e0f95_0 defaults pip 19.0.3 py37_0 defaults pixman 0.38.0 h7b6447c_0 defaults pkginfo 1.5.0.1 py37_0 defaults pluggy 0.9.0 py37_0 defaults ply 3.11 py37_0 defaults prometheus_client 0.6.0 py37_0 defaults prompt_toolkit 2.0.9 py37_0 defaults psutil 5.6.1 py37h7b6447c_0 defaults ptyprocess 0.6.0 py37_0 defaults py 1.8.0 py37_0 defaults py-lief 0.9.0 py37h7725739_2 defaults pycairo 1.18.0 py37h1b9232e_1000 conda-forge pycodestyle 2.5.0 py37_0 defaults pycosat 0.6.3 py37h14c3975_0 defaults pycparser 2.19 py37_0 defaults pycrypto 2.6.1 py37h14c3975_9 defaults pycurl 7.43.0.2 py37h1ba5d50_0 defaults pyflakes 2.1.1 py37_0 defaults pygments 2.3.1 py37_0 defaults pylint 2.3.1 py37_0 defaults pyodbc 4.0.26 py37he6710b0_0 defaults pyopenssl 19.0.0 py37_0 defaults pyparsing 2.3.1 py37_0 defaults pyqt 5.9.2 py37h05f1152_2 defaults pyrsistent 0.14.11 py37h7b6447c_0 defaults pysocks 1.6.8 py37_0 defaults pytables 3.5.1 py37h71ec239_0 defaults pytest 4.3.1 py37_0 defaults pytest-arraydiff 0.3 py37h39e3cac_0 defaults pytest-astropy 0.5.0 py37_0 defaults pytest-doctestplus 0.3.0 py37_0 defaults pytest-openfiles 0.3.2 py37_0 defaults pytest-remotedata 0.3.1 py37_0 defaults python 3.7.3 h0371630_0 defaults python-dateutil 2.8.0 py37_0 defaults python-igraph 0.7.1.post7 py37h516909a_0 conda-forge python-libarchive-c 2.8 py37_6 defaults pytz 2018.9 py37_0 defaults pyviz_comms 0.7.0 py37_0 anaconda pywavelets 1.0.2 py37hdd07704_0 defaults pyyaml 5.1 py37h7b6447c_0 defaults pyzmq 18.0.0 py37he6710b0_0 defaults qt 5.9.7 h5867ecd_1 defaults qtawesome 0.5.7 py37_1 defaults qtconsole 4.4.3 py37_0 defaults qtpy 1.7.0 py37_1 defaults readline 7.0 h7b6447c_5 defaults requests 2.21.0 py37_0 defaults rope 0.12.0 py37_0 
defaults ruamel_yaml 0.15.46 py37h14c3975_0 defaults scikit-image 0.14.2 py37he6710b0_0 defaults scikit-learn 0.20.3 py37h22eb022_0 defaults scipy 1.2.1 py37he2b7bc3_0 defaults seaborn 0.9.0 py37_0 defaults secretstorage 3.1.1 py37_0 defaults send2trash 1.5.0 py37_0 defaults setuptools 40.8.0 py37_0 defaults simplegeneric 0.8.1 py37_2 defaults singledispatch 3.4.0.3 py37_0 defaults sip 4.19.8 py37hf484d3e_0 defaults six 1.12.0 py37_0 defaults snappy 1.1.7 hbae5bb6_3 defaults snowballstemmer 1.2.1 py37_0 defaults sortedcollections 1.1.2 py37_0 defaults sortedcontainers 2.1.0 py37_0 defaults soupsieve 1.8 py37_0 defaults sphinx 1.8.5 py37_0 defaults sphinxcontrib 1.0 py37_1 defaults sphinxcontrib-websupport 1.1.0 py37_1 defaults spyder 3.3.3 py37_0 defaults spyder-kernels 0.4.2 py37_0 defaults sqlalchemy 1.3.1 py37h7b6447c_0 defaults sqlite 3.27.2 h7b6447c_0 defaults statsmodels 0.9.0 py37h035aef0_0 defaults sympy 1.3 py37_0 defaults tblib 1.3.2 py37_0 defaults terminado 0.8.1 py37_1 defaults testpath 0.4.2 py37_0 defaults tk 8.6.8 hbc83047_0 defaults toolz 0.9.0 py37_0 defaults tornado 6.0.2 py37h7b6447c_0 defaults tqdm 4.31.1 py37_1 defaults traitlets 4.3.2 py37_0 defaults unicodecsv 0.14.1 py37_0 defaults unixodbc 2.3.7 h14c3975_0 defaults urllib3 1.24.1 py37_0 defaults wcwidth 0.1.7 py37_0 defaults webencodings 0.5.1 py37_1 defaults werkzeug 0.14.1 py37_0 defaults wheel 0.33.1 py37_0 defaults widgetsnbextension 3.4.2 py37_0 defaults wrapt 1.11.1 py37h7b6447c_0 defaults wurlitzer 1.0.2 py37_0 defaults xlrd 1.2.0 py37_0 defaults xlsxwriter 1.1.5 py37_0 defaults xlwt 1.3.0 py37_0 defaults xz 5.2.4 h14c3975_4 defaults yaml 0.1.7 had09818_2 defaults zeromq 4.3.1 he6710b0_3 defaults zict 0.1.4 py37_0 defaults zipp 0.3.3 py37_1 defaults zlib 1.2.11 h7b6447c_3 defaults zstd 1.3.7 h0b5b093_0 defaults ```
schmidtchristoph commented 5 years ago

I encountered the very same bug.

A simple import of numpy is often sufficient to trigger this bug.

ll-portes commented 5 years ago

@schmidtchristoph I installed the PyViz environment and got no bug. Unfortunately, the problem returned after installing scikit-learn in this env. Hence, my pragmatic solution for now is to keep a clone of the PyViz env to do my work. If I need to install something, I clone the env again and install the package in the clone. If this installation breaks my solution, I return to the original env. Maybe this approach could help you.

msarahan commented 5 years ago

@oleksandr-pavlyk do you have any ideas on this one?

oleksandr-pavlyk commented 5 years ago

I will try to locate the machine to triage the issue.

If the reporter is willing to try a few things, please try whether the following helps:

MKL_ENABLE_INSTRUCTIONS=AVX2 python script.py  # MKL will avoid AVX512 instructions

or

MKL_THREADING_LAYER=SEQUENTIAL python script.py  # MKL will use one core only
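
The two shell invocations above can also be driven from Python, which is convenient when several settings need to be compared in one go. This is only a sketch: `run_with_env` is a hypothetical helper, and `script.py` stands for whatever reproduction script is being tested. Each setting is printed and flushed before the child process starts, so it is on screen if the machine freezes.

```python
import os
import subprocess
import sys


def run_with_env(cmd, overrides):
    """Run cmd in a child process with extra environment variables set."""
    env = {**os.environ, **overrides}
    # Flush so the setting under test is visible before the child starts.
    print("running with", overrides, flush=True)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# The two settings suggested above.
SETTINGS = [
    {"MKL_ENABLE_INSTRUCTIONS": "AVX2"},    # avoid AVX512 instructions
    {"MKL_THREADING_LAYER": "SEQUENTIAL"},  # use one core only
]

# Only run if the reproduction script is present in the current directory.
if os.path.exists("script.py"):
    for overrides in SETTINGS:
        result = run_with_env([sys.executable, "script.py"], overrides)
        print("exit code:", result.returncode)
```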
schmidtchristoph commented 5 years ago

Both commands help to avoid the system freeze and subsequent reboot.

I used the dot product example of @ll-portes as "script.py". This example would reproduce the bug if no MKL_* instructions are prepended.

Thank you very much.

ll-portes commented 5 years ago

> I will try to locate the machine to triage the issue.
>
> If the reporter is willing to try a few things, please try whether the following helps:
>
> MKL_ENABLE_INSTRUCTIONS=AVX2 python script.py  # MKL will avoid AVX512 instructions
>
> or
>
> MKL_THREADING_LAYER=SEQUENTIAL python script.py  # MKL will use one core only

None of the commands above worked. The machine froze/rebooted again.

ll-portes commented 5 years ago

I don't know if this information helps, but under the PyViz environment, everything works fine. This environment was created using:

conda update conda
conda create -n pyviz-tutorial python=3.6

After activating it, I ran conda update --all. Note: after that, I cloned this env under a different name, "PyVizEnv", which is the one shown below. For the PyVizEnv environment, the conda outputs are:

conda info
```
     active environment : PyVizEnv
    active env location : /home/leo/anaconda3/envs/PyVizEnv
            shell level : 2
       user config file : /home/leo/.condarc
 populated config files : /home/leo/.condarc
          conda version : 4.6.12
    conda-build version : 3.17.8
         python version : 3.7.3.final.0
       base environment : /home/leo/anaconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/linux-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/leo/anaconda3/pkgs
                          /home/leo/.conda/pkgs
       envs directories : /home/leo/anaconda3/envs
                          /home/leo/.conda/envs
               platform : linux-64
             user-agent : conda/4.6.12 requests/2.21.0 CPython/3.7.3 Linux/4.15.0-47-generic ubuntu/18.04.2 glibc/2.27
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
```
conda list --show-channel-urls
``` # packages in environment at /home/leo/anaconda3/envs/PyVizEnv: # # Name Version Build Channel asn1crypto 0.24.0 py36_0 defaults atomicwrites 1.3.0 py36_1 defaults attrs 19.1.0 py36_1 defaults backcall 0.1.0 py36_0 defaults blas 1.0 mkl defaults bleach 3.1.0 py36_0 defaults bokeh 1.1.0 py36_0 defaults bzip2 1.0.6 h14c3975_5 defaults ca-certificates 2019.1.23 0 defaults cairo 1.14.12 h7636065_2 defaults cartopy 0.16.0 py36hfa13621_0 defaults certifi 2019.3.9 py36_0 defaults cffi 1.12.2 py36h2e261b9_1 defaults cftime 1.0.3.4 py36hdd07704_0 defaults chardet 3.0.4 py36_1 defaults click 7.0 py36_0 defaults click-plugins 1.0.4 py36_0 defaults cligj 0.5.0 py36_0 defaults cloudpickle 0.8.1 py_0 defaults colorcet 2.0.1 py_0 pyviz/label/dev cryptography 2.3.1 py36hc365091_0 defaults curl 7.61.0 h84994c4_0 defaults cycler 0.10.0 py36_0 defaults cytoolz 0.9.0.1 py36h14c3975_1 defaults dask 1.2.0 py_0 defaults dask-core 1.2.0 py_0 defaults datashader 0.7.0 py_0 pyviz/label/dev datashape 0.5.4 py36_1 defaults dbus 1.13.6 h746ee38_0 defaults decorator 4.4.0 py36_1 defaults defusedxml 0.5.0 py36_1 defaults descartes 1.1.0 py36_0 defaults distributed 1.27.0 py36_0 defaults entrypoints 0.3 py36_0 defaults expat 2.2.6 he6710b0_0 defaults fastparquet 0.2.1 py36hdd07704_1 defaults fiona 1.7.12 py36h3f37509_0 defaults fontconfig 2.12.6 h49f89f6_0 defaults freetype 2.8 hab7d2ae_1 defaults freexl 1.0.5 h14c3975_0 defaults gdal 2.2.2 py36hc209d97_1 defaults geopandas 0.4.1 py_0 defaults geos 3.6.2 heeff764_2 defaults geoviews 1.6.3a2 py_0 pyviz/label/dev geoviews-core 1.6.3a2 py_0 pyviz/label/dev giflib 5.1.4 h14c3975_1 defaults glib 2.56.2 hd408876_0 defaults gmp 6.1.2 h6c8ec71_1 defaults gst-plugins-base 1.14.0 hbbd80ab_1 defaults gstreamer 1.14.0 hb453b48_1 defaults hdf4 4.2.13 h3ca952b_2 defaults hdf5 1.10.2 hba1933b_1 defaults heapdict 1.0.0 py36_2 defaults holoviews 1.12.1 py_0 pyviz/label/dev hvplot 0.4.0 py_0 pyviz/label/dev icu 58.2 h9c2bf20_1 defaults idna 2.8 py36_0 defaults 
imageio 2.5.0 py36_0 defaults intel-openmp 2019.3 199 defaults ipykernel 5.1.0 py36h39e3cac_0 defaults ipython 7.1.1 py36h39e3cac_0 defaults ipython_genutils 0.2.0 py36_0 defaults ipywidgets 7.4.2 py36_0 defaults jedi 0.13.3 py36_0 defaults jinja2 2.10.1 py36_0 defaults jpeg 9b h024ee3a_2 defaults json-c 0.13.1 h1bed415_0 defaults jsonschema 3.0.1 py36_0 defaults jupyter 1.0.0 py36_7 defaults jupyter_client 5.2.4 py36_0 defaults jupyter_console 6.0.0 py36_0 defaults jupyter_core 4.4.0 py36_0 defaults kealib 1.4.7 h77bc034_6 defaults kiwisolver 1.0.1 py36hf484d3e_0 defaults krb5 1.16.1 hc83ff2d_6 defaults libboost 1.67.0 h46d08c1_4 defaults libcurl 7.61.0 h1ad7b7a_0 defaults libdap4 3.19.1 h6ec2957_0 defaults libedit 3.1.20181209 hc058e9b_0 defaults libffi 3.2.1 hd88cf55_4 defaults libgcc-ng 8.2.0 hdf63c60_1 defaults libgdal 2.2.4 h6f639c0_1 defaults libgfortran-ng 7.3.0 hdf63c60_0 defaults libkml 1.3.0 h590aaf7_4 defaults libnetcdf 4.6.1 h10edf3e_1 defaults libpng 1.6.36 hbc83047_0 defaults libpq 10.5 h1ad7b7a_0 defaults libsodium 1.0.16 h1bed415_0 defaults libspatialindex 1.8.5 h20b78c2_2 defaults libspatialite 4.3.0a he475c7f_19 defaults libssh2 1.8.0 h9cfc8f7_4 defaults libstdcxx-ng 8.2.0 hdf63c60_1 defaults libtiff 4.0.10 h2733197_2 defaults libuuid 1.0.3 h1bed415_2 defaults libxcb 1.13 h1bed415_1 defaults libxml2 2.9.9 he19cac6_0 defaults libxslt 1.1.33 h7d1a2b0_0 defaults llvmlite 0.28.0 py36hd408876_0 defaults locket 0.2.0 py36_1 defaults lxml 4.3.3 py36hefd8a0e_0 defaults mapclassify 2.0.1 py_0 defaults markdown 3.0.1 py36_0 defaults markupsafe 1.1.1 py36h7b6447c_0 defaults matplotlib 2.2.2 py36h0e671d2_1 defaults mistune 0.8.4 py36h7b6447c_0 defaults mkl 2019.3 199 defaults mkl_fft 1.0.10 py36ha843d7b_0 defaults mkl_random 1.0.2 py36hd81dba3_0 defaults more-itertools 6.0.0 py36_0 defaults msgpack-python 0.6.1 py36hfd86e86_1 defaults multipledispatch 0.6.0 py36_0 defaults munch 2.3.2 py36_0 defaults nbconvert 5.4.1 py36_3 defaults nbformat 4.4.0 py36_0 
defaults ncurses 6.1 he6710b0_1 defaults netcdf4 1.4.2 py36h4b4f87f_0 defaults networkx 2.3 py_0 defaults notebook 5.7.8 py36_0 defaults numba 0.43.1 py36h962f231_0 defaults numpy 1.16.2 py36h7e9f1db_0 defaults numpy-base 1.16.2 py36hde5b4d6_0 defaults olefile 0.46 py36_0 defaults openjpeg 2.3.0 h05c96fa_1 defaults openssl 1.0.2r h7b6447c_0 defaults owslib 0.17.1 py_0 defaults packaging 19.0 py36_0 defaults pandas 0.24.2 py36he6710b0_0 defaults pandoc 2.2.3.2 0 defaults pandocfilters 1.4.2 py36_1 defaults panel 0.5.1 py_0 pyviz/label/dev param 1.9.0 py_0 pyviz/label/dev parso 0.4.0 py_0 defaults partd 0.3.10 py36_1 defaults pcre 8.43 he6710b0_0 defaults pexpect 4.7.0 py36_0 defaults phantomjs 2.1.1 0 pyviz/label/dev pickleshare 0.7.5 py36_0 defaults pillow 5.1.0 py36h3deb7b8_0 defaults pip 19.0.3 py36_0 defaults pixman 0.38.0 h7b6447c_0 defaults pluggy 0.9.0 py36_0 defaults poppler 0.65.0 ha54bb34_0 defaults poppler-data 0.4.9 0 defaults proj4 5.0.1 h14c3975_0 defaults prometheus_client 0.6.0 py36_0 defaults prompt_toolkit 2.0.9 py36_0 defaults psutil 5.6.1 py36h7b6447c_0 defaults psycopg2 2.7.5 py36hb7f436b_0 defaults ptyprocess 0.6.0 py36_0 defaults py 1.8.0 py36_0 defaults pycparser 2.19 py36_0 defaults pyct 0.4.6 py_0 pyviz/label/dev pyct-core 0.4.6 py_0 pyviz/label/dev pyepsg 0.4.0 py36_0 defaults pygments 2.3.1 py36_0 defaults pyopenssl 19.0.0 py36_0 defaults pyparsing 2.4.0 py_0 defaults pyproj 1.9.5.1 py36h7b21b82_1 defaults pyqt 5.9.2 py36h751905a_0 defaults pyrsistent 0.14.11 py36h7b6447c_0 defaults pyshp 2.0.1 py36_0 defaults pysocks 1.6.8 py36_0 defaults pytest 4.4.0 py36_1 defaults python 3.6.6 h6e4f718_2 defaults python-dateutil 2.8.0 py36_0 defaults python-snappy 0.5.3 py36he6710b0_0 defaults pytz 2019.1 py_0 defaults pyviz 0.10.0 py_0 pyviz/label/dev pyviz_comms 0.7.2 py_0 pyviz/label/dev pywavelets 1.0.3 py36hdd07704_1 defaults pyyaml 5.1 py36h7b6447c_0 defaults pyzmq 18.0.0 py36he6710b0_0 defaults qt 5.9.5 h7e424d6_0 defaults qtconsole 4.4.3 
py36_0 defaults readline 7.0 h7b6447c_5 defaults requests 2.21.0 py36_0 defaults rise 5.3.0 py36_0 pyviz/label/dev rtree 0.8.3 py36_0 defaults scikit-image 0.14.2 py36he6710b0_0 defaults scipy 1.2.1 py36h7c811a0_0 defaults selenium 3.141.0 py36h7b6447c_0 defaults send2trash 1.5.0 py36_0 defaults setuptools 41.0.0 py36_0 defaults shapely 1.6.4 py36h7ef4460_0 defaults sip 4.19.8 py36hf484d3e_0 defaults six 1.12.0 py36_0 defaults snappy 1.1.7 hbae5bb6_3 defaults sortedcontainers 2.1.0 py36_0 defaults sqlalchemy 1.3.3 py36h7b6447c_0 defaults sqlite 3.27.2 h7b6447c_0 defaults streamz 0.5.0 py36_0 defaults tblib 1.3.2 py36_0 defaults terminado 0.8.2 py36_0 defaults testpath 0.3.1 py36_0 defaults thrift 0.11.0 py36hf484d3e_0 defaults tk 8.6.8 hbc83047_0 defaults toolz 0.9.0 py36_0 defaults tornado 6.0.2 py36h7b6447c_0 defaults traitlets 4.3.2 py36_0 defaults urllib3 1.24.1 py36_0 defaults wcwidth 0.1.7 py36_0 defaults webencodings 0.5.1 py36_1 defaults wheel 0.33.1 py36_0 defaults widgetsnbextension 3.4.2 py36_0 defaults xarray 0.11.3 py36_0 defaults xerces-c 3.2.2 h780794e_0 defaults xz 5.2.4 h14c3975_4 defaults yaml 0.1.7 had09818_2 defaults zeromq 4.3.1 he6710b0_3 defaults zict 0.1.4 py36_0 defaults zlib 1.2.11 h7b6447c_3 defaults zstd 1.3.7 h0b5b093_0 defaults ```
oleksandr-pavlyk commented 5 years ago

@ll-portes Please note that the pyviz-tutorial environment has

blas                      1.0                         mkl    defaults

whereas the environment in your original post (which I assume is the problematic one) has

blas                      1.0                    openblas    defaults

If the numpy in the problematic environment has been linked against mkl rather than openblas, that could be a problem. You can check this by activating the problematic environment and running the following (suitably adjusted for your Python version; I am using Python 3.6):

ldd $CONDA_PREFIX/lib/python3.6/site-packages/numpy/core/_multiarray_umath.cpython-36m-x86_64-linux-gnu.so

Alternatively you can inspect the content of $CONDA_PREFIX/lib/python3.6/site-packages/numpy/distutils/site.cfg.
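
As a rough sketch of the `ldd` check above, the filtering can be scripted so that only the BLAS-related lines are shown. `blas_lines` and `linked_blas` are hypothetical helper names, and the marker list is an assumption about how the relevant libraries are typically named.

```python
import subprocess


def blas_lines(ldd_output):
    """Return the lines of ldd output that mention a BLAS/LAPACK library."""
    markers = ("mkl", "blas", "lapack")  # "blas" also matches openblas
    return [
        line.strip()
        for line in ldd_output.splitlines()
        if any(marker in line.lower() for marker in markers)
    ]


def linked_blas(shared_object_path):
    """Run ldd on a shared object (Linux only) and keep the BLAS lines."""
    result = subprocess.run(
        ["ldd", shared_object_path], capture_output=True, text=True, check=True
    )
    return blas_lines(result.stdout)
```

Calling `linked_blas(path)` with the `_multiarray_umath` path from the command above would then show whether an MKL or an OpenBLAS library is resolved for that NumPy build.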

Please try to see if the crash persists in a simpler environment:

conda create -n t_i10832 -c defaults --override-channels numpy
conda activate t_i10832
python script.py
ll-portes commented 5 years ago

Thank you, @oleksandr-pavlyk. The whole story is a little bit confusing, but I'll try my best to explain. First, let me show the results using the t_i10832 env. I always test two scripts:

script_dot.py is:

import numpy as np
print(np.__version__)

A = np.matrix([[1.], [3.]]); B = np.matrix([[2., 3.]])
C=np.dot(A, B)

print(C.shape)

script_svd.py is:

import numpy as np

print(np.__version__)

nr=1000;nc=10000

X=np.random.rand(nr,nc)
print(X.shape)

u,s,vt=np.linalg.svd(X)

print(u.shape)

The results with t_i10832 env are:

These are the outputs for the t_i10832 env:

conda info
```
     active environment : t_i10832
    active env location : /home/leo/anaconda3/envs/t_i10832
            shell level : 2
       user config file : /home/leo/.condarc
 populated config files : /home/leo/.condarc
          conda version : 4.6.12
    conda-build version : 3.17.8
         python version : 3.7.3.final.0
       base environment : /home/leo/anaconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/linux-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/leo/anaconda3/pkgs
                          /home/leo/.conda/pkgs
       envs directories : /home/leo/anaconda3/envs
                          /home/leo/.conda/envs
               platform : linux-64
             user-agent : conda/4.6.12 requests/2.21.0 CPython/3.7.3 Linux/4.15.0-47-generic ubuntu/18.04.2 glibc/2.27
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
```
conda list --show-channel-urls
```
# packages in environment at /home/leo/anaconda3/envs/t_i10832:
#
# Name           Version        Build            Channel
blas             1.0            mkl              defaults
ca-certificates  2019.1.23      0                defaults
certifi          2019.3.9       py37_0           defaults
intel-openmp     2019.3         199              defaults
libedit          3.1.20181209   hc058e9b_0       defaults
libffi           3.2.1          hd88cf55_4       defaults
libgcc-ng        8.2.0          hdf63c60_1       defaults
libgfortran-ng   7.3.0          hdf63c60_0       defaults
libstdcxx-ng     8.2.0          hdf63c60_1       defaults
mkl              2019.3         199              defaults
mkl_fft          1.0.10         py37ha843d7b_0   defaults
mkl_random       1.0.2          py37hd81dba3_0   defaults
ncurses          6.1            he6710b0_1       defaults
numpy            1.16.3         py37h7e9f1db_0   defaults
numpy-base       1.16.3         py37hde5b4d6_0   defaults
openssl          1.1.1b         h7b6447c_1       defaults
pip              19.0.3         py37_0           defaults
python           3.7.3          h0371630_0       defaults
readline         7.0            h7b6447c_5       defaults
setuptools       41.0.0         py37_0           defaults
sqlite           3.28.0         h7b6447c_0       defaults
tk               8.6.8          hbc83047_0       defaults
wheel            0.33.1         py37_0           defaults
xz               5.2.4          h14c3975_4       defaults
zlib             1.2.11         h7b6447c_3       defaults
```

Now, the little bit confusing story. Originally, I found this problem with:

  1. Anaconda 5.3.0 (Sept 28, 2018), installed on a new machine (this 18-core one).
  2. A script with numpy.linalg.svd

Then I tried combinations of:

  a. updating/downgrading conda, anaconda, mkl.
  b. nomkl versions of numpy (scipy etc).
  c. Remark: even Anaconda 2019.03 had this problem.

The "solution" for me was to use the same approach as in this (link), in which the example code used numpy.dot. (So, just for consistency, I started to report the problem with np.dot instead of np.linalg.svd, since both crashed my computer. But I always test the solutions with both scripts.) Specifically, this partial solution was:

  i. install nomkl versions of numpy (scipy etc).
  ii. use os.environ['OPENBLAS_CORETYPE']='Haswell'

Now, with the t_i10832 env, this was the first time that one script worked (dot) and the other didn't (svd).

So, since the beginning, the problem was with NumPy + blas + mkl (the original Anaconda installation), and the problem persisted with openblas (but there, at least, the os.environ setting allowed me to use the machine). I found on my PC the following information from my first, unsuccessful attempts to solve this by updating things (these are outputs of conda list, but I saved only the lines regarding blas and mkl):

blas                      1.0                         mkl  
mkl                       2019.1                      144  
mkl-service               1.1.2            py37he904b0f_5  
mkl_fft                   1.0.6            py37hd81dba3_0  
mkl_random                1.0.2            py37hd81dba3_0  
blas                      1.0                         mkl  
mkl                       2019.3                      199  
mkl-service               1.1.2            py37he904b0f_5  
mkl_fft                   1.0.10           py37ha843d7b_0  
mkl_random                1.0.2            py37hd81dba3_0
ll-portes commented 5 years ago

Sorry, I clicked the "close and comment" instead of "Comment" by mistake.

oleksandr-pavlyk commented 5 years ago

@ll-portes Thank you for trying this. So we established that the environment is consistent, but a call to SVD is causing trouble.

Unfortunately, I have not yet been able to get ahold of a machine with the processor you are using, so I have to ask you to try different things in the hope of triaging the problem further.

So, first question: does the script_svd.py script work for smaller matrix sizes?

If you also install scipy from the defaults channel into your environment with conda install -n t_i10832 -c defaults --override-channels scipy, and try a different LAPACK driver for the SVD, specified via scipy.linalg.svd:

u3, s3, vt3 = scipy.linalg.svd(X, lapack_driver='gesvd')

does the problem go away?

In your experiments, please fix the random seed to ensure reproducibility on our side:

np.random.seed(42)
X = np.random.rand(nr, nc)
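
The two questions above (does a smaller matrix work, and does a different LAPACK driver work) could be combined into one ladder of tests. This is a sketch only: `size_ladder` and `run_ladder` are hypothetical names, the doubling schedule is an arbitrary choice, and NumPy and SciPy are imported inside `run_ladder` so the schedule itself can be inspected without them. Each step is printed and flushed before the SVD call, so the last line on screen shows which case was running if the machine freezes.

```python
def size_ladder(nr, nc, steps):
    """Return (rows, cols) pairs that double at each step up to (nr, nc)."""
    sizes = []
    for k in range(steps - 1, -1, -1):
        sizes.append((max(1, nr >> k), max(1, nc >> k)))
    return sizes


def run_ladder(nr=1000, nc=10000, steps=4, seed=42):
    """Run SVD with both LAPACK drivers on increasingly large matrices."""
    import numpy as np
    import scipy.linalg

    for rows, cols in size_ladder(nr, nc, steps):
        np.random.seed(seed)  # fixed seed, as requested above
        X = np.random.rand(rows, cols)
        # 'gesdd' is SciPy's default driver; 'gesvd' is the alternative.
        for driver in ("gesdd", "gesvd"):
            print(f"shape=({rows}, {cols}) driver={driver} ...", flush=True)
            u, s, vt = scipy.linalg.svd(X, lapack_driver=driver)
            print("  ok, singular values:", s.shape, flush=True)
```

With the default arguments, `run_ladder()` ends at the 1000 x 10000 case used in script_svd.py.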
oleksandr-pavlyk commented 5 years ago

@ll-portes I was finally able to secure access to the hardware, but I am unable to reproduce any problems:

(numpy) C:\Users\user>ipython
Python 3.6.8 |Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numexpr, numpy as np, mkl_random, mkl

In [2]: from numexpr.cpuinfo import cpu

In [3]: len(cpu.info)
Out[3]: 36

In [4]: cpu.info[0]['ProcessorNameString']
Out[4]: 'Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz'

In [5]: mkl.get_version_string()
Out[5]: 'Intel(R) Math Kernel Library Version 2019.0.3 Product Build 20190125 for Intel(R) 64 architecture applications'

In [6]: np.__version__
Out[6]: '1.16.3'

In [7]: nr, nc = 1000, 10**4

In [8]: XX = mkl_random.randn(nr, nc)

In [9]: U, S, Vt = np.linalg.svd(XX)

In [10]: (U.shape, S.shape, Vt.shape)
Out[10]: ((1000, 1000), (1000,), (10000, 10000))

In [11]: X = mkl_random.randn(nc, nr)

In [12]: U, S, Vt = np.linalg.svd(X)

In [13]: (U.shape, S.shape, Vt.shape)
Out[13]: ((10000, 10000), (1000,), (1000, 1000))

In [14]: quit
conda info

```
     active environment : numpy
    active env location : C:\Users\user\Miniconda3\envs\numpy
            shell level : 2
       user config file : C:\Users\user\.condarc
 populated config files :
          conda version : 4.6.14
    conda-build version : not installed
         python version : 3.7.3.final.0
       base environment : C:\Users\user\Miniconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/win-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/win-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/win-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          https://repo.anaconda.com/pkgs/msys2/win-64
                          https://repo.anaconda.com/pkgs/msys2/noarch
          package cache : C:\Users\user\Miniconda3\pkgs
                          C:\Users\user\.conda\pkgs
                          C:\Users\user\AppData\Local\conda\conda\pkgs
       envs directories : C:\Users\user\Miniconda3\envs
                          C:\Users\user\.conda\envs
                          C:\Users\user\AppData\Local\conda\conda\envs
               platform : win-64
             user-agent : conda/4.6.14 requests/2.21.0 CPython/3.7.3 Windows/10 Windows/10.0.17134
          administrator : True
             netrc file : None
           offline mode : False
```

conda list --explicit

```text
# This file may be used to create an environment using:
# $ conda create --name --file
# platform: win-64
@EXPLICIT
https://repo.anaconda.com/pkgs/main/win-64/blas-1.0-mkl.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/icc_rt-2019.0.0-h0cc432a_1.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/intel-openmp-2019.3-203.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/vs2015_runtime-14.15.26706-h3a45250_4.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/mkl-2019.3-203.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/vc-14.1-h0510ff6_4.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/sqlite-3.28.0-he774522_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/python-3.6.8-h9f7ef89_7.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/backcall-0.1.0-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/certifi-2019.3.9-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/colorama-0.4.1-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/decorator-4.4.0-py36_1.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/ipython_genutils-0.2.0-py36h3c5d0ee_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/mkl-service-1.1.2-py36hb782905_5.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/numpy-base-1.16.3-py36hc3f5095_0.tar.bz2
https://repo.anaconda.com/pkgs/main/noarch/parso-0.4.0-py_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/pickleshare-0.7.5-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/psutil-5.6.2-py36he774522_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/six-1.12.0-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/wcwidth-0.1.7-py36h3d5aa90_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/wincertstore-0.2-py36h7fe50ca_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/jedi-0.13.3-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/mkl_random-1.0.2-py36h343c172_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/setuptools-41.0.1-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/traitlets-4.3.2-py36h096827d_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/pygments-2.3.1-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/wheel-0.33.1-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/pip-19.1-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/prompt_toolkit-2.0.9-py36_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/ipython-7.5.0-py36h39e3cac_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/mkl_fft-1.0.12-py36h14836fe_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/numpy-1.16.3-py36h19fb1c0_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/numexpr-2.6.9-py36hdce8814_0.tar.bz2
https://repo.anaconda.com/pkgs/main/win-64/scipy-1.2.1-py36h29ff71c_0.tar.bz2
```

Hence it does not seem to be a problem in the build of NumPy, or in the Intel(R) MKL itself.

The machine I used ran Windows 10 and had 1 socket and 18 cores, with hyperthreading on (2 threads per core).

My realm of expertise ends here, but if you happen to have overclocked the processor, please try running the workload with the processor at its normal (non-overclocked) settings.
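
On the reporter's Linux machine, the same kind of hardware summary (logical CPU count, model name, AVX-512 support) could be gathered from `/proc/cpuinfo` for comparison. A minimal sketch, assuming the usual x86 Linux `/proc/cpuinfo` layout; `summarize_cpuinfo` and `read_cpuinfo` are hypothetical helper names.

```python
def summarize_cpuinfo(text):
    """Summarize /proc/cpuinfo text: logical CPUs, model name, AVX-512 flags."""
    logical_cpus = 0
    model = None
    avx512 = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical_cpus += 1
        elif key == "model name" and model is None:
            model = value
        elif key == "flags":
            avx512.update(f for f in value.split() if f.startswith("avx512"))
    return {"logical_cpus": logical_cpus, "model": model, "avx512": sorted(avx512)}


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read and summarize the cpuinfo file (Linux only)."""
    with open(path) as fh:
        return summarize_cpuinfo(fh.read())
```

With hyperthreading enabled, an 18-core processor would be expected to report 36 logical CPUs here, matching the `len(cpu.info)` value in the session above.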