LSSTDESC / NaMaster

A unified pseudo-Cl framework
BSD 3-Clause "New" or "Revised" License

Kernel Dying in Jupyter Notebook #65

Closed syasini closed 2 years ago

syasini commented 5 years ago

Hi,

I installed a fresh copy of pymaster using conda on python 3.7, and when I run a simple line like

nmt.synfast_flat(Nx, Ny, Lx, Ly, [Cl_TT],[0])

in Jupyter Notebook, I get the Kernel Restarting error:

The kernel appears to have died. It will restart automatically.

When I run the same code in IPython from the terminal, it runs with no problem.

Any ideas how I can fix this?

Thanks.

DanielLenz commented 5 years ago

What's the jupyter error message, i.e. what is printed in the terminal that runs your jupyter server?

syasini commented 5 years ago

Hey Daniel,

This is what I get in the terminal:

Adapting from protocol version 5.1 (kernel 7b812c0f-23d8-4dd2-9542-d8014eb96ca1) to 5.3 (client).
OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

Is there a quick fix for this?
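
The OMP error names two runtimes, libomp and libiomp5. A quick way to see whether both are installed side by side is to list them in the environment's lib directory. The sketch below is only a rough check: the helper name find_openmp_runtimes is made up for illustration, it assumes a conda-style layout where libraries live under <prefix>/lib, and finding both files does not prove that both get loaded into the same process.

```python
import glob
import os
import sys


def find_openmp_runtimes(prefix=sys.prefix):
    """Return OpenMP runtime files found under <prefix>/lib.

    Hypothetical helper: it only looks for the two names from the
    error message (libomp and libiomp5) and ignores other runtimes.
    """
    lib_dir = os.path.join(prefix, "lib")
    found = []
    for stem in ("libomp", "libiomp5"):
        # Matches e.g. libomp.dylib on macOS or libomp.so on Linux.
        found.extend(glob.glob(os.path.join(lib_dir, stem + ".*")))
    return sorted(found)


for path in find_openmp_runtimes():
    print(path)
```

If both names show up, the packages that pulled them in are the first place to look.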

DanielLenz commented 5 years ago

I haven't experienced this issue myself, but quickly googling the error (e.g. https://github.com/dmlc/xgboost/issues/1715) suggests a couple of different approaches:

Good luck!

syasini commented 5 years ago

Thanks, Daniel. I did install pymaster using conda, but I'll definitely try your other suggestions and post the updates here soon.

Cheers

syasini commented 5 years ago

Just a quick update on this:

@DanielLenz: It seems like matplotlib was indeed the issue. The kernel only dies when I import both nmt and matplotlib together. I didn't try the different backends, but os.environ["KMP_DUPLICATE_LIB_OK"] = "True" resolves the issue.
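
For anyone landing here later, a minimal sketch of that workaround as a first notebook cell. It assumes the variable has to be set before the libraries that bring in OpenMP are imported. The value "TRUE" is the spelling used in the OMP hint; the thread used "True". The OMP message itself calls this unsafe and unsupported, so treat it as a stopgap rather than a fix.

```python
import os

# Set this before importing anything that links an OpenMP runtime.
# The OMP hint describes this as an unsafe, unsupported workaround.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

try:
    import matplotlib  # noqa: F401
    import pymaster as nmt  # noqa: F401
except ImportError as exc:
    # Either package may be absent from the current environment.
    print(f"import skipped: {exc}")
```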

Tagging @Lluism and @izzyswafford because we ran into a similar issue a while ago.

I'm still looking forward to comments from other people as well.

fjaviersanchez commented 5 years ago

I am also running into similar issues and if you import namaster first, the problem doesn't show (at least in my case).

syasini commented 5 years ago

@fjaviersanchez: Thanks for the update and sorry for my late reply. I just tested this and the problem still persists.

The kernel dies regardless of the import order for namaster and matplotlib.

fjaviersanchez commented 5 years ago

@syasini thanks for trying this workaround and reporting the results here.

olivierdore commented 4 years ago

Hello all, I am facing a somewhat similar issue and can't crack it, so I was wondering if someone out there has any tips. I installed NaMaster from source and the C routines check out just fine (make check), but running the Python tests crashes my Python (very unusual) with a segfault. Below is the terminal output. It suggests something is off in my setup, but it usually works well, so I am confused. Any suggestion is welcome, and thanks for making this package public.

===
bash-3.2$ python -m unittest discover -v
/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/healpy/fitsfunc.py:351: UserWarning: If you are not specifying the input dtype and using the default np.float64 dtype of read_map(), please consider that it will change in a future version to None as to keep the same dtype of the input file: please explicitly set the dtype if it is important to you.
  warnings.warn(
test_bins_flat_alloc (test.test_nmt_bins.TestBinsFsk) ... ok
test_bins_flat_binning (test.test_nmt_bins.TestBinsFsk) ... ok
test_bins_flat_errors (test.test_nmt_bins.TestBinsFsk) ... ok
test_bins_binning (test.test_nmt_bins.TestBinsSph) ... ok
test_bins_binning_f_ell (test.test_nmt_bins.TestBinsSph) ... ok
test_bins_constant (test.test_nmt_bins.TestBinsSph) ... ok
test_bins_edges (test.test_nmt_bins.TestBinsSph) ... ok
test_bins_errors (test.test_nmt_bins.TestBinsSph) ... ok
test_bins_variable (test.test_nmt_bins.TestBinsSph) ... ok
test_workspace_covar_flat_benchmark (test.test_nmt_covar.TestCovarFsk) ... ok
test_workspace_covar_flat_errors (test.test_nmt_covar.TestCovarFsk) ... ok
test_workspace_covar_benchmark (test.test_nmt_covar.TestCovarSph) ... /opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/healpy/fitsfunc.py:351: UserWarning: If you are not specifying the input dtype and using the default np.float64 dtype of read_map(), please consider that it will change in a future version to None as to keep the same dtype of the input file: please explicitly set the dtype if it is important to you.
  warnings.warn(
/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/healpy/fitsfunc.py:351: UserWarning: If you are not specifying the input dtype and using the default np.float64 dtype of read_map(), please consider that it will change in a future version to None as to keep the same dtype of the input file: please explicitly set the dtype if it is important to you.
  warnings.warn(
Segmentation fault: 11
bash-3.2$
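
The log stops at "Segmentation fault: 11" without showing which Python call was active. One possible way to get more detail is Python's built-in faulthandler, which prints the Python-level traceback when the process receives a fatal signal. This is a sketch, not something tried in this thread: run_with_faulthandler is a made-up helper name, and the traceback will show only Python frames, not the C frames inside the wrapped library.

```python
import subprocess
import sys


def run_with_faulthandler(py_args):
    """Run `python -X faulthandler <py_args>` in a child process.

    Hypothetical helper. On POSIX a negative returncode means the
    child was killed by that signal number (11 is SIGSEGV).
    """
    cmd = [sys.executable, "-X", "faulthandler", *py_args]
    return subprocess.run(cmd, capture_output=True, text=True)


# Example call for the test run from the log, left commented out:
# result = run_with_faulthandler(["-m", "unittest", "discover", "-v"])
# print(result.stderr)
# print("return code:", result.returncode)
```

Running the tests in a child process also keeps the crash from taking down the shell session or notebook that launched them.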

fjaviersanchez commented 4 years ago

Hi @olivierdore, this looks like conflicting healpix-related libraries (maybe NaMaster is calling one healpix/libsharp version and healpy a different one?), because it essentially breaks the first time it calls healpy. I also thought it could be a python3.8 problem, but I just installed from source using py38 and it seems to work for me. Have you tried pip install pymaster (this is likely still not going to fix it, but who knows?) or, if you use conda, conda install pymaster -c conda-forge? @damonge, any ideas?

syasini commented 4 years ago

This does not exactly answer @olivierdore's question, but a good workaround solution for the problem would be to use the Docker image from here: https://github.com/simonsobs/PSpipe I hope this helps.

olivierdore commented 4 years ago

Thank you @fjaviersanchez and @syasini. I will fight a bit more before installing the Docker image, which I am sure would work. @fjaviersanchez, I have been careful to link directly to my healpix/healpy installation (through MacPorts) and not reinstall it. For some reason, however, I do need to reinstall libsharp, as using my default installation (through MacPorts) leads to a segfault in the C "make check" (TEST 23/66 nmt:he_sht_car /bin/sh: line 1: 79903 Segmentation fault: 11 ${dir}$tst), whereas linking to a fresh install leads to the segfault in Python... Very odd. But maybe libsharp is indeed the culprit...

fjaviersanchez commented 4 years ago

I think that NaMaster is still reinstalling healpix and libsharp. A hacky option is to copy or create a symlink of libchealpix.a and libsharp.a in the ./_deps/lib subdirectory, where ./ is NaMaster's top directory (where setup.py lives) and also copy (or symlink) the files from the include subdirectories from libsharp and healpix and put them in _deps/include. That way the reinstall part is skipped. Another hacky option is to comment out lines 76 and 77 in setup.py and remove "./_deps/lib/libchealpix.a" from line 91 and see if it still compiles.
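
The first option above (symlinking the libraries and headers into _deps) could be scripted roughly as follows. This is a sketch under assumptions: link_into_deps is a made-up helper, the source paths in the commented example are hypothetical and depend on where healpix and libsharp actually live on your machine, and whether setup.py then skips the reinstall is exactly what is being tested.

```python
import os


def link_into_deps(namaster_root, static_libs, include_dirs):
    """Symlink prebuilt libraries and headers into <root>/_deps.

    Hypothetical helper: static_libs are files such as libchealpix.a
    and libsharp.a; include_dirs are directories whose files are
    linked into _deps/include. Existing entries are left untouched.
    """
    lib_dst = os.path.join(namaster_root, "_deps", "lib")
    inc_dst = os.path.join(namaster_root, "_deps", "include")
    os.makedirs(lib_dst, exist_ok=True)
    os.makedirs(inc_dst, exist_ok=True)

    # Pair every source file with the directory it should appear in.
    pairs = [(os.path.abspath(p), lib_dst) for p in static_libs]
    for inc in include_dirs:
        for name in sorted(os.listdir(inc)):
            pairs.append((os.path.abspath(os.path.join(inc, name)), inc_dst))

    linked = []
    for src, dst_dir in pairs:
        dst = os.path.join(dst_dir, os.path.basename(src))
        if not os.path.lexists(dst):
            os.symlink(src, dst)
            linked.append(dst)
    return linked


# Example with hypothetical paths, left commented out:
# link_into_deps(".", ["/opt/local/lib/libchealpix.a",
#                      "/opt/local/lib/libsharp.a"],
#                ["/opt/local/include/libsharp"])
```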

olivierdore commented 4 years ago

Thank you Javier. I tried all the above already but it did not solve my problem. Either I break the C install or I break the python wrappers!

damonge commented 4 years ago

We've never tested the code on python3.8. I doubt that's the issue, but I'll give it a try.

Can I check if you have:

damonge commented 4 years ago

Travis passes on 3.8, so that's not the problem

olivierdore commented 4 years ago

Hi @damonge. Thanks for the answer. I have libsharp installed in my library as it comes with healpy (all installed through MacPorts), and I cannot remove it without breaking healpy. Pip (also from MacPorts) does not work for me, as it does not find the dependencies (fftw, gsl...) even though they are there (this is true whether I use CC=gcc or clang). That's why I tried to install from source. I now suspect the issue is that sharp was installed with clang (that's the only option for MacPorts) and I was trying to install NaMaster with gcc. It does not affect the C build/check though... so I am now trying to reinstall NaMaster with clang, but I still get a segfault when testing the C library (it compiles fine).

olivierdore commented 4 years ago

Hey all, just wanted to give you an update. After reinstalling from scratch, linking to the dependent libraries by hand (i.e. explicit symlinks) to make sure I control all the dependencies, and deciding to use only clang (although it should not matter), I can properly run the C routines with "make check", but I still get a segfault 11 using the Python wrapper in test_workspace_covar_benchmark (test.test_nmt_covar.TestCovarSph). I am running out of ideas, so I will give up for now and move to another machine. Thanks for the suggestions. In the process, I discovered that I could not link properly with the MacPorts-installed sharp library (installed for healpix/healpy) and had to install a new version to get it to work properly. Maybe that is the issue after all, but why it would affect the Python call of this library and not the C call is beyond me. Thanks.