GalacticDynamics-Oxford / Agama

Action-based galaxy modeling framework

Kernel dies when using AGAMA after importing sklearn #39

Open brugalada opened 11 months ago

brugalada commented 11 months ago

I know this is probably totally anecdotal and not useful for anyone, but I was so baffled by this that I thought it was worth reporting.

Apparently, the kernel of my Jupyter notebook (jupyter-lab) dies whenever I try to run AGAMA functions (with Torus Mapper it works fine; I did not do an exhaustive test of all the functions) if I have imported sklearn first.

I have never seen this kind of interaction between AGAMA and any other package before. I just happened to be copy-pasting the import list from another notebook that had the sklearn import.

I attach a snapshot of the bug, and here is the version list of my conda environment (I installed AGAMA about a year ago and have not updated it):
agama 1.0 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
anyio 3.5.0 py310hca03da5_0
appnope 0.1.2 py310hca03da5_1001
argon2-cffi 21.3.0 pyhd3eb1b0_0
argon2-cffi-bindings 21.2.0 py310h1a28f6b_0
astropy 5.1 py310h96f19d2_0
astropy-healpix 0.7 py310hf1a086a_2 conda-forge
asttokens 2.0.5 pyhd3eb1b0_0
attrs 21.4.0 pyhd3eb1b0_0
babel 2.9.1 pyhd3eb1b0_0
backcall 0.2.0 pyhd3eb1b0_0
beautifulsoup4 4.11.1 py310hca03da5_0
blas 1.0 openblas
bleach 4.1.0 pyhd3eb1b0_0
boost 1.74.0 py310hd0bb7a8_5 conda-forge
boost-cpp 1.74.0 h32e41df_4 conda-forge
bottleneck 1.3.5 py310h96f19d2_0
brotli 1.0.9 h1a28f6b_7
brotli-bin 1.0.9 h1a28f6b_7
brotlipy 0.7.0 py310h1a28f6b_1002
bzip2 1.0.8 h620ffc9_4
c-ares 1.19.1 hb547adb_0 conda-forge
ca-certificates 2023.7.22 hf0a4a13_0 conda-forge
cairo 1.16.0 h302bd0f_3
certifi 2023.7.22 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py310h22df2f2_0
cfitsio 4.2.0 h2f961c4_0 conda-forge
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.1.7 pypi_0 pypi
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
comm 0.2.0 pypi_0 pypi
cryptography 38.0.1 py310h834c97f_0
cycler 0.11.0 pyhd3eb1b0_0
cython 3.0.0 py310h80987f9_0
debugpy 1.5.1 py310hc377ac9_0
decorator 5.1.1 pyhd3eb1b0_0
defusedxml 0.7.1 pyhd3eb1b0_0
dustmaps 1.0.12 py310hbe9552e_1 conda-forge
eigen 3.4.0 hc021e02_0 conda-forge
entrypoints 0.4 py310hca03da5_0
exceptiongroup 1.1.1 pyhd8ed1ab_0 conda-forge
executing 0.8.3 pyhd3eb1b0_0
expat 2.5.0 hb7217d7_1 conda-forge
extension-helpers 1.0.0 pyhd8ed1ab_0 conda-forge
fftw 3.3.9 h1a28f6b_1
filelock 3.13.1 pypi_0 pypi
fontconfig 2.14.2 h82840c6_0 conda-forge
fonttools 4.25.0 pyhd3eb1b0_0
freetype 2.12.1 h1192e45_0
frozenlist 1.4.0 pypi_0 pypi
gala 1.6.2.dev40+g953cae09 pypi_0 pypi
geos 3.11.2 hb7217d7_0 conda-forge
gettext 0.21.0 h826f4ad_0
giflib 5.2.1 h1a28f6b_0
glib 2.69.1 h98b2900_1
gmp 6.2.1 hc377ac9_3 anaconda
gmpy2 2.1.2 py310h8c48613_0 anaconda
graphite2 1.3.14 hc377ac9_1
gst-plugins-base 1.14.1 hf0a386a_0
gstreamer 1.14.1 he09cfb7_0
h5py 3.9.0 py310haafd478_0
harfbuzz 4.3.0 he9eebac_1
hdbscan 0.8.33 py310ha11ecec_1 conda-forge
hdf5 1.12.1 nompi_hd9dbc9e_104 conda-forge
healpy 1.16.2 py310h25f8736_0 conda-forge
icu 68.1 hc377ac9_0
idna 3.4 py310hca03da5_0
iniconfig 2.0.0 pyhd8ed1ab_0 conda-forge
ipykernel 6.15.2 py310hca03da5_0
ipython 8.4.0 py310hca03da5_0
ipython_genutils 0.2.0 pyhd3eb1b0_1
ipywidgets 8.1.1 pypi_0 pypi
jedi 0.18.1 py310hca03da5_1
jinja2 3.1.2 py310hca03da5_0
joblib 1.1.0 pyhd3eb1b0_0 anaconda
jpeg 9e h1a28f6b_0
json5 0.9.6 pyhd3eb1b0_0
jsonschema 4.16.0 py310hca03da5_0
jupyter 1.0.0 py310hca03da5_8
jupyter_client 7.3.5 py310hca03da5_0
jupyter_console 6.4.3 pyhd3eb1b0_0
jupyter_core 4.11.1 py310hca03da5_0
jupyter_server 1.18.1 py310hca03da5_0
jupyterlab 3.5.0 pyhd8ed1ab_0 conda-forge
jupyterlab-widgets 3.0.9 pypi_0 pypi
jupyterlab_pygments 0.1.2 py_0
jupyterlab_server 2.15.2 py310hca03da5_0
kiwisolver 1.4.2 py310hc377ac9_0
krb5 1.20.1 h69eda48_0 conda-forge
lcms2 2.12 hba8e193_0
lerc 3.0 hc377ac9_0
libbrotlicommon 1.0.9 h1a28f6b_7
libbrotlidec 1.0.9 h1a28f6b_7
libbrotlienc 1.0.9 h1a28f6b_7
libclang 12.0.0 default_hc321e17_4
libcurl 8.1.2 h912dcd9_0 conda-forge
libcxx 14.0.6 h848a8c0_0
libdeflate 1.8 h1a28f6b_5
libedit 3.1.20210910 h1a28f6b_0
libev 4.33 h642e427_1 conda-forge
libexpat 2.5.0 hb7217d7_1 conda-forge
libffi 3.4.2 hc377ac9_4
libgfortran 5.0.0 11_3_0_hca03da5_28
libgfortran5 11.3.0 h009349e_28
libiconv 1.16 h1a28f6b_2
libllvm12 12.0.0 h12f7ac0_4
libnghttp2 1.52.0 hae82a92_0 conda-forge
libopenblas 0.3.21 h269037a_0
libpng 1.6.37 hb8d0fd4_0
libpq 12.15 h02f6b3c_1
libsodium 1.0.18 h1a28f6b_0
libsqlite 3.42.0 hb31c410_0 conda-forge
libssh2 1.11.0 h7a5bd25_0 conda-forge
libtiff 4.4.0 had003b8_1
libwebp 1.2.4 h68602c7_0
libwebp-base 1.2.4 h1a28f6b_0
libxml2 2.9.14 h8c5e841_0
libxslt 1.1.35 h9833966_0
libzlib 1.2.13 h53f4e23_5 conda-forge
llvm-openmp 14.0.6 hc6e5704_0
lz4-c 1.9.3 hc377ac9_0
markupsafe 2.1.1 py310h1a28f6b_0
matplotlib 3.5.3 py310hca03da5_0
matplotlib-base 3.5.3 py310hc377ac9_0
matplotlib-inline 0.1.6 py310hca03da5_0
matplotlib-venn 0.11.9 pypi_0 pypi
mistune 0.8.4 py310h1a28f6b_1000
mpc 1.1.0 h8c48613_1 anaconda
mpfr 4.0.2 h695f6f0_1 anaconda
mpmath 1.2.1 pypi_0 pypi
msgpack 1.0.7 pypi_0 pypi
munkres 1.1.4 py_0
nbclassic 0.3.5 pyhd3eb1b0_0
nbclient 0.5.13 py310hca03da5_0
nbconvert 6.4.4 py310hca03da5_0
nbformat 5.5.0 py310hca03da5_0
ncurses 6.4 h7ea286d_0 conda-forge
nest-asyncio 1.5.5 py310hca03da5_0
notebook 6.4.12 py310hca03da5_0
nspr 4.33 hc377ac9_0
nss 3.74 h142855e_0
numexpr 2.8.3 py310h5a06f4b_0
numpy 1.23.3 py310h220015d_1
numpy-base 1.23.3 py310h742c864_1
opencv 4.6.0 py310he2359d5_2
openssl 3.1.1 h53f4e23_0 conda-forge
packaging 21.3 pyhd3eb1b0_0
pandas 1.4.4 py310hc377ac9_0
pandocfilters 1.5.0 pyhd3eb1b0_0
parso 0.8.3 pyhd3eb1b0_0
pcre 8.45 hc377ac9_0
pexpect 4.8.0 pyhd3eb1b0_3
pickleshare 0.7.5 pyhd3eb1b0_1003
pillow 9.2.0 py310h4d1bdd5_1
pip 22.2.2 py310hca03da5_0
pixman 0.40.0 h27ca646_0 conda-forge
pluggy 1.0.0 pyhd8ed1ab_5 conda-forge
ply 3.11 py310hca03da5_0
progressbar2 4.2.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.14.1 py310hca03da5_0
prompt-toolkit 3.0.20 pyhd3eb1b0_0
prompt_toolkit 3.0.20 hd3eb1b0_0
protobuf 4.25.1 pypi_0 pypi
psutil 5.9.0 py310h1a28f6b_0
psycopg2 2.8.6 py310hf27765b_1 anaconda
ptyprocess 0.7.0 pyhd3eb1b0_2
pure_eval 0.2.2 pyhd3eb1b0_0
pycparser 2.21 pyhd3eb1b0_0
pyerfa 2.0.0 py310h1a28f6b_0
pygments 2.11.2 pyhd3eb1b0_0
pyopenssl 22.0.0 pyhd3eb1b0_0
pyparsing 3.0.9 py310hca03da5_0
pyqt 5.15.7 py310hc377ac9_0
pyqt5-sip 12.11.0 pypi_0 pypi
pyrsistent 0.18.0 py310h1a28f6b_0
pysocks 1.7.1 py310hca03da5_0
pytest 7.4.0 pyhd8ed1ab_0 conda-forge
pytest-runner 6.0.0 pyhd8ed1ab_0 conda-forge
python 3.10.12 h01493a6_0_cpython conda-forge
python-dateutil 2.8.2 pyhd3eb1b0_0
python-fastjsonschema 2.16.2 py310hca03da5_0
python-utils 3.8.1 pyhd8ed1ab_0 conda-forge
python_abi 3.10 3_cp310 conda-forge
pytz 2022.1 py310hca03da5_0
pywavelets 1.3.0 py310h1a28f6b_0
pyyaml 6.0 pypi_0 pypi
pyzmq 23.2.0 py310hc377ac9_0
qt-main 5.15.2 ha2d02b5_7
qt-webengine 5.15.9 h2903aaf_4
qtconsole 5.3.2 py310hca03da5_0
qtpy 2.2.0 py310hca03da5_0
qtwebkit 5.212 h0f11f3c_4
ray 2.8.0 pypi_0 pypi
readline 8.2 h1a28f6b_0
requests 2.28.1 py310hca03da5_0
scikit-learn 1.0.2 py310h59830a0_1 anaconda
scipy 1.9.3 py310h20cbe94_0
send2trash 1.8.0 pyhd3eb1b0_1
setuptools 65.5.0 py310hca03da5_0
shapely 2.0.1 py310h605c0e7_1 conda-forge
sip 6.6.2 py310hc377ac9_0
six 1.16.0 pyhd3eb1b0_1
sniffio 1.2.0 py310hca03da5_1
soupsieve 2.3.2.post1 py310hca03da5_0
sqlite 3.39.3 h1058600_0
stack_data 0.2.0 pyhd3eb1b0_0
sympy 1.10.1 py310hca03da5_0 anaconda
terminado 0.13.1 py310hca03da5_0
testpath 0.6.0 py310hca03da5_0
threadpoolctl 2.2.0 pyh0d69192_0 anaconda
tk 8.6.12 hb8d0fd4_0
toml 0.10.2 pyhd3eb1b0_0
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.2 py310h1a28f6b_0
traitlets 5.1.1 pyhd3eb1b0_0
typing-extensions 4.3.0 py310hca03da5_0
typing_extensions 4.3.0 py310hca03da5_0
tzdata 2022f h04d1e81_0
urllib3 1.26.12 py310hca03da5_0
wcwidth 0.2.5 pyhd3eb1b0_0
webencodings 0.5.1 py310hca03da5_1
websocket-client 0.58.0 py310hca03da5_4
wheel 0.37.1 pyhd3eb1b0_0
widgetsnbextension 4.0.9 pypi_0 pypi
xz 5.2.6 h1a28f6b_0
yaml 0.2.5 h1a28f6b_0
zeromq 4.3.4 hc377ac9_0
zlib 1.2.13 h53f4e23_5 conda-forge
zstd 1.5.2 h8574219_0

Screenshot 2023-12-14 at 11 05 15

eugvas commented 11 months ago

This might be related to another issue that was reported recently (by email, not as a ticket), that time involving PyNBody. The error message reads:

OMP: Error #15: Initializing libomp.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
Fatal Python error: Aborted

followed by a segfault.

First, can you run the same import commands (agama and sklearn) in a script rather than a notebook, to see if there are any additional messages printed out?

If the issue is indeed related to an OpenMP version clash, I will need to do something about it. Perhaps even adding these lines at the beginning of the script might help:
import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'

There are other possibilities, but let's first find out if this is related to OpenMP at all.
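As a rough sketch of the test suggested above, each import order can be run in a fresh interpreter so that a segfault kills only the child process and its stderr (including any `OMP:` messages) is captured. The helper names `run_snippet`, `describe_exit`, and `compare_import_orders` are made up for this illustration, and the AGAMA call that triggers the crash is left for the caller to supply as `body`.

```python
import os
import signal
import subprocess
import sys


def run_snippet(code, extra_env=None, timeout=120):
    """Run `code` in a fresh Python process; return (returncode, stdout, stderr)."""
    env = dict(os.environ)
    if extra_env:
        env.update(extra_env)  # e.g. {"KMP_DUPLICATE_LIB_OK": "TRUE"}
    proc = subprocess.run(
        # -X faulthandler makes CPython dump a traceback on fatal signals
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def describe_exit(returncode):
    """Turn a return code into a short human-readable description."""
    if returncode < 0:  # on POSIX, a negative code means "killed by signal"
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with code {returncode}"


def compare_import_orders(body="", extra_env=None):
    """Run `body` after each import order and collect (exit description, stderr)."""
    reports = {}
    for imports in ("import sklearn\nimport agama\n",
                    "import agama\nimport sklearn\n"):
        returncode, _, stderr = run_snippet(imports + body, extra_env)
        reports[imports] = (describe_exit(returncode), stderr)
    return reports
```

Calling `compare_import_orders(body=...)` once without and once with `extra_env={"KMP_DUPLICATE_LIB_OK": "TRUE"}` would show whether the environment variable changes the outcome; setting it in the child's environment ensures it is present before any library is loaded.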

brugalada commented 8 months ago

Sorry for taking so long to reply.

Here is the first test, doing the same from the console.

image

Apparently, it gets a segmentation fault and kicks me out of python. Using os.environ['KMP_DUPLICATE_LIB_OK']='True' yields exactly the same result.

For some reason, when I enter python again and try to scroll up to the previously used calls, I can see all of them, like import agama or import sklearn, and even older ones, but none of the calls to agama except setUnits.

I tried again with another potential and another function call and apparently the problem is the integration of orbits:

image

After trying a couple other functions, those related to the "potential" object seem to be safe. But calling, for instance, agama.ActionFinder makes it crash.

Also, if I import AGAMA first and then sklearn... no problem at all :/

Does this help?

eugvas commented 8 months ago

thanks for the investigation! I suppose that only the operations that invoke OpenMP cause segfaults. Here is a non-exhaustive list of such operations:

At the moment, I cannot reproduce this situation on my side, but will try to get back to it in the near future. At least, it seems that you found a workaround by swapping the order of imports :)

yy1021805450 commented 1 month ago

I encountered the same error after importing dustmaps' SFDQuery when computing a potential's projected density over a 50*50 grid. However, removing dustmaps or using a much smaller grid of two or three points runs fine. So I guess maybe it's a memory problem? The system is macOS on an M3 chip, Python 3.12. Really weird.

eugvas commented 1 month ago

yep, it must be the same conflict of OpenMP libraries: it seems that in some cases (but not always), two different versions of libomp.so end up being loaded by the same script, and BANG!!! I should really sort it out, but to help me reproduce the problem, can you tell me your configuration? specifically, are you using Python from Anaconda, and installing other packages (numpy, dustmaps, etc.) using pip? what are the various flags in your Makefile.local?
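One way to check whether two OpenMP runtimes really end up in the same process is the third-party `threadpoolctl` package (it appears in the environment listing earlier in this thread), whose `threadpool_info()` reports the thread-pool libraries it can find. The sketch below is only illustrative: `find_duplicate_runtimes` and `loaded_openmp_runtimes` are hypothetical helper names, and a statically linked OpenMP runtime may not be reported at all.

```python
def find_duplicate_runtimes(pool_info, user_api="openmp"):
    """Return the distinct library paths for `user_api` if there is more than one.

    `pool_info` is a list of dicts with at least "user_api" and "filepath" keys,
    which is the shape returned by threadpoolctl.threadpool_info().
    """
    paths = sorted({
        entry.get("filepath")
        for entry in pool_info
        if entry.get("user_api") == user_api and entry.get("filepath")
    })
    return paths if len(paths) > 1 else []


def loaded_openmp_runtimes():
    """Inspect the current process; requires threadpoolctl to be installed."""
    from threadpoolctl import threadpool_info  # third-party, imported lazily
    return find_duplicate_runtimes(threadpool_info())
```

Calling `loaded_openmp_runtimes()` after both packages have been imported, but before any call that crashes, should list the conflicting library files if `threadpoolctl` can see them; an empty list means at most one OpenMP runtime was detected.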

Evaluating the potential, density, actions, etc., is done in parallel when the number of points exceeds some threshold (for projected density, it is currently set at 64, but I might adjust it in the future after running some performance tests), so you don't encounter a problem for a small number of input points.
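Given that threshold, a possible stopgap for a 50*50 grid (2500 points) is to evaluate it in batches that stay below the limit. This is a minimal sketch and not something tested against AGAMA: `evaluate_in_chunks` is a hypothetical helper, `func` stands for whatever callable computes the projected density for a batch of points, and the assumption that a batch of 32 always stays on the serial code path rests only on the number 64 quoted above, which may change.

```python
def evaluate_in_chunks(func, points, chunk_size=32):
    """Call `func` on consecutive slices of `points` and return a flat list of results.

    `points` may be a list or a NumPy array (anything that supports len() and
    slicing); `func` must return one result per input point.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    results = []
    for start in range(0, len(points), chunk_size):
        batch = points[start:start + chunk_size]
        results.extend(func(batch))  # keep results in the original point order
    return results
```

The trade-off is that every call now runs serially, so this only makes sense as a temporary measure until the library conflict itself is resolved.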