Closed ramcdougal closed 1 month ago
I assume at this late date that everything is arm64 or both arm64 and x86_64. To verify:
lipo -archs `which python3`
lipo -archs /Users/ramcdougal/anaconda3/lib/libpython3.10.dylib
lipo -archs `which nrniv`
lipo -archs /Users/ramcdougal/lib/libnrniv.dylib
I don't have this machine with me at the moment (and will verify later), but in general the Mac will complain if you try to use an x86_64 library with an arm Python.
In any case, the first few lines of NEURON successfully run; it's just when you try to grab the GIL that it segfaults.
You're right about the arm64 vs x86_64 issue being a red herring. My only other idea is similar. It's clear that
-- python3.10 (default)
-- EXE | /Users/ramcdougal/anaconda3/bin/python3
-- INC | /Users/ramcdougal/anaconda3/include/python3.10
-- LIB | /Users/ramcdougal/anaconda3/lib/libpython3.10.dylib
But just for fun, can you rebuild with an explicit -DPYTHON_EXECUTABLE=`which python3`?
For the record, the lipo commands all reported arm64, and nothing changed with the explicit specification of which Python.
I also get the segfault on my M1 after installing anaconda3. I configured with
build % cmake .. -G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_INSTALL_PREFIX=install -DNRN_ENABLE_TESTS=ON
and the python is
-- python3.10 (default)
-- EXE | /Users/hines/anaconda3/bin/python3
-- INC | /Users/hines/anaconda3/include/python3.10
-- LIB | /Users/hines/anaconda3/lib/libpython3.10.dylib
My first build attempt resulted in
FAILED: src/nrnpython/CMakeFiles/hoc_module.util
...
INFO:root:setup.py called with:setup.py build --cmake-build-dir /Users/hines/neuron/anacon/build --rx3d-opt-level 0 --without-nrnpython --build-lib=/Users/hines/neuron/anacon/build/lib/python build_ext --define=USE_PYTHON,NRN_ENABLE_THREADS
ERROR:root:ERROR: RX3D wheel requires Cython and numpy. Please install beforehand
Though import numpy works and
% which cython
/Library/Frameworks/Python.framework/Versions/3.11/bin/cython
I chose for the moment to use -DNRN_ENABLE_RX3D=OFF, and the build succeeded. Then
% python3
Python 3.10.9 (main, Mar 1 2023, 12:20:14) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from neuron import h
zsh: segmentation fault python3
Rebuilding with -DNRN_ENABLE_PYTHON_DYNAMIC=ON seems to work around the issue:
% python3
Python 3.10.9 (main, Mar 1 2023, 12:20:14) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from neuron import h
>>> h.nrnversion(6)
"cmake option default differences: 'NRN_ENABLE_RX3D=OFF' 'NRN_ENABLE_TESTS=ON' 'NRN_ENABLE_PYTHON_DYNAMIC=ON' 'NRN_LINK_AGAINST_PYTHON=OFF' 'CMAKE_INSTALL_PREFIX=/Users/hines/neuron/anacon/build/install' 'CMAKE_C_COMPILER=/usr/bin/clang' 'CMAKE_CXX_COMPILER=/usr/bin/clang++' 'PYTHON_EXECUTABLE=/Users/hines/anaconda3/bin/python3'"
>>>
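To confirm which options an installed build was configured with, the string returned by h.nrnversion(6) can be split into name/value pairs. This is a minimal sketch that assumes the output keeps the quoted 'NAME=VALUE' shape shown above; parse_cmake_options is a hypothetical helper, not part of NEURON.

```python
import re


def parse_cmake_options(version_text):
    """Return {option: value} from text shaped like the h.nrnversion(6) output."""
    # Each option appears as a single-quoted 'NAME=VALUE' item.
    return dict(re.findall(r"'([^'=]+)=([^']*)'", version_text))


# Sample text copied from the session above (shortened).
sample = (
    "cmake option default differences: 'NRN_ENABLE_RX3D=OFF' "
    "'NRN_ENABLE_TESTS=ON' 'NRN_ENABLE_PYTHON_DYNAMIC=ON' "
    "'NRN_LINK_AGAINST_PYTHON=OFF'"
)
options = parse_cmake_options(sample)
print(options.get("NRN_ENABLE_PYTHON_DYNAMIC"))
```

In a live session the argument would be h.nrnversion(6) instead of the sample string.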
At the moment, I have no idea why the build time linkage to anaconda3 python3.10 exhibits the segfault. I happen to have a python.org installation of python3.10 on this machine. Building and linking against that one (which allows RX3D ON) does work.
I think the anaconda build attempt with rx3d failed because it was finding the system framework cython. That could probably be fixed by conda install cython in the activated anaconda environment.
... Doesn't help with the segfault issue though.
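One way to see this kind of mismatch is to ask the interpreter that the build uses whether it can import the RX3D prerequisites, and compare that with the cython found on PATH. The following is a rough sketch; build_dependency_report is a hypothetical helper, and the module names are taken from the error message above.

```python
import importlib.util
import shutil
import sys


def build_dependency_report(modules=("Cython", "numpy")):
    """Map each module name to True if this interpreter can import it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


print("interpreter:", sys.executable)
# A cython script on PATH may belong to a different Python installation.
print("cython on PATH:", shutil.which("cython"))
for name, found in build_dependency_report().items():
    print(f"{name}: {'importable' if found else 'missing'}")
```

Run it with the same python3 that CMake reports as EXE. If the cython on PATH sits under another installation while Cython is reported missing here, that would be consistent with the build failure.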
With respect to build time linkage to python3.10, the only (relevant?) difference I see between the anaconda build and the python.org build is
python.org
build2 % otool -L lib/libnrniv.dylib
...
/Library/Frameworks/Python.framework/Versions/3.10/Python (compatibility version 3.10.0, current version 3.10.0)
anaconda3
build % otool -L lib/libnrniv.dylib
...
@rpath/libpython3.10.dylib (compatibility version 3.10.0, current version 3.10.0)
I suppose I could try installing the python.org version of python3.10.9 but I don't see how that would help me understand the reason for the segfault.
This is highly speculative, but notice that anaconda3 python3.10 does not link to libpython:
build % otool -L `which python3`
/Users/hines/anaconda3/bin/python3:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.60.1)
Does the dynamic loader know that everything it is looking for is in python3.10.9 and that it shouldn't load @rpath/libpython3.10.dylib?
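The comparison above can be automated by scanning otool -L output for a libpython or Python.framework entry. This is a minimal sketch; the helper names are hypothetical, and it only assumes that each dependency line contains "(compatibility version", as in the listings in this thread.

```python
def linked_libraries(otool_output):
    """Extract library paths from `otool -L` text output."""
    libs = []
    for line in otool_output.splitlines():
        # Dependency lines carry a "(compatibility version ...)" suffix;
        # the leading "<file>:" line does not.
        if "(compatibility version" in line:
            libs.append(line.split("(compatibility version")[0].strip())
    return libs


def links_libpython(otool_output):
    """True if any listed dependency looks like a Python runtime library."""
    return any(
        "libpython" in lib or "Python.framework" in lib
        for lib in linked_libraries(otool_output)
    )


# Sample lines copied from the listings in this thread.
anaconda_python = (
    "/Users/hines/anaconda3/bin/python3:\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.60.1)\n"
)
anaconda_libnrniv = (
    "\t@rpath/libpython3.10.dylib (compatibility version 3.10.0, current version 3.10.0)\n"
)
print(links_libpython(anaconda_python), links_libpython(anaconda_libnrniv))
```

On a Mac, the text could come from subprocess.run(["otool", "-L", path], capture_output=True, text=True).stdout.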
I'm wondering if the -DNRN_ENABLE_PYTHON_DYNAMIC=ON workaround is sufficient to close this issue?
It addresses my immediate problem (thanks), but given that build time linkage is (1) the default and (2) supposed to work, I'd argue the issue should stay open until resolved.
(EDIT: I tested the -DNRN_ENABLE_PYTHON_DYNAMIC=ON fix; that worked on my machine too. Thanks.)
With respect to (1), should dynamic be the default? Switching to 9.0 would be the time to make a change like that.
There may be something to my speculation. I copied the link line for libnrniv.dylib from ninja -j 1 -v >& temp and modified temp into a 372-line bash script with the single command
#!/bin/sh
set -ex
/usr/bin/clang++ -g -O2 -arch arm64 -isysroot \
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk \
-dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup \
-o lib/libnrniv.dylib -install_name @rpath/libnrniv.dylib \
src/nrniv/CMakeFiles/nrniv_lib.dir/__/ivoc/apwindow.cpp.o \
...
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk/usr/lib/libform.tbd \
\
/opt/homebrew/Cellar/open-mpi/4.1.5/lib/libmpi.dylib lib/libinterviews.a \
/opt/homebrew/lib/libX11.dylib /opt/homebrew/lib/libXext.dylib \
#/Users/hines/anaconda3/lib/libpython3.10.dylib \
Then otool -L lib/libnrniv.dylib does not mention libpython3.10.dylib, and I copy the library to its install location.
That eliminates the segfault.
Note that install/lib/python/neuron/hoc.cpython-310-darwin.so remains unchanged, but it never mentioned libpython3.10.dylib in the first place:
build % otool -L install/lib/python/neuron/hoc.cpython-310-darwin.so
install/lib/python/neuron/hoc.cpython-310-darwin.so:
@rpath/libnrniv.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libc++.1.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
@nrnhines This isn't an M1/M2 thing. We just ran into the same issue with Anaconda Python (3.8 and 3.10) on Intel macs.
I looked into this a week ago but didn't get time to write the summary.
Michael already mentioned that dynamic Python works, but I wasn't sure of the root cause. I tried various things and spent time on the false leads (like this). I would say this five-year-old post for VTK summarizes the issue quite well:
... Recently conda linked python3 statically, so all python symbols are included in the executable instead of being brought in by libpython. This created a problem with VTK used from python, because VTK links with libpython (it uses matplotlib for math text). So, you had python code brought in by the python executable and by libpython which resulted in a segfault for tests that used python and math text.
I tested this on the Anaconda linux distribution, and the issue doesn't appear.
I will create a PR with a small change so that CMake can check if we are using Anaconda Python on MacOS and then disable linking libpython. Given that the issue appears only on Mac and with Anaconda, I think this is sufficient.
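For illustration only, here is the kind of check described above, written in Python rather than CMake. It is a sketch under assumptions and not the PR's actual logic: the presence of a conda-meta directory is used as a heuristic for a conda installation, and both helper names are hypothetical.

```python
import os
import sys
import sysconfig


def is_conda_prefix(prefix):
    """Heuristic (assumption): conda installs keep a conda-meta directory in their prefix."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def should_skip_libpython_link(platform, conda):
    """Hypothetical policy: skip build-time libpython linkage only for conda on macOS."""
    return platform == "darwin" and conda


conda = is_conda_prefix(sys.prefix)
print("conda prefix:", conda)
# Diagnostic only: reflects whether this Python was configured with --enable-shared.
print("Py_ENABLE_SHARED:", sysconfig.get_config_var("Py_ENABLE_SHARED"))
print("skip libpython link:", should_skip_libpython_link(sys.platform, conda))
```

Keeping the policy in a separate function with explicit arguments makes it easy to see that Linux conda installs, where the issue was not observed, are left untouched.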
Context
Overview of the issue
I removed prior installs of NEURON, then:
The ~/lib/python folder is on my PYTHONPATH and ~/bin is on my PATH.
Attempting to run from Python fails with a segfault:
but running nrniv -python seems to work:
Note: the crash occurs in the line PyLockGIL lock; at the beginning of nrnpy_hoc() in nrnpy_hoc.cpp. See the lldb trace below.
Expected result/behavior
Importing neuron from Python should work. (To be clear, the segfault occurs even on an import neuron.)
NEURON setup
Minimal working example - MWE
MWE that can be used for reproducing the issue and testing. A couple of examples:
lldb session
cmake session
For completeness, here's the cmake session: