ml-explore / mlx

MLX: An array framework for Apple silicon
https://ml-explore.github.io/mlx/
MIT License

[BUG] Segmentation fault while running custom operations #1267

Closed vinayhpandya closed 2 months ago

vinayhpandya commented 2 months ago

Describe the bug: Segmentation fault when running custom operations while following the tutorial.

To Reproduce: run python3 setup.py build_ext --inplace. Output from the command:

running build_ext
-- The CXX compiler identification is AppleClang 15.0.0.15000309
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found MLX: /usr/local/lib/libmlx.a
-- Found Python: /opt/homebrew/Frameworks/Python.framework/Versions/3.11/bin/python3.11 (found suitable version "3.11.4", minimum required is "3.8") found components: Interpreter Development.Module
-- Configuring done (1.2s)
-- Generating done (0.0s)
-- Build files have been written to: /Users/vinaypandya/temp_project/mlx/examples/extensions/build/temp.macosx-12-arm64-cpython-311/mlx_sample_extensions._ext
[  5%] Building mlx_ext.metallib
[  5%] Built target mlx_ext_metallib
[ 11%] Building CXX object CMakeFiles/mlx_ext.dir/axpby/axpby.cpp.o
[ 17%] Linking CXX shared library /Users/vinaypandya/temp_project/mlx/examples/extensions/build/lib.macosx-12-arm64-cpython-311/mlx_sample_extensions/libmlx_ext.dylib
ld: warning: search path '/Applications/Xcode-15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks' not found
[ 17%] Built target mlx_ext
[ 23%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_internals.cpp.o
[ 29%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_func.cpp.o
[ 35%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_type.cpp.o
[ 41%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_enum.cpp.o
[ 47%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_ndarray.cpp.o
[ 52%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/nb_static_property.cpp.o
[ 58%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/common.cpp.o
[ 64%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/error.cpp.o
[ 70%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/trampoline.cpp.o
[ 76%] Building CXX object CMakeFiles/nanobind-static.dir/opt/homebrew/lib/python3.11/site-packages/nanobind/src/implicit.cpp.o
[ 82%] Linking CXX static library libnanobind-static.a
[ 82%] Built target nanobind-static
[ 88%] Building CXX object CMakeFiles/_ext.dir/bindings.cpp.o
[ 94%] Building CXX object CMakeFiles/_ext.dir/axpby/axpby.cpp.o
[100%] Linking CXX shared module /Users/vinaypandya/temp_project/mlx/examples/extensions/build/lib.macosx-12-arm64-cpython-311/mlx_sample_extensions/_ext.cpython-311-darwin.so
ld: warning: search path '/Applications/Xcode-15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks' not found
[100%] Built target _ext
copying build/lib.macosx-12-arm64-cpython-311/mlx_sample_extensions/_ext.cpython-311-darwin.so -> mlx_sample_extensions
copying /Users/vinaypandya/temp_project/mlx/examples/extensions/build/lib.macosx-12-arm64-cpython-311/mlx_sample_extensions/libmlx_ext.dylib -> /Users/vinaypandya/temp_project/mlx/examples/extensions/mlx_sample_extensions
copying /Users/vinaypandya/temp_project/mlx/examples/extensions/build/lib.macosx-12-arm64-cpython-311/mlx_sample_extensions/mlx_ext.metallib -> /Users/vinaypandya/temp_project/mlx/examples/extensions/mlx_sample_extensions

Include code snippet

import mlx.core as mx
from mlx_sample_extensions import axpby

a = mx.ones((3, 4))
b = mx.ones((3, 4))
c = axpby(a, b, 4.0, 2.0, stream=mx.cpu)

print(f"c shape: {c.shape}")
print(f"c dtype: {c.dtype}")
print(f"c correct: {mx.all(c == 6.0).item()}")

Expected behavior: No segmentation fault. The operation alpha * x + beta * y should succeed, with the result stored in a new array.
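For reference, the expected values can be checked without the extension. The following is a minimal pure-Python sketch, not the extension's implementation; `axpby_reference` is a hypothetical helper name, and nested lists stand in for the `mx.ones((3, 4))` arrays.

```python
# Reference for the expected values only; this is NOT the extension's code.
# axpby_reference is a hypothetical helper name used for illustration.

def axpby_reference(x, y, alpha, beta):
    """Return alpha * x + beta * y elementwise for equally shaped 2-D lists."""
    if len(x) != len(y) or any(len(rx) != len(ry) for rx, ry in zip(x, y)):
        raise ValueError("x and y must have the same shape")
    return [[alpha * xv + beta * yv for xv, yv in zip(rx, ry)]
            for rx, ry in zip(x, y)]

# Same inputs as the snippet above: two 3x4 arrays of ones.
a = [[1.0] * 4 for _ in range(3)]
b = [[1.0] * 4 for _ in range(3)]
c = axpby_reference(a, b, 4.0, 2.0)

# Every element should be 4.0 * 1.0 + 2.0 * 1.0 = 6.0.
print(all(v == 6.0 for row in c for v in row))
```

With these inputs every element is 6.0, which is what the `mx.all(c == 6.0)` check in the snippet expects.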

Desktop (please complete the following information):

Additional context: This issue persists even after cleaning and uninstalling mlx. Running it under the lldb debugger results in a linking problem that is difficult to backtrace.

* thread #2, stop reason = exec
    frame #0: 0x0000000100014b70 dyld`_dyld_start

awni commented 2 months ago

I'm not getting a segfault with that. Which version of MLX are you using? It may also be good to make sure you have the right version of nanobind installed:

pip install --force-reinstall nanobind@git+https://github.com/wjakob/nanobind.git@2f04eac452a6d9142dedb957701bdb20125561e4

awni commented 2 months ago

You are running the example in mlx/examples/extensions without any modification, right?

vinayhpandya commented 2 months ago

I am using the latest version of mlx. I tried to build it from source and got stuck, so I used setup.py to build the package instead. Correct, I did not make any modifications to the mlx/examples/extensions package.

I was able to solve the segmentation fault by uninstalling mlx and re-installing it from source. I believe the Xcode settings I was using were not compatible, as the error suggests there was an issue linking the library itself. I am now able to build and test the extension.

Here's what I did to resolve this:

  1. Run xcode-select --install, then select the active developer directory with sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer
  2. Rebuild the package via the C++ build: mkdir -p build && cd build, then cmake .. && make -j
  3. Install the modules using make install
  4. Go to the extensions folder and rerun python3 setup.py build_ext --inplace
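
The steps above can be sketched as a small script. This is only an illustration under assumptions: `MLX_ROOT` is a hypothetical placeholder for wherever the mlx repository is checked out, and the working directory for each command is inferred from the steps rather than stated in them. By default the script only prints the commands.

```python
import shlex
import subprocess

# Commands taken from the steps above. MLX_ROOT is a hypothetical placeholder
# for the directory where the mlx repository was cloned.
MLX_ROOT = "mlx"

STEPS = [
    # (working directory, command)
    (".", "xcode-select --install"),
    (".", "sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer"),
    (MLX_ROOT, "mkdir -p build"),
    (f"{MLX_ROOT}/build", "cmake .."),
    (f"{MLX_ROOT}/build", "make -j"),
    (f"{MLX_ROOT}/build", "make install"),
    (f"{MLX_ROOT}/examples/extensions", "python3 setup.py build_ext --inplace"),
]

def run_steps(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    planned = []
    for cwd, command in steps:
        planned.append(f"(cd {cwd} && {command})")
        print(planned[-1])
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(shlex.split(command), cwd=cwd, check=True)
    return planned

plan = run_steps(STEPS)  # dry run: prints the commands, runs nothing
```

Calling `run_steps(STEPS, dry_run=False)` would execute the commands in order. Note that `xcode-select --install` may exit with a non-zero status when the command line tools are already installed, in which case that entry can be dropped from `STEPS`.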

awni commented 2 months ago

Sounds good. I'm not sure what went wrong there. Maybe you were using an older Xcode/SDK version, which we do not support 🤔
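
One detail in the build log fits the Xcode theory: CMake found the compiler under /Applications/Xcode.app, while both linker warnings refer to a search path under /Applications/Xcode-15.2.app that was not found. Whether this caused the segfault is not confirmed in the thread. A minimal sketch for spotting such stale paths in a build log follows; the helper names are hypothetical and the log is an abbreviated excerpt of the output above.

```python
import os
import re

# Matches the linker warning format seen in the build log above.
WARNING_RE = re.compile(r"ld: warning: search path '([^']+)' not found")

def missing_search_paths(log_text):
    """Return the unique paths the linker reported as not found, in order."""
    seen = []
    for path in WARNING_RE.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen

def xcode_app(path):
    """Return the leading '*.app' component of a path, or None."""
    match = re.match(r"(.*?\.app)(?:/|$)", path)
    return match.group(1) if match else None

# Abbreviated excerpt of the build output (the "Linking" lines are shortened).
log = """\
[ 17%] Linking CXX shared library libmlx_ext.dylib
ld: warning: search path '/Applications/Xcode-15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks' not found
[100%] Linking CXX shared module _ext.cpython-311-darwin.so
ld: warning: search path '/Applications/Xcode-15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks' not found
"""

for path in missing_search_paths(log):
    app = xcode_app(path)
    if app:
        # On the affected machine this directory would be expected to be absent.
        print(f"stale search path under {app}; exists on disk: {os.path.isdir(app)}")
```

If the reported application bundle does not exist, comparing it with the output of `xcode-select -p` shows whether the active developer directory and the paths baked into the build disagree.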