Open iBims1JFK opened 5 days ago
Yes. I briefly looked into this for the conda package, where the tests for the Python bindings failed due to this crash. However, since I do not have a macOS system, I cannot really debug this issue.
I guess it would help to get a backtrace. That would mean that you have to compile the project and the Python bindings with Debug symbols, then run the Python interpreter in gdb or similar, reproduce this crash, and finally print the backtrace and paste it here.
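As a lighter-weight complement to a native backtrace, Python's built-in faulthandler can at least report which Python-level statement was running when the process died. A minimal sketch, assuming the crash happens on import; the helper name `import_in_subprocess` is made up for illustration:

```python
import subprocess
import sys


def import_in_subprocess(module_name, python=sys.executable):
    """Import `module_name` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative returncode means the
    child was terminated by a signal, for example a segmentation fault.
    """
    result = subprocess.run(
        # -X faulthandler makes CPython dump the Python traceback to stderr
        # when it receives a fatal signal such as SIGSEGV or SIGBUS.
        [python, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr
```

Calling `import_in_subprocess("apriltag")` with the affected interpreter should give a negative return code if the import crashes, and stderr should then hold faulthandler's report. This does not replace the C-level backtrace; it only shows where the Python side was.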
I am happy to help with the debugging process. Unfortunately, I am not experienced in that field, so I fear that you will need to walk me through this a bit. What I did was compile the library with the following command:
cmake -B build -DCMAKE_BUILD_TYPE=Debug \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_PYTHON_WRAPPER=ON \
-DPython3_EXECUTABLE=$(which python3) \
-DPython3_INCLUDE_DIR=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/include/python3.11 \
-DPython3_LIBRARY=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/lib/libpython3.11.dylib \
-DPython3_NUMPY_INCLUDE_DIR=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/lib/python3.11/site-packages/numpy/core/include
-- The C compiler identification is Clang 16.0.6
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/arm64-apple-darwin20.0.0-clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- The CXX compiler identification is Clang 16.0.6
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/arm64-apple-darwin20.0.0-clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done (1.4s)
-- Generating done (0.0s)
CMake Warning:
Manually-specified variables were not used by the project:
Python3_NUMPY_INCLUDE_DIR
-- Build files have been written to: /Users/jonathan/Documents/master-thesis/apriltag-test/apriltag/build
gdb is not available for Apple silicon, so I used lldb. If there is a better alternative, please let me know.
lldb /opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python
(lldb) target create "/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python"
Current executable set to '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64).
(lldb) run test.py
Process 15532 launched: '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64)
Process 15532 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
frame #0: 0x0000000101367290 libpython3.11.dylib`type_ready + 92
libpython3.11.dylib`type_ready:
-> 0x101367290 <+92>: ldr x8, [x8, #0x10]
0x101367294 <+96>: ldr w9, [x8, #0x1428]
0x101367298 <+100>: cbz w9, 0x101367338 ; <+260>
0x10136729c <+104>: sub w9, w9, #0x1
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
* frame #0: 0x0000000101367290 libpython3.11.dylib`type_ready + 92
frame #1: 0x0000000101377de8 libpython3.11.dylib`PyType_Ready + 52
frame #2: 0x00000001008c1864 apriltag.cpython-311-darwin.so`PyInit_apriltag + 32
frame #3: 0x000000010019bbc0 python`_imp_create_dynamic + 1188
frame #4: 0x00000001000b8e20 python`cfunction_vectorcall_FASTCALL + 256
frame #5: 0x0000000100164b6c python`_PyEval_EvalFrameDefault + 55160
frame #6: 0x0000000100166e18 python`_PyEval_Vector + 184
frame #7: 0x000000010006286c python`object_vacall + 316
frame #8: 0x0000000100062668 python`PyObject_CallMethodObjArgs + 108
frame #9: 0x0000000100196c04 python`PyImport_ImportModuleLevelObject + 1580
frame #10: 0x000000010015f0d0 python`_PyEval_EvalFrameDefault + 31964
frame #11: 0x0000000100156424 python`PyEval_EvalCode + 220
frame #12: 0x00000001001bc3f8 python`run_mod + 144
frame #13: 0x00000001001bbe58 python`_PyRun_SimpleFileObject + 1260
frame #14: 0x00000001001baf18 python`_PyRun_AnyFileObject + 240
frame #15: 0x00000001001e192c python`Py_RunMain + 3100
frame #16: 0x00000001001e2784 python`pymain_main + 1252
frame #17: 0x0000000100003684 python`main + 56
frame #18: 0x0000000182b6f154 dyld`start + 2476
I hope that this helps.
Building with -DCMAKE_BUILD_TYPE=Debug is the right start. But your backtrace does not show where in apriltag.cpython-311-darwin.so it crashes. Can you print the source code lines?
If I force a crash via printf("NULL: %d\n", *(int*)NULL); (NULL-pointer dereference) in PyInit_apriltag(void) and import the module via python3 -c "import apriltag", I can reproduce a crash and backtrace that with gdb:
gdb -ex run --args python3 -c "import apriltag"
which will give:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fadb3f in PyInit_apriltag () at [...]/apriltag/apriltag_pywrap.c:380
380 printf("NULL: %d\n", *(int*)NULL);
(gdb) bt
#0 0x00007ffff7fadb3f in PyInit_apriltag () at [...]/apriltag/apriltag_pywrap.c:380
#1 0x00000000006a9881 in _PyImport_LoadDynamicModuleWithSpec (spec=0x7ffff74349e0, fp=<optimized out>) at ../Python/importdl.c:169
#2 0x00000000006a8fd2 in _imp_create_dynamic_impl (module=<optimized out>, file=0x0, spec=0x7ffff74349e0) at ../Python/import.c:3775
#3 _imp_create_dynamic (module=<optimized out>, args=<optimized out>, nargs=<optimized out>) at ../Python/clinic/import.c.h:506
#4 0x0000000000582067 in cfunction_vectorcall_FASTCALL (func=0x7ffff75972e0, args=0x7ffff75fc928, nargsf=<optimized out>, kwnames=<optimized out>)
at ../Include/cpython/methodobject.h:50
#5 0x00000000005db336 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/bytecodes.c:3254
#6 0x0000000000549ae7 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=2, args=0x7fffffffd500, callable=0x7ffff75a4040, tstate=0xba6048 <_PyRuntime+459656>)
at ../Include/internal/pycore_call.h:92
#7 object_vacall (tstate=tstate@entry=0xba6048 <_PyRuntime+459656>, base=<optimized out>, callable=0x7ffff75a4040, vargs=0x7fffffffd588) at ../Objects/call.c:850
#8 0x000000000054b373 in PyObject_CallMethodObjArgs (obj=<optimized out>, name=<optimized out>) at ../Objects/call.c:911
#9 0x00000000005fda35 in import_find_and_load (abs_name=0x7ffff743de30, tstate=0xba6048 <_PyRuntime+459656>) at ../Python/import.c:2779
#10 PyImport_ImportModuleLevelObject (name=name@entry=0x7ffff743de30, globals=<optimized out>, locals=locals@entry=0x7ffff75f9e80, fromlist=fromlist@entry=0xa408a0 <_Py_NoneStruct>,
level=0) at ../Python/import.c:2862
#11 0x00000000005dc40f in import_name (level=0xb36988 <_PyRuntime+3272>, fromlist=0xa408a0 <_Py_NoneStruct>, name=0x7ffff743de30, frame=<optimized out>, tstate=<optimized out>)
at ../Python/ceval.c:2482
#12 _PyEval_EvalFrameDefault (tstate=tstate@entry=0xba6048 <_PyRuntime+459656>, frame=<optimized out>, frame@entry=0x7ffff7fb2020, throwflag=throwflag@entry=0)
at Python/bytecodes.c:2135
#13 0x00000000005d560b in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff7fb2020, tstate=0xba6048 <_PyRuntime+459656>) at ../Include/internal/pycore_ceval.h:89
#14 _PyEval_Vector (kwnames=0x0, argcount=0, args=0x0, locals=0x7ffff75f9e80, func=0x7ffff744d3a0, tstate=0xba6048 <_PyRuntime+459656>) at ../Python/ceval.c:1683
#15 PyEval_EvalCode (co=co@entry=0x7ffff748cfa0, globals=globals@entry=0x7ffff75f9e80, locals=locals@entry=0x7ffff75f9e80) at ../Python/ceval.c:578
#16 0x00000000006086f3 in run_eval_code_obj (locals=0x7ffff75f9e80, globals=0x7ffff75f9e80, co=0x7ffff748cfa0, tstate=0xba6048 <_PyRuntime+459656>) at ../Python/pythonrun.c:1722
#17 run_mod (arena=0x7ffff751be50, flags=0x7ffff751be50, locals=0x7ffff75f9e80, globals=0x7ffff75f9e80, filename=<optimized out>, mod=<optimized out>) at ../Python/pythonrun.c:1743
#18 PyRun_StringFlags (str=str@entry=0x7ffff75fa050 "import apriltag\n", start=start@entry=257, globals=0x7ffff75f9e80, locals=0x7ffff75f9e80, flags=flags@entry=0x7fffffffd9c0)
at ../Python/pythonrun.c:1618
#19 0x00000000006b40ee in PyRun_SimpleStringFlags (command=0x7ffff75fa050 "import apriltag\n", flags=flags@entry=0x7fffffffd9c0) at ../Python/pythonrun.c:480
#20 0x00000000006bce01 in pymain_run_command (command=<optimized out>) at ../Modules/main.c:255
#21 pymain_run_python (exitcode=0x7fffffffd98c) at ../Modules/main.c:620
#22 Py_RunMain () at ../Modules/main.c:709
#23 0x00000000006bc81d in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at ../Modules/main.c:763
#24 0x00007ffff7c2a1ca in __libc_start_call_main (main=main@entry=0x518880 <main>, argc=argc@entry=3, argv=argv@entry=0x7fffffffdbd8) at ../sysdeps/nptl/libc_start_call_main.h:58
#25 0x00007ffff7c2a28b in __libc_start_main_impl (main=0x518880 <main>, argc=3, argv=0x7fffffffdbd8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fffffffdbc8) at ../csu/libc-start.c:360
#26 0x0000000000657ca5 in _start ()
Line 380 is then exactly the line where I added the NULL-pointer dereference.
The line libpython3.11.dylib`PyType_Ready + 52 in your backtrace suggests that this is caused by PyType_Ready(&apriltagType).
Since you do not seem to be able to use a debugger for this right now, could you simply add a print statement like this:
if (PyType_Ready(&apriltagType) < 0) {
printf("PyType_Ready error!\n"); fflush(stdout);
return NULL;
}
and check if it prints on the screen?
I added a test for python3 -c "import apriltag; apriltag.apriltag(family='tag36h11')" to the CI (https://github.com/AprilRobotics/apriltag/pull/353). This runs without crashes on a macos-14-arm64 runner: https://github.com/AprilRobotics/apriltag/actions/runs/10873853033/job/30170448778?pr=353.
This might just be related to a weird Python setup with a mixup of versions from different sources. Your initial report shows that you are using Python 3.12.6 but later you compile against libpython3.11.dylib. This very likely causes errors. I also see that you use homebrew for Python. Can you run this again with a standard Python installation outside of homebrew etc.?
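One way to check for such a mixup is to ask the interpreter that will import the module where it lives and which libpython it was built with, then compare that with the paths passed to CMake. A small sketch using only the standard library; the helper name `describe_interpreter` is hypothetical, and LIBDIR / LDLIBRARY can be None on some platforms:

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect the paths and tags that identify the running interpreter."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "prefix": sys.prefix,
        # Directory holding Python.h for this interpreter.
        "include_dir": sysconfig.get_paths()["include"],
        # Build-time library directory and library file name.
        "libdir": sysconfig.get_config_var("LIBDIR"),
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
        # Suffix this interpreter expects for compiled extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


for key, value in describe_interpreter().items():
    print(f"{key:12} {value}")
```

Running this with the same `python` that later executes `test.py` gives the values to compare against `Python3_INCLUDE_DIR` and `Python3_LIBRARY` from the configure step.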
I recompiled it and judging from apriltag.cpython-311-darwin.so PyInit_apriltag at apriltag_pywrap.c:375:9 [opt] the symbols work (better?) now. And I think expectedly it did not print anything.
lldb /opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python
(lldb) target create "/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python"
Current executable set to '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64).
(lldb) run test.py
Process 91701 launched: '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64)
Process 91701 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
frame #0: 0x000000010137f290 libpython3.11.dylib`type_ready + 92
libpython3.11.dylib`type_ready:
-> 0x10137f290 <+92>: ldr x8, [x8, #0x10]
0x10137f294 <+96>: ldr w9, [x8, #0x1428]
0x10137f298 <+100>: cbz w9, 0x10137f338 ; <+260>
0x10137f29c <+104>: sub w9, w9, #0x1
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
* frame #0: 0x000000010137f290 libpython3.11.dylib`type_ready + 92
frame #1: 0x000000010138fde8 libpython3.11.dylib`PyType_Ready + 52
frame #2: 0x00000001008c1790 apriltag.cpython-311-darwin.so`PyInit_apriltag at apriltag_pywrap.c:375:9 [opt]
frame #3: 0x000000010019bbc0 python`_imp_create_dynamic + 1188
frame #4: 0x00000001000b8e20 python`cfunction_vectorcall_FASTCALL + 256
frame #5: 0x0000000100164b6c python`_PyEval_EvalFrameDefault + 55160
frame #6: 0x0000000100166e18 python`_PyEval_Vector + 184
frame #7: 0x000000010006286c python`object_vacall + 316
frame #8: 0x0000000100062668 python`PyObject_CallMethodObjArgs + 108
frame #9: 0x0000000100196c04 python`PyImport_ImportModuleLevelObject + 1580
frame #10: 0x000000010015f0d0 python`_PyEval_EvalFrameDefault + 31964
frame #11: 0x0000000100156424 python`PyEval_EvalCode + 220
frame #12: 0x00000001001bc3f8 python`run_mod + 144
frame #13: 0x00000001001bbe58 python`_PyRun_SimpleFileObject + 1260
frame #14: 0x00000001001baf18 python`_PyRun_AnyFileObject + 240
frame #15: 0x00000001001e192c python`Py_RunMain + 3100
frame #16: 0x00000001001e2784 python`pymain_main + 1252
frame #17: 0x0000000100003684 python`main + 56
frame #18: 0x0000000182b6f154 dyld`start + 2476
The different Python versions were used in different environments, but I always compiled against the correct version for the environment. I will definitely check with the standard version, although I need it to work in the conda environment.
I recompiled it and judging from apriltag.cpython-311-darwin.so PyInit_apriltag at apriltag_pywrap.c:375:9 [opt] the symbols work (better?) now. And I think expectedly it did not print anything.
Did you add a print statement?
At least now you get the line numbers. apriltag.cpython-311-darwin.so`PyInit_apriltag at apriltag_pywrap.c:375:9 tells us that this is probably related to the return NULL; statement. That is why I asked you to print something before that return NULL; line. I suspect that, once you add the print statements, you will see that it goes into this branch because PyType_Ready(&apriltagType) < 0.
Interestingly, it does work like a charm when I am building outside of the conda environment but still with the brew python. I added these lines as you told me:
if (PyType_Ready(&apriltagType) < 0)
{
printf("PyType_Ready error!\n");
fflush(stdout);
return NULL;
}
But I think it crashes at if (PyType_Ready(&apriltagType) < 0), which is line 375, and therefore cannot print anything.
Interestingly, it does work like a charm when I am building outside of the conda environment but still with the brew python.
That sounds definitely like a mixup of environments.
But I think it crashes at if (PyType_Ready(&apriltagType) < 0), which is line 375, and therefore cannot print anything.
Which version of the code are you on? On the current master, line 375 is the return NULL; line:
https://github.com/AprilRobotics/apriltag/blob/786ad11fa812524f33ad8375a5f157b7e57b730d/apriltag_pywrap.c#L374-L375
If this line appears in the backtrace, a print before it, e.g.:
if (PyType_Ready(&apriltagType) < 0) { // 374
printf("PyType_Ready error!\n"); fflush(stdout); // 375
return NULL; // 376
}
should have been shown in the terminal.
But it shouldn't matter which environments you have, as long as you choose the right ones when building the library, right?
I am using the current master, apart from the lines that you asked me to add. It is possible that some auto-linting shifted some of the lines. Now I added a print statement before the if statement, so the line count is off by one:
printf("Before PyType_Ready\n"); // 375
if (PyType_Ready(&apriltagType) < 0) // 376
{ // 377
printf("PyType_Ready error!\n"); // 378
fflush(stdout); // 379
return NULL; // 380
} // 381
lldb /opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python
(lldb) target create "/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python"
Current executable set to '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64).
(lldb) run bin_test/test.py
Process 43823 launched: '/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python' (arm64)
Before PyType_Ready
Process 43823 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
frame #0: 0x000000010137f290 libpython3.11.dylib`type_ready + 92
libpython3.11.dylib`type_ready:
-> 0x10137f290 <+92>: ldr x8, [x8, #0x10]
0x10137f294 <+96>: ldr w9, [x8, #0x1428]
0x10137f298 <+100>: cbz w9, 0x10137f338 ; <+260>
0x10137f29c <+104>: sub w9, w9, #0x1
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
* frame #0: 0x000000010137f290 libpython3.11.dylib`type_ready + 92
frame #1: 0x000000010138fde8 libpython3.11.dylib`PyType_Ready + 52
frame #2: 0x00000001008c1780 apriltag.cpython-311-darwin.so`PyInit_apriltag at apriltag_pywrap.c:376:9 [opt]
frame #3: 0x000000010019bbc0 python`_imp_create_dynamic + 1188
frame #4: 0x00000001000b8e20 python`cfunction_vectorcall_FASTCALL + 256
frame #5: 0x0000000100164b6c python`_PyEval_EvalFrameDefault + 55160
frame #6: 0x0000000100166e18 python`_PyEval_Vector + 184
frame #7: 0x000000010006286c python`object_vacall + 316
frame #8: 0x0000000100062668 python`PyObject_CallMethodObjArgs + 108
frame #9: 0x0000000100196c04 python`PyImport_ImportModuleLevelObject + 1580
frame #10: 0x000000010015f0d0 python`_PyEval_EvalFrameDefault + 31964
frame #11: 0x0000000100156424 python`PyEval_EvalCode + 220
frame #12: 0x00000001001bc3f8 python`run_mod + 144
frame #13: 0x00000001001bbe58 python`_PyRun_SimpleFileObject + 1260
frame #14: 0x00000001001baf18 python`_PyRun_AnyFileObject + 240
frame #15: 0x00000001001e192c python`Py_RunMain + 3100
frame #16: 0x00000001001e2784 python`pymain_main + 1252
frame #17: 0x0000000100003684 python`main + 56
frame #18: 0x0000000182b6f154 dyld`start + 2476
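A side observation on the backtraces in this thread: the interpreter frames (for example `_imp_create_dynamic` and `_PyEval_EvalFrameDefault`) are attributed to the `python` executable, while `PyType_Ready` is attributed to `libpython3.11.dylib`. That might mean two copies of the CPython runtime are loaded into one process; this is a guess that nobody in the thread has confirmed. The sketch below lists the binary images in an lldb backtrace and flags that combination. The function names and the regular expression are illustrative assumptions based on the output format shown above:

```python
import re

# Matches lldb frame lines such as:
#   frame #1: 0x000000010138fde8 libpython3.11.dylib`PyType_Ready + 52
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-fA-F]+ ([^`\s]+)`")


def images_in_backtrace(text):
    """Return the distinct image names in frame order of first appearance."""
    images = []
    for match in FRAME_RE.finditer(text):
        image = match.group(2)
        if image not in images:
            images.append(image)
    return images


def mixes_executable_and_libpython(images):
    """True if frames come from both a python executable and a libpython."""
    has_lib = any(re.match(r"libpython\d", name) for name in images)
    has_exe = any(re.fullmatch(r"python[\d.]*", name) for name in images)
    return has_lib and has_exe
```

If the check returns True for a backtrace, inspecting how the extension module was linked (for example with `otool -L` on macOS) would be a reasonable next step.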
But it shouldn't matter which environments you have, as long as you choose the right ones when building the library, right?
No. The build and runtime environments have to be the same. Linking against a different library than you use later can cause different kinds of ABI incompatibilities.
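A cheap first check for a build/runtime mismatch is the ABI tag in the extension module's file name (for example `cpython-311-darwin`), which can be compared with the suffixes the running interpreter accepts. This is only a sketch with hypothetical helper names, and it cannot detect a module that carries the right tag but was linked against a libpython from a different installation:

```python
import importlib.machinery
import os


def extension_suffix_of(path):
    """Return everything after the module name, e.g. '.cpython-311-darwin.so'."""
    filename = os.path.basename(path)
    _, dot, rest = filename.partition(".")
    return dot + rest


def extension_matches_interpreter(path, suffixes=None):
    """True if the file's suffix is one the given interpreter would load."""
    if suffixes is None:
        # Suffixes accepted by the interpreter running this script.
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return extension_suffix_of(path) in suffixes
```

Passing the path of the built `apriltag.cpython-311-darwin.so` while running under the interpreter that will import it shows whether the version tags agree.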
I am using the current master, apart from the lines that you asked me to add. It is possible that some auto-linting shifted some of the lines.
Well, ideally your backtraces match with the source code of the repo. Otherwise, it's hard to use it to debug what is going on.
If the crash is indeed inside PyType_Ready, then there is nothing the apriltag bindings can do to fix this.
If you can reproduce the crash in the CI, I can have a look at it. But other than this, I recommend that you fix your Python environment.
To clarify, what I meant was using the same build and runtime environments. That is, when I specify the ros_env during the build like this
cmake -B build -DCMAKE_BUILD_TYPE=Debug \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_PYTHON_WRAPPER=ON \
-DPython3_EXECUTABLE=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/bin/python \
-DPython3_INCLUDE_DIR=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/include/python3.11 \
-DPython3_LIBRARY=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/lib/libpython3.11.dylib \
-DPython3_NUMPY_INCLUDE_DIR=/opt/homebrew/Caskroom/mambaforge/base/envs/ros_env/lib/python3.11/site-packages/numpy/core/include
and use this specific environment at runtime, then it should not matter that there are other environments installed on the machine, right?
I am not sure how this works with the mixed homebrew and mambaforge environment. At least on Linux, if you have set the environment, e.g. conda, correctly, then you also don't need to point to absolute paths for Python manually since CMake will find the FindPython3.cmake etc. and set these variables accordingly. But I only have professional experience on Linux / Ubuntu and macOS just might work differently here.
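Whether the environment is set up so that tools pick up the intended Python can be checked from Python itself: find the `python3` that is first on PATH and compare its `sys.prefix` with that of the interpreter running the script. A minimal sketch with made-up helper names; it assumes the build tooling resolves `python3` via PATH, which may not hold for every CMake setup:

```python
import shutil
import subprocess
import sys


def prefix_of(python):
    """Ask the given interpreter for its sys.prefix (its environment root)."""
    out = subprocess.run(
        [python, "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


def same_environment(name="python3"):
    """True if `name` on PATH belongs to the same environment as this process."""
    found = shutil.which(name)
    return found is not None and prefix_of(found) == sys.prefix
```

Running `same_environment()` inside the activated `ros_env` should return True; a False result would suggest that a command like `$(which python3)` resolves to a different installation than the one used at runtime.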
Since your environment is named ros_env, I am wondering, are you trying to use ROS on macOS? If so, you might want to look into conda and the RoboStack, which provides a lot of the ROS packages in a conda environment. I had very good experiences with using different Python stacks in a conda environment, including ROS.
Hello, I am trying to build the Python bindings for macOS. The build works, but when I try to import the library, there is always a segmentation fault. I am using Python 3.12.6 with a conda environment. I was able to build the duckietown bindings, but I receive a lot of
Error, more than one new minimum found.
errors with them. I did not find an actual solution for this, and it seems that this error does not exist with the official bindings. If I should provide more system information, please let me know.