PRBonn / kiss-icp

A LiDAR odometry pipeline that just works
https://www.ipb.uni-bonn.de/wp-content/papercite-data/pdf/vizzo2023ral.pdf
MIT License

Mac M1: python3.9 and python3.10 packages missing #18

Closed bexcite closed 1 year ago

bexcite commented 1 year ago

I tried to install on a Mac M1 (arm64). Because there are no pre-compiled wheels for this platform, pip tried to build from source, and that attempt failed as well:

Collecting scipy
  Using cached scipy-1.9.2.tar.gz (42.1 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [66 lines of output]
      The Meson build system
      Version: 0.63.3
      Source dir: /private/var/folders/k2/06nf1_6s039601g5ply3b63h00054w/T/pip-install-ovshm20c/scipy_fb0ee8bf9ea2434bb6b7426918caf471
      Build dir: /private/var/folders/k2/06nf1_6s039601g5ply3b63h00054w/T/pip-install-ovshm20c/scipy_fb0ee8bf9ea2434bb6b7426918caf471/.mesonpy-o2z7u9kx/build
      Build type: native build
      Project name: SciPy
      Project version: 1.9.2
      C compiler for the host machine: cc (clang 12.0.5 "Apple clang version 12.0.5 (clang-1205.0.22.9)")
      C linker for the host machine: cc ld64 650.9
      C++ compiler for the host machine: c++ (clang 12.0.5 "Apple clang version 12.0.5 (clang-1205.0.22.9)")
      C++ linker for the host machine: c++ ld64 650.9
      Host machine cpu family: aarch64
      Host machine cpu: arm64
      Compiler for C supports arguments -Wno-unused-but-set-variable: NO
      Compiler for C supports arguments -Wno-unused-but-set-variable: NO (cached)
      Compiler for C supports arguments -Wno-unused-function: YES
      Compiler for C supports arguments -Wno-conversion: YES
      Compiler for C supports arguments -Wno-misleading-indentation: YES
      Compiler for C supports arguments -Wno-incompatible-pointer-types: YES
      Library m found: YES
      Fortran compiler for the host machine: gfortran (gcc 12.1.0 "GNU Fortran (Homebrew GCC 12.1.0) 12.1.0")
      Fortran linker for the host machine: gfortran ld64 650.9
      Compiler for Fortran supports arguments -Wno-conversion: YES
      Program cython found: YES (/private/var/folders/k2/06nf1_6s039601g5ply3b63h00054w/T/pip-build-env-k6wcpwx6/overlay/bin/cython)
      Program pythran found: YES (/private/var/folders/k2/06nf1_6s039601g5ply3b63h00054w/T/pip-build-env-k6wcpwx6/overlay/bin/pythran)
      Program cp found: YES (/bin/cp)
      Program python found: YES (some.venv-kiss-39/bin/python3.9)
      Found pkg-config: /opt/homebrew/bin/pkg-config (0.29.2)
      Library npymath found: YES
      Library npyrandom found: YES
      Found CMake: /opt/homebrew/bin/cmake (3.24.0)
      Run-time dependency openblas found: NO (tried pkgconfig, framework and cmake)
      Run-time dependency openblas found: NO (tried pkgconfig, framework and cmake)

      ../../scipy/meson.build:129:0: ERROR: Dependency "OpenBLAS" not found, tried pkgconfig, framework and cmake

Or maybe it's because scipy doesn't support Mac M1 yet? I don't know. I'm just leaving this here for visibility, since you did a great job with Mac x64 support, as I can see from the wheels available on PyPI, and this may be interesting to other people as well.

nachovizzo commented 1 year ago

This is clearly a missing scipy package for that specific version of Python. You'd better check with the scipy project and try your luck there :) since there is nothing we can do on our side.

bexcite commented 1 year ago

Eh, scipy, transform, Rotation ... and it drags the whole of OpenBLAS with it. I know this pain, we are building on M1 for arm too. Anyway, we can consider it closed, and fingers crossed that scipy will add better native M1 arm support. Thanks!

bexcite commented 1 year ago

UPDATE: Some progress, but there still seems to be a problem with the kiss_icp compiled libs too.

The SciPy problem was fixed by updating macOS from Big Sur 11.6 to 11.7 and then to Monterey 12.6, because scipy 1.9.3 has a nice macosx_12_0_arm64 wheel available on PyPI. So far so good, until kiss_icp starts its imports.

Here is the relevant minimal snippet with the error:
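
The macOS upgrade matters because pip only accepts a `macosx_X_Y_arch` wheel when the host runs macOS X.Y or newer on a matching architecture. Below is a minimal sketch of that rule. The helper names are hypothetical, and the real logic (in the `packaging` library that pip uses) also covers cases such as universal2 wheels that are ignored here.

```python
import re


def parse_macos_tag(tag):
    """Split a tag like 'macosx_12_0_arm64' into ((12, 0), 'arm64')."""
    m = re.fullmatch(r"macosx_(\d+)_(\d+)_(\w+)", tag)
    if m is None:
        raise ValueError(f"not a macOS platform tag: {tag!r}")
    return (int(m.group(1)), int(m.group(2))), m.group(3)


def wheel_fits(tag, host_version, host_arch):
    """Simplified check: same architecture and host OS at least the tag's version."""
    min_version, arch = parse_macos_tag(tag)
    return arch == host_arch and host_version >= min_version


# Big Sur 11.6 is older than the 12.0 the wheel asks for, Monterey 12.6 is not.
print(wheel_fits("macosx_12_0_arm64", (11, 6), "arm64"))  # False
print(wheel_fits("macosx_12_0_arm64", (12, 6), "arm64"))  # True
```

When no wheel fits, pip falls back to the source distribution, which is what triggered the Meson/OpenBLAS build in the first log.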

python
Python 3.10.8 (main, Oct 13 2022, 09:48:40) [Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import kiss_icp
>>> import kiss_icp.config
>>> import kiss_icp.odometry
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/swt/.venv-kiss-310/lib/python3.10/site-packages/kiss_icp/odometry.py", line 4, in <module>
    from kiss_icp.deskew import MotionCompensator, StubCompensator
  File "/swt/.venv-kiss-310/lib/python3.10/site-packages/kiss_icp/deskew.py", line 4, in <module>
    from kiss_icp.pybind import kiss_icp_pybind
ImportError: dlopen(//swt/.venv-kiss-310/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace (__ZN3tbb4task13note_affinityEt)

I tried python3.9, with the same results. Also checked that I've got the latest python versions available from brew.

MacBook Pro (13-inch, M1, 2020)
Apple M1

nachovizzo commented 1 year ago

Hello, sorry for the late reply! I was out of the office...

Thanks for the detailed output. I guess you have the Python side of things working. Now the problem is on the C++ side. Once we release the code, you will be able to compile from source, and this concern might be gone.

Unfortunately, I do not have an M1 Mac myself; I can only cross-compile and hope for the best. The error is in TBB (the multithreading backend), at tbb::task::note_affinity. So, again, I guess the problem might not be something I can solve myself, but rather something for the Intel TBB team.
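
For readers wondering how `__ZN3tbb4task13note_affinityEt` maps to `tbb::task::note_affinity`: the symbol uses Itanium C++ ABI mangling, where a nested name is written as `_ZN`, then length-prefixed components, then `E` (macOS adds one more leading underscore). A minimal sketch of decoding just that prefix follows. The function name is hypothetical, and it deliberately ignores argument types, qualifiers, and everything else a real demangler such as `c++filt` handles.

```python
def nested_name(mangled):
    """Decode the leading '_ZN<len><name>...E' part of a mangled symbol.

    Only plain length-prefixed components are handled; whatever follows
    the closing 'E' (the argument types) is ignored.
    """
    s = mangled.lstrip("_")
    if not s.startswith("ZN"):
        raise ValueError(f"not a nested mangled name: {mangled!r}")
    i, parts = 2, []
    while i < len(s) and s[i].isdigit():
        j = i
        while j < len(s) and s[j].isdigit():
            j += 1
        length = int(s[i:j])          # decimal length of the next component
        parts.append(s[j:j + length])
        i = j + length
    return "::".join(parts)


print(nested_name("__ZN3tbb4task13note_affinityEt"))  # tbb::task::note_affinity
```

The trailing `t` after `E` is the encoding of `unsigned short`, the single argument of the function.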

In the meantime, can you confirm if you can install tbb by yourself? Additionally, could you provide the output of ldd //swt/.venv-kiss-310/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so?

bexcite commented 1 year ago

Oh, sorry, I missed your message. Here are the outputs (otool -L does on macOS what ldd does on Linux):

otool -L .venv-kiss-310/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so
.venv-kiss-310/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so:
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

and the symbols containing tbb::

nm -aC .venv-kiss-310/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so | grep "tbb::"
                 U tbb::task_group_context::init()
                 U tbb::task_group_context::~task_group_context()
                 U tbb::task::note_affinity(unsigned short)
                 U tbb::internal::get_initial_auto_partitioner_divisor()
                 U tbb::task_group_context::is_group_execution_cancelled() const
                 U tbb::internal::allocate_child_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_continuation_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_root_with_context_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_root_with_context_proxy::allocate(unsigned long) const
                 U typeinfo for tbb::task

and they are all undefined, so the library was probably linked in some strange way
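
In `nm` output the letter `U` marks a symbol that the binary references but does not define, so another library has to supply it at load time. A minimal sketch of filtering such entries out of a captured `nm -aC` dump is shown below. The function name is hypothetical, and the `T` line in the sample is made up for illustration; the two `U` lines come from the output above.

```python
def undefined_symbols(nm_output, needle="tbb::"):
    """Return the symbols nm marks as 'U' (undefined) that contain `needle`."""
    found = []
    for line in nm_output.splitlines():
        fields = line.split(maxsplit=1)
        # Undefined entries have no address column: just 'U' and the name.
        if len(fields) == 2 and fields[0] == "U" and needle in fields[1]:
            found.append(fields[1])
    return found


sample = """\
                 U tbb::task::note_affinity(unsigned short)
                 U typeinfo for tbb::task
0000000000004a10 T _illustrative_defined_symbol
"""
for name in undefined_symbols(sample):
    print(name)
```

If this list is non-empty and `otool -L` shows no tbb library, nothing is available to resolve those symbols when the module is loaded, which matches the `dlopen` error.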

nachovizzo commented 1 year ago

Ok, so I can't test this, but you could give it a try.

If I build on a macOS machine (not arm), after installing tbb locally (brew install tbb), I get a proper Python wheel which I can use, and the shared library is dynamically linked to the tbb library:

$ otool -L kiss_icp_pybind.cpython-310-darwin.so:
        /usr/local/opt/tbb/lib/libtbb.12.dylib (compatibility version 12.0.0, current version 12.7.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

On the other hand, when I cross-compile for arm64 on the same platform (which, of course, I can't test), I also get a wheel. In contrast to all the other wheels, this one does NOT static-link tbb, but I still can't see the shared tbb library in the .so file:

$ otool -L kiss_icp-0.0.7-cp310-cp310-macosx_11_0_arm64/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so
kiss_icp-0.0.7-cp310-cp310-macosx_11_0_arm64/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so:
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

Apparently static linking is not working, and neither is dynamic linking. It's a bit strange, but I can't do much without a Mac, and I guess you can't do much without the full source code (coming soon, hopefully!).
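
The comparison between the two `otool -L` listings can be automated. The following is a minimal sketch with hypothetical helper names; the sample strings are the two listings quoted in this comment, with the arm64 header line shortened to the file name.

```python
def linked_libraries(otool_output):
    """Return the library paths listed by `otool -L` (the indented lines)."""
    libs = []
    for line in otool_output.splitlines():
        if line[:1].isspace() and line.strip():
            libs.append(line.split(" (compatibility")[0].strip())
    return libs


def links_tbb(otool_output):
    """True if any linked library path mentions libtbb."""
    return any("libtbb" in lib for lib in linked_libraries(otool_output))


x86_64_build = """\
kiss_icp_pybind.cpython-310-darwin.so:
        /usr/local/opt/tbb/lib/libtbb.12.dylib (compatibility version 12.0.0, current version 12.7.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
"""
arm64_build = """\
kiss_icp_pybind.cpython-310-darwin.so:
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
"""
print(links_tbb(x86_64_build))  # True
print(links_tbb(arm64_build))   # False
```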

In any case, you can download these wheels and give it a try

"dynamic" linked tbb, arm64:

kiss_icp-0.0.7-cp310-cp310-macosx_11_0_arm64.zip

static linked tbb, arm64:

kiss_icp-0.0.7-cp310-cp310-macosx_11_0_arm64_static.zip

Please let me know if one of those works, so I can make a new release :)

bexcite commented 1 year ago

Here are the tests for the static linked tbb, arm64:

otool -L .venv-310-2/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so
.venv-310-2/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so:
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

nm -aC .venv-310-2/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so | grep tbb::
                 U tbb::task_group_context::init()
                 U tbb::task_group_context::~task_group_context()
                 U tbb::task::note_affinity(unsigned short)
                 U tbb::internal::get_initial_auto_partitioner_divisor()
                 U tbb::task_group_context::is_group_execution_cancelled() const
                 U tbb::internal::allocate_child_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_continuation_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_root_with_context_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_root_with_context_proxy::allocate(unsigned long) const
                 U typeinfo for tbb::task

So it's the same as before, "static linking is not working for tbb :(", and installing tbb with brew install tbb doesn't help in this case (no library is searched for, which makes sense, but I still tried just in case).

python -c "import kiss_icp.odometry"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/pavlo.bashmakov/code/swt/.venv-310-2/lib/python3.10/site-packages/kiss_icp/odometry.py", line 27, in <module>
    from kiss_icp.deskew import MotionCompensator, StubCompensator
  File "/Users/pavlo.bashmakov/code/swt/.venv-310-2/lib/python3.10/site-packages/kiss_icp/deskew.py", line 27, in <module>
    from kiss_icp.pybind import kiss_icp_pybind
ImportError: dlopen(/Users/pavlo.bashmakov/code/swt/.venv-310-2/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace (__ZN3tbb4task13note_affinityEt)

And the dynamically linked one, which is almost the same: no lib visible in otool -L, but a slightly different error (for a different class, which suggests that the compilation was indeed somewhat different):

otool -L .venv-310-3/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so
.venv-310-3/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so:
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.23.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

and

nm -aC .venv-310-3/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so | grep tbb::
                 U tbb::detail::r1::deallocate(tbb::detail::d1::small_object_pool&, void*, unsigned long, tbb::detail::d1::execution_data const&)
                 U tbb::detail::r1::initialize(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::execution_slot(tbb::detail::d1::execution_data const*)
                 U tbb::detail::r1::notify_waiters(unsigned long)
                 U tbb::detail::r1::max_concurrency(tbb::detail::d1::task_arena_base const*)
                 U tbb::detail::r1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::is_group_execution_cancelled(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::spawn(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::destroy(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::allocate(tbb::detail::d1::small_object_pool*&, unsigned long)
                 U tbb::detail::r1::allocate(tbb::detail::d1::small_object_pool*&, unsigned long, tbb::detail::d1::execution_data const&)

It's super interesting that the undefined symbol tables of the static and dynamic builds do not intersect. That leads to a crazy idea: maybe you could try to compile static + dynamic together (I don't know exactly how to do it, but sometimes libs have really convoluted ways). Here are the outputs together, static (top) vs dynamic (bottom):

nm -aC .venv-310-2/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so | grep tbb::
                 U tbb::task_group_context::init()
                 U tbb::task_group_context::~task_group_context()
                 U tbb::task::note_affinity(unsigned short)
                 U tbb::internal::get_initial_auto_partitioner_divisor()
                 U tbb::task_group_context::is_group_execution_cancelled() const
                 U tbb::internal::allocate_child_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_continuation_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const
                 U tbb::internal::allocate_root_with_context_proxy::free(tbb::task&) const
                 U tbb::internal::allocate_root_with_context_proxy::allocate(unsigned long) const
                 U typeinfo for tbb::task

vs

nm -aC .venv-310-3/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so | grep tbb::
                 U tbb::detail::r1::deallocate(tbb::detail::d1::small_object_pool&, void*, unsigned long, tbb::detail::d1::execution_data const&)
                 U tbb::detail::r1::initialize(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::execution_slot(tbb::detail::d1::execution_data const*)
                 U tbb::detail::r1::notify_waiters(unsigned long)
                 U tbb::detail::r1::max_concurrency(tbb::detail::d1::task_arena_base const*)
                 U tbb::detail::r1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::is_group_execution_cancelled(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::spawn(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::destroy(tbb::detail::d1::task_group_context&)
                 U tbb::detail::r1::allocate(tbb::detail::d1::small_object_pool*&, unsigned long)
                 U tbb::detail::r1::allocate(tbb::detail::d1::small_object_pool*&, unsigned long, tbb::detail::d1::execution_data const&)
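
One possible reading of the disjoint symbol sets, not confirmed anywhere in this thread: the two wheels may have been compiled against different TBB generations. Classic TBB (2020 and earlier) exposes the `tbb::task` class and `tbb::internal::` helpers, while oneTBB (2021 and later) routes its runtime entry points through `tbb::detail::r1::`. A minimal sketch of that classification, with a hypothetical function name and a few symbols copied from the two listings:

```python
def tbb_generation(symbol):
    """Guess which TBB generation a demangled symbol name belongs to."""
    if "tbb::detail::" in symbol:
        return "oneTBB"
    if "tbb::internal::" in symbol or "tbb::task" in symbol:
        return "classic TBB"
    return "unknown"


static_wheel = [
    "tbb::task::note_affinity(unsigned short)",
    "tbb::internal::get_initial_auto_partitioner_divisor()",
    "tbb::task_group_context::init()",
]
dynamic_wheel = [
    "tbb::detail::r1::initialize(tbb::detail::d1::task_group_context&)",
    "tbb::detail::r1::notify_waiters(unsigned long)",
]
print(sorted({tbb_generation(s) for s in static_wheel}))   # ['classic TBB']
print(sorted({tbb_generation(s) for s in dynamic_wheel}))  # ['oneTBB']
```

If that reading holds, mixing the two would not help, because each wheel needs a complete library of its own generation.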

Thanks for trying to solve it!

nachovizzo commented 1 year ago

@bjajoh hopefully in one month or so. The paper is still under revision and we won't release the implementation until it gets through. Thanks for understanding :)

nachovizzo commented 1 year ago

@bjajoh we were not planning to, but if you need it, please let me know:

This way I can bake a wheel package so you can try it right away

nachovizzo commented 1 year ago

@bjajoh @bexcite Sorry for the delay. I just uploaded a new version of the package in which I have drastically reduced the dependencies of the Python API. For now, as long as you can install numpy and pyyaml on your target device, the rest should be possible, since those are the only binary-code Python packages that we have as dependencies.

These are the new wheels:

I will close this issue for now, since I hope the other problems will be solved once the code is open source.

nachovizzo commented 1 year ago

@bexcite I just open-sourced the implementation v0.0.13

Could you somehow check now that with the entire source code you can build and run on your end? Thanks!

nachovizzo commented 1 year ago

I'm still facing this error, probably interesting to investigate in #113 and stop using the static tbb.

nachovizzo commented 1 year ago

#132 did not fix the issue. With the PyPI packages I get the following import error:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kiss_icp/pybind/kiss_icp_pybind.cpython-310-darwin.so,
0x0002): symbol not found in flat namespace '__ZN3tbb6detail2r110deallocateERNS0_2d117small_object_poolEPvmRKNS2_14execution_dataE'

I hope that #127 would fix this, since I fail to understand what the f**k is going on with mac!

nachovizzo commented 1 year ago

@bexcite as of release 0.2.9 now the python package is working, at least on my M2 mac.

Just pip install -U kiss-icp and no more undefined reference for me ;) Honestly, I'm not 100% sure what I've changed. The tbb library is now being populated with FetchContent_Declare instead of ExternalProject, which looks to be the reason: CMake might be changing something in the binary artifact of the tbb library.

If you want, it would be nice if you could confirm it is also working on your side!

EDIT: it turns out that everything was related to the Apple -mmacosx-version-min compiler flag. Apparently the hand-baked tbb external project I had before was not properly populating the CMake cache. More reasons to support moving everything to FetchContent_Declare and let cmake run #129 + #143

Without this flag set, the pybind build was already complaining in the CI:

2023-04-11T15:31:00.8507880Z ld: warning: object file (/Users/runner/work/kiss-icp/kiss-icp/python/_skbuild/macosx-11.0-x86_64-3.11/cmake-build/kiss_icp/tbb/lib/libtbb.a(small_object_pool.cpp.o)) was built for newer macOS version (11.7) than being linked (11.0)

It looks like the TBB team already struggled with this and provides support for it in their build system here:
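
The warning says the objects in libtbb.a were compiled for macOS 11.7 while the extension was being linked for 11.0, i.e. the two builds did not share a deployment target. A minimal sketch of spotting such lines in a CI log is given below. The helper names are hypothetical, and the regular expression is written only against the wording of this one warning.

```python
import re

# Matches the tail of the ld warning quoted above.
MISMATCH = re.compile(
    r"was built for newer macOS version \((?P<built>[\d.]+)\) "
    r"than being linked \((?P<linked>[\d.]+)\)"
)


def as_tuple(version):
    """Turn '11.7' into (11, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def version_mismatch(log_line):
    """Return (built_for, linked_for) if the line is a mismatch warning, else None."""
    m = MISMATCH.search(log_line)
    if m is None:
        return None
    return as_tuple(m["built"]), as_tuple(m["linked"])


line = (
    "ld: warning: object file (/Users/runner/work/kiss-icp/kiss-icp/python/"
    "_skbuild/macosx-11.0-x86_64-3.11/cmake-build/kiss_icp/tbb/lib/"
    "libtbb.a(small_object_pool.cpp.o)) was built for newer macOS version "
    "(11.7) than being linked (11.0)"
)
print(version_mismatch(line))  # ((11, 7), (11, 0))
```

Failing the CI job when this returns a value would have surfaced the problem before the wheels reached PyPI.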

# Enable support of minimum supported macOS version flag
if (APPLE)
    if (NOT CMAKE_CXX_OSX_DEPLOYMENT_TARGET_FLAG)
        set(CMAKE_CXX_OSX_DEPLOYMENT_TARGET_FLAG "-mmacosx-version-min=" CACHE STRING "Minimum macOS version flag")
    endif()
    if (NOT CMAKE_C_OSX_DEPLOYMENT_TARGET_FLAG)
        set(CMAKE_C_OSX_DEPLOYMENT_TARGET_FLAG "-mmacosx-version-min=" CACHE STRING "Minimum macOS version flag")
    endif()
endif()

bexcite commented 1 year ago

@nachovizzo I can confirm, 0.2.9 from PyPI is working fine on my M1 machine!!!!!

Many thanks for figuring this out and writing up the analysis. Kudos!!