dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

Building on Mac M3: Graphbolt forces OPENMP and CUDA on #7833

Open GMNGeoffrey opened 2 weeks ago

GMNGeoffrey commented 2 weeks ago

🐛 Bug

I followed the instructions to build from source at https://docs.dgl.ai/install/index.html#macos and the build failed because it couldn't find OpenMP, which it searched for even though USE_OPENMP was explicitly set to OFF. The trace leads to the CMakeLists.txt in the graphbolt directory, where USE_OPENMP is ON for some reason (confirmed by sticking a message call in there). Putting set(USE_OPENMP OFF) immediately above this condition avoids the error, but then graphbolt produces more errors from the CUDA headers under third_party/cccl, even though USE_CUDA is OFF by default. Setting -DBUILD_GRAPHBOLT=OFF gets around these issues, but then the build fails on tests/cpp/test_spmat_coo.cc, which #includes omp.h. Setting -DBUILD_CPP_TEST=OFF gets around that issue.

To Reproduce

Steps to reproduce the behavior:

  1. On Mac M3, create a fresh clone of dgl

  2. Run commands to build from the documentation:

    mkdir build
    cd build
    cmake -DUSE_OPENMP=off -DUSE_LIBXSMM=OFF ..
    make -j4 # or -j16 or whatever if you have the cores
  3. Get error about failing to find OpenMP:

[ 36%] Built target tensoradapter_pytorch
CMake Error at /Users/gcmn/src/dgl/.venv/lib/python3.12/site-packages/cmake/data/share/cmake-3.30/Modules/FindPackageHandleStandardArgs.cmake:233 (message):
  Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
Call Stack (most recent call first):
  /Users/gcmn/src/dgl/.venv/lib/python3.12/site-packages/cmake/data/share/cmake-3.30/Modules/FindPackageHandleStandardArgs.cmake:603 (_FPHSA_FAILURE_MESSAGE)
  /Users/gcmn/src/dgl/.venv/lib/python3.12/site-packages/cmake/data/share/cmake-3.30/Modules/FindOpenMP.cmake:600 (find_package_handle_standard_args)
  CMakeLists.txt:91 (find_package)

-- Configuring incomplete, errors occurred!
make[2]: *** [CMakeFiles/graphbolt] Error 1
make[1]: *** [CMakeFiles/graphbolt.dir/all] Error 2

(aside: it's a bit weird that the stack trace here only lists the path as CMakeLists.txt, rather than graphbolt/CMakeLists.txt. It appears that a separate CMake configure step for graphbolt is getting invoked as part of the build step??).

  4. Hack USE_OPENMP to OFF in graphbolt/CMakeLists.txt. I originally just put set(USE_OPENMP OFF) right before the if condition, but changing the default for the option also works. You also need to manually delete the separate build directory that the graphbolt sub-build created:

    rm -rf ./* ../graphbolt/build  # from build/
    cmake -DUSE_OPENMP=off -DUSE_LIBXSMM=OFF ..
    make -j16
  5. Now you get an error about redefinition of strstr:
In file included from /Users/gcmn/src/dgl/graphbolt/src/cache_policy.cc:20:
In file included from /Users/gcmn/src/dgl/graphbolt/src/./cache_policy.h:28:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/atomic:41:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/__atomic/wait/polling.h:26:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/__atomic/types.h:24:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/__atomic/types/base.h:25:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/__atomic/types/common.h:28:
In file included from /Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/detail/libcxx/include/cstring:72:
/Users/gcmn/src/dgl/graphbolt/../third_party/cccl/libcudacxx/include/cuda/std/detail/libcxx/include/string.h:142:75: error: redefinition of 'strstr'
inline _LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_PREFERRED_OVERLOAD char* strstr(char* __s1, const char* __s2)
                                                                          ^
/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/string.h:104:63: note: previous definition is here
inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_PREFERRED_OVERLOAD char* strstr(char* __s1, const char* __s2) {
                                                              ^
10 errors generated.
make[5]: *** [CMakeFiles/graphbolt_pytorch_2.5.1.dir/src/cache_policy.cc.o] Error 1
make[4]: *** [CMakeFiles/graphbolt_pytorch_2.5.1.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [CMakeFiles/graphbolt] Error 2
make[1]: *** [CMakeFiles/graphbolt.dir/all] Error 2
  6. Try again with graphbolt off:

    rm -rf ./* ../graphbolt/build  # from build/
    cmake -DUSE_OPENMP=off -DUSE_LIBXSMM=OFF -DBUILD_GRAPHBOLT=OFF ..
    make -j16
  7. Now you get an error about a C++ test that includes omp.h:
/Users/gcmn/src/dgl/tests/cpp/test_spmat_coo.cc:4:10: fatal error: 'omp.h' file not found
#include <omp.h>
         ^~~~~~~
1 error generated.
make[2]: *** [CMakeFiles/runUnitTests.dir/tests/cpp/test_spmat_coo.cc.o] Error 1
  8. Try again without C++ tests:

    rm -rf ./* # from build/
    cmake -DUSE_OPENMP=OFF -DUSE_LIBXSMM=OFF -DBUILD_GRAPHBOLT=OFF -DBUILD_CPP_TEST=OFF ..
    make -j16
  9. Success!

Expected behavior

A build with USE_OPENMP and USE_CUDA set to OFF should not build or look for OpenMP or CUDA and should not build C++ tests that require OpenMP. A build with BUILD_GRAPHBOLT set to OFF should not try to build tests under graphbolt/.
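
One way to check whether the top-level settings actually reach the graphbolt sub-build is to compare the cached option values in each build directory. The following is a minimal sketch, not part of the DGL build. It assumes the sub-build writes its own CMakeCache.txt somewhere under ../graphbolt/build (the exact location is not confirmed here), and cache_value and report_option are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting CMake cache files.
# Cache entries have the form NAME:TYPE=VALUE, e.g. USE_OPENMP:BOOL=OFF.

# Print the cached value of one variable.
# $1 = path to a CMakeCache.txt, $2 = variable name
cache_value() {
  sed -n "s/^$2:[A-Z]*=//p" "$1"
}

# Print the value of one variable for every CMakeCache.txt found
# under the given directories.
# $1 = variable name, remaining arguments = directories to search
report_option() {
  var=$1
  shift
  find "$@" -name CMakeCache.txt 2>/dev/null | while read -r f; do
    printf '%s: %s=%s\n' "$f" "$var" "$(cache_value "$f" "$var")"
  done
}
```

Run from build/, `report_option USE_OPENMP . ../graphbolt/build` would print one line per cache file found. If the values differ between the two directories, that would suggest the option is not being passed through to the separate graphbolt configure step.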

Environment