marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

OSx installation fails with clang error #325

Open sshleifer opened 4 years ago

sshleifer commented 4 years ago

I have been struggling for a few hours to install on OSX and was wondering whether you guys have any tips.

CMake seems to finish successfully, but then make -j4 breaks.

cmake .. -DCOMPILE_CUDA=off

cmake output

-- Project name: marian
-- Project version: v1.9.0+3c7a88f4
CMake Warning at CMakeLists.txt:55 (message):
  CMAKE_BUILD_TYPE not set; setting to Release

-- Checking support for CPU intrinsics
-- Could not find hardware support for AVX2 on this machine.
-- Could not find hardware support for AVX512 on this machine.
-- SSE2 support found
-- SSE3 support found
-- SSE4.1 support found
-- AVX support found
CMake Warning at CMakeLists.txt:293 (message):
  COMPILE_CUDA=off : Building only CPU version

-- Found Tcmalloc: /usr/local/lib/libtcmalloc_minimal.dylib
-- Found Doxygen: /usr/local/bin/doxygen (found version "1.8.17") found components: doxygen missing components: dot
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/shleifer/marian/build

make is all green until 94% then fails with

 [94%] Linking CXX executable ../marian-conv
ld: unknown option: --start-group
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [marian-vocab] Error 1
make[1]: *** [src/CMakeFiles/marian_vocab.dir/all] Error 2

(make -j4 fails similarly) Has anyone seen anything like this?

Environment:

Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.0.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

emjotde commented 4 years ago

@XapaJIaMnu Any hints?

XapaJIaMnu commented 4 years ago

@sshleifer What version of macOS are you using? I have tested the compilation with:

xapajiamnu@dhcp-91-025 marian-dev % sw_vers -productVersion
10.15.3
xapajiamnu@dhcp-91-025 marian-dev % clang++ --version
Apple clang version 11.0.3 (clang-1103.0.32.29)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

That seems to be slightly newer than yours. My guess is that your (very slightly outdated) version of LLVM doesn't support the --start-group switch. Is upgrading possible for you?

For a short-term solution, do the following:

1) Compile as normal (cmake -DCOMPILE_CUDA=OFF and make -j4).
2) Once compilation fails, run make VERBOSE=1. This will output the exact compile commands on the terminal.

Now copy those to a text editor and remove the --start-group flag (and its matching --end-group flag) from each of them. Then paste them into the terminal, which will allow you to finish the compilation manually.
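The manual editing described above can be scripted. This is only a sketch: it assumes the verbose output was saved to a file (build.log is a made-up name) and that the flags appear as -Wl,--start-group and -Wl,--end-group. The function name strip_group_flags is hypothetical. It prints the affected commands with both flags removed so they can be reviewed before being pasted into the terminal.

```shell
# Hypothetical helper: print every command in a saved `make VERBOSE=1` log
# that uses --start-group, with the two group flags removed.
strip_group_flags() {
    # $1: path to the saved log (assumed name, e.g. build.log)
    grep -e '-Wl,--start-group' "$1" |
        sed -e 's/ -Wl,--start-group//g' -e 's/ -Wl,--end-group//g'
}

# Usage, from the build directory after the failed build:
#   make VERBOSE=1 > build.log 2>&1
#   strip_group_flags build.log
```

The printed link commands are likely to use relative paths, so they probably need to be run from the directory that make changed into before linking.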

If this fixes the problem, great! We would be able to add this to some exceptions, but it is annoying when we can't reproduce this on our side.

sshleifer commented 4 years ago

I upgraded to 10.15.4 and it didn't work: identical error message. First offending line:

[ 94%] Linking CXX executable ../marian-conv
cd /Users/shleifer/marian/build/src && /usr/local/Cellar/cmake/3.16.2/bin/cmake -E cmake_link_script CMakeFiles/marian_conv.dir/link.txt --verbose=1
/Library/Developer/CommandLineTools/usr/bin/c++  -std=c++11 -pthread  -fPIC -Wno-unused-result -Wno-unknown-warning-option -Wno-unknown-cuda-version -march=native  -msse2 -msse3 -msse4.1 -mavx -DMKL_ILP64 -m64 -Ofast -m64 -funroll-loops -ffinite-math-only -g  -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names  CMakeFiles/marian_conv.dir/command/marian_conv.cpp.o  -o ../marian-conv  ../libmarian.a /usr/local/lib/libtcmalloc_minimal.dylib -Wl,--start-group /opt/intel/mkl/lib/libmkl_intel_ilp64.a /opt/intel/mkl/lib/libmkl_sequential.a /opt/intel/mkl/lib/libmkl_core.a -Wl,--end-group -liconv /usr/local/lib/libtcmalloc_minimal.dylib -Wl,--start-group /opt/intel/mkl/lib/libmkl_intel_ilp64.a /opt/intel/mkl/lib/libmkl_sequential.a /opt/intel/mkl/lib/libmkl_core.a -Wl,--end-group -liconv
ld: unknown option: --start-group
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [marian-conv] Error 1
make[1]: *** [src/CMakeFiles/marian_conv.dir/all] Error 2
make: *** [all] Error 2

I have a successful build on a Linux machine, so I probably will not invest in removing --start-group, but thanks for the idea and the help!

XapaJIaMnu commented 4 years ago

Could you double-check your clang++ version? Do you by any chance have an ld that is a different version from clang++?
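One way to gather that information is to print where each tool resolves on the PATH together with its version line. This is a minimal sketch, and the function name report_tool is made up for illustration.

```shell
# Print where a tool resolves on PATH and the first line of its version output.
report_tool() {
    # $1: tool name, $2: flag that makes the tool print its version
    if tool_path=$(command -v "$1"); then
        echo "$1: $tool_path"
        # Some linkers print their version on stderr, so merge the streams.
        "$1" "$2" 2>&1 | head -n 1
    else
        echo "$1: not found"
    fi
}

report_tool clang++ --version
report_tool ld -v
```

The two InstalledDir lines earlier in the thread differ (Command Line Tools in one case, Xcode.app in the other), so it may also be worth checking which developer directory is active with xcode-select -p.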

rakeshchada commented 4 years ago

@sshleifer: I faced the same error on macOS (version 10.14.6). After some insane googling, I finally managed to fix it by removing the "Wl,--start-group,--end-group" flags in each module's "link.txt" file. To be clear: you keep the values after the flags and remove only the flags themselves.
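The link.txt edit could be applied to all modules at once with something like the sketch below. It assumes the flags appear in the form -Wl,--start-group and -Wl,--end-group, as in the verbose output earlier in the thread. The function name patch_link_files is hypothetical.

```shell
# Hypothetical helper: remove the group flags from every link.txt under a
# CMake build tree, keeping a .bak copy of each file that gets changed.
patch_link_files() {
    # $1: build directory (assumed layout: .../CMakeFiles/<target>.dir/link.txt)
    find "$1" -name link.txt -type f | while IFS= read -r f; do
        if grep -q -e '--start-group' "$f"; then
            cp "$f" "$f.bak"
            # Read from the backup and rewrite the original; this avoids
            # `sed -i`, whose syntax differs between BSD (macOS) and GNU sed.
            sed -e 's/ -Wl,--start-group//g' -e 's/ -Wl,--end-group//g' "$f.bak" > "$f"
        fi
    done
}

# Usage, from the build directory after cmake has generated the files:
#   patch_link_files . && make -j4
```

Only the flags are removed and the library paths between them stay in place. The edits are likely to be lost whenever CMake regenerates the build files, so this is a workaround rather than a fix.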

XapaJIaMnu commented 4 years ago

@ugermann can we add those to ignore list?

ugermann commented 4 years ago

git blame cmake/FindMKL.cmake

look for --start-group

ugermann commented 4 years ago

It's a mess. Apparently Intel does not support CMake, so you have to go to a web site and select your OS, compiler, Intel MKL version etc. from dropdown lists to get a custom link line: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor

That appears to be Intel's official approach to promoting software portability ... :man_facepalming:

There's a zoo of FindMKL.cmake files out there, so there may be something better. How important is it to have MKL on Mac? There is a USE_MKL option in CMakeLists.txt. And while clang apparently supports --{start|end}-group, it does not on Mac. My suggestion would be to configure on Mac with

-DUSE_MKL=off

for the time being. Mac users will be so busy adoring their machine that they won't notice the speed difference anyway :grin:.
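Combined with the CPU-only flag from the original report, the suggested configuration might look like the sketch below. Both options are the ones named in this thread. The wrapper function configure_without_mkl is made up for illustration. That disabling MKL also removes the --start-group link line is an assumption based on where the flag shows up in the verbose output.

```shell
# Configure a CPU-only build that also skips MKL. Assumption: without MKL,
# the generated link line no longer contains the --start-group flags.
configure_without_mkl() {
    # $1: path to the marian source tree, e.g. `..` when run from build/
    cmake "$1" -DCOMPILE_CUDA=off -DUSE_MKL=off
}

# Usage, from an empty build directory:
#   configure_without_mkl .. && make -j4
```

Starting from an empty build directory is the safer choice here, since values cached by an earlier cmake run may otherwise carry over.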

ugermann commented 4 years ago

We could try using this instead: https://github.com/pytorch/pytorch/blob/master/cmake/Modules/FindMKL.cmake

emjotde commented 4 years ago

Might be the third or fourth iteration :) Let's just be careful to not mess up everyone else's MKL finding in the process.

ugermann commented 4 years ago

That's exactly why I'd prefer to stay out of this ...

kadir-gunel commented 3 years ago

Hello, is it possible to build Marian on Mac with GPU support? (I didn't want to create a separate issue for that.) I know that the latest architectures are not supported, but older ones, like Pascal, still work on macOS High Sierra.

XapaJIaMnu commented 3 years ago

@kadir-gunel, since CUDA has not been supported on Mac beyond High Sierra, and we don't have older test Macs with GPU drivers, we haven't tried doing a Mac build with CUDA. If you have a GPU Mac with High Sierra lying around, you can try compiling it with the latest supported CUDA and hope.

kadir-gunel commented 3 years ago

@XapaJIaMnu thank you. Yes, this is exactly my case: High Sierra with a 1080 Ti. I will take a shot.