beagle-dev / beagle-lib

general purpose library for evaluating the likelihood of sequence evolution on trees
MIT License

Allow building without `-march=native`. Consider supporting builds for systems without SSE4.1. #189

Open cho-m opened 1 year ago

cho-m commented 1 year ago

Currently, BEAGLE always builds with `-march=native` when the compiler supports it:

https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/CMakeLists.txt#L14

https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/CMakeLists.txt#L119-L123

This makes it harder to distribute binaries, because users' systems must support at least the SIMD instruction sets of the build host.

In Homebrew, this may be the reason why we are occasionally hitting segfaults when testing packages that use BEAGLE (e.g. BEAST and MrBayes).
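One common way to keep a distributed binary from crashing on older CPUs is to check for the instruction set at run time and only then select the optimized code path. The following is a minimal sketch of that idea, not BEAGLE's actual plugin-selection logic. The names `cpuHasSse41`, `chooseKernel`, and `Kernel` are hypothetical; `__builtin_cpu_init` and `__builtin_cpu_supports` are GCC/Clang builtins for x86 targets, and other compilers or architectures fall back to the generic path here.

```cpp
#include <cassert>

// Hypothetical names for illustration; none of this is BEAGLE code.
enum class Kernel { Generic, Sse41 };

// Returns true if the running CPU reports SSE4.1 support.
static bool cpuHasSse41() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // safe to call more than once
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return false;  // unknown compiler or non-x86 target: assume no SSE4.1
#endif
}

// Picks the code path to use on the machine the binary is running on.
static Kernel chooseKernel() {
    return cpuHasSse41() ? Kernel::Sse41 : Kernel::Generic;
}

// Small usage check: the choice must agree with the feature query.
static void checkChooseKernel() {
    const Kernel expected = cpuHasSse41() ? Kernel::Sse41 : Kernel::Generic;
    assert(chooseKernel() == expected);
}
```

For this to help, the SSE4.1 code itself would have to be compiled separately from the rest of the library (for example in its own translation unit built with `-msse4.1`), so that the baseline code contains no SSE4.1 instructions.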


When experimenting with disabling BEAGLE_OPTIMIZE_FOR_NATIVE_ARCH, I saw the compilation would fail on _mm_dp_pd: https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.hpp#L62

This appears to be an SSE4.1 intrinsic, so once I removed -march=native the build no longer had the -msse4.1 flag that the intrinsic requires.

In older BEAGLE releases, there was support for disabling SSE2: https://github.com/beagle-dev/beagle-lib/blob/35bbe781fd0faa265d94926264000118c234c772/configure.ac#L315

It would be nice to allow compiling without SSE4.1.
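As a rough illustration of what an SSE4.1-free build could look like, the sketch below computes a two-element dot product with `_mm_dp_pd` only when the compiler defines `__SSE4_1__`, and otherwise falls back to SSE2 intrinsics or plain scalar code. The helper `dot2` is hypothetical and is not taken from `BeagleCPU4StateSSEImpl.hpp`; it only shows the preprocessor-guard pattern, assuming GCC or Clang, which define `__SSE4_1__` and `__SSE2__` according to the target flags.

```cpp
#include <cassert>

#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1, provides _mm_dp_pd
#elif defined(__SSE2__)
#include <emmintrin.h>  // SSE2 only
#endif

// Hypothetical helper, not part of BEAGLE: returns a[0]*b[0] + a[1]*b[1].
static inline double dot2(const double* a, const double* b) {
#if defined(__SSE4_1__)
    // Mask 0x31: multiply both lanes (bits 4-5), write the sum to lane 0 (bit 0).
    return _mm_cvtsd_f64(_mm_dp_pd(_mm_loadu_pd(a), _mm_loadu_pd(b), 0x31));
#elif defined(__SSE2__)
    __m128d prod = _mm_mul_pd(_mm_loadu_pd(a), _mm_loadu_pd(b));
    __m128d high = _mm_unpackhi_pd(prod, prod);    // copy lane 1 into lane 0
    return _mm_cvtsd_f64(_mm_add_sd(prod, high));  // lane 0 + lane 1
#else
    return a[0] * b[0] + a[1] * b[1];              // no SSE available
#endif
}

// Small usage check with exactly representable values.
static void checkDot2() {
    const double a[2] = {1.5, 2.0};
    const double b[2] = {4.0, 0.5};
    assert(dot2(a, b) == 7.0);
}
```

With a guard like this, the same source compiles for a Core2 baseline (no `-msse4.1`) and still uses the SSE4.1 instruction when the target allows it.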


Specifically, in Homebrew, we want to build Linux binaries for Core2 (up to SSSE3) and macOS binaries for Nehalem (up to SSE4.2).

cho-m commented 1 year ago

Snippet of cmake --build output where -march=native is being passed to every compilation:

[  5%] Building CXX object libhmsbeagle/CMakeFiles/hmsbeagle.dir/beagle.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CMakeFiles/hmsbeagle.dir/beagle.cpp.o -MF CMakeFiles/hmsbeagle.dir/beagle.cpp.o.d -o CMakeFiles/hmsbeagle.dir/beagle.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/beagle.cpp
[ 11%] Building CXX object libhmsbeagle/CPU/CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle/CPU && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_cpu_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CPU/CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o -MF CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o.d -o CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPUPlugin.cpp
[ 17%] Building CXX object libhmsbeagle/CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o -MF CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o.d -o CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/benchmark/BeagleBenchmark.cpp

Error when building without -march=native:

  In file included from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/SSEDefinitions.h:66,
                   from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.hpp:34,
                   from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.h:338,
                   from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPUSSEPlugin.cpp:9:
  /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h: In member function ‘void beagle::cpu::BeagleCPU4StateSSEImpl<double, T_PAD, P_PAD>::calcCrossProductsPartials(const double*, const double*, const double*, const double*, double, double*, double*) [with int T_PAD = 2; int P_PAD = 0]’:
  /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:249:1: error: inlining failed in call to ‘always_inline’ ‘__m128d _mm_dp_pd(__m128d, __m128d, int)’: target specific option mismatch
    249 | _mm_dp_pd (__m128d __X, __m128d __Y, const int __M)
        | ^~~~~~~~~

I am looking at workarounds for Homebrew in https://github.com/Homebrew/homebrew-core/pull/117835

msuchard commented 1 year ago

Commit 2883ced7533ece1d30ec404939a6cb097d54daba adds an option to disable the SSE build (-DBUILD_SSE=OFF), although this seems like a really bad idea: SSE4.1 was released more than 10 years ago.