Open cho-m opened 1 year ago
Snippet of cmake --build output where -march=native is being passed to every compilation:
[ 5%] Building CXX object libhmsbeagle/CMakeFiles/hmsbeagle.dir/beagle.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CMakeFiles/hmsbeagle.dir/beagle.cpp.o -MF CMakeFiles/hmsbeagle.dir/beagle.cpp.o.d -o CMakeFiles/hmsbeagle.dir/beagle.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/beagle.cpp
[ 11%] Building CXX object libhmsbeagle/CPU/CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle/CPU && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_cpu_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CPU/CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o -MF CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o.d -o CMakeFiles/hmsbeagle-cpu.dir/BeagleCPUPlugin.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPUPlugin.cpp
[ 17%] Building CXX object libhmsbeagle/CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o
cd /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/build/libhmsbeagle && /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/shims/linux/super/g++-11 -DPACKAGE_BUGREPORT=\"beagle-dev@googlegroups.com\" -DPACKAGE_NAME=\"libhmsbeagle\" -DPACKAGE_STRING="\"libhmsbeagle 4.0.0\"" -DPACKAGE_TARNAME=\"libhmsbeagle\" -DPACKAGE_URL=\"\" -DPACKAGE_VERSION=\"4.0.0\" -DPLUGIN_VERSION=\"40\" -Dhmsbeagle_EXPORTS -I/tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0 -I/home/linuxbrew/.linuxbrew/opt/openjdk@11/libexec/include -std=c++11 -O3 -pthread -march=native -O2 -g -DNDEBUG -fPIC -std=gnu++11 -MD -MT libhmsbeagle/CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o -MF CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o.d -o CMakeFiles/hmsbeagle.dir/benchmark/BeagleBenchmark.cpp.o -c /tmp/beagle-20221202-16102-yndsrr/beagle-lib-4.0.0/libhmsbeagle/benchmark/BeagleBenchmark.cpp
Error when building without -march=native:
In file included from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/SSEDefinitions.h:66,
from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.hpp:34,
from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.h:338,
from /tmp/beagle-20221210-16102-8nngbe/beagle-lib-4.0.0/libhmsbeagle/CPU/BeagleCPUSSEPlugin.cpp:9:
/usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h: In member function ‘void beagle::cpu::BeagleCPU4StateSSEImpl<double, T_PAD, P_PAD>::calcCrossProductsPartials(const double*, const double*, const double*, const double*, double, double*, double*) [with int T_PAD = 2; int P_PAD = 0]’:
/usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:249:1: error: inlining failed in call to ‘always_inline’ ‘__m128d _mm_dp_pd(__m128d, __m128d, int)’: target specific option mismatch
249 | _mm_dp_pd (__m128d __X, __m128d __Y, const int __M)
| ^~~~~~~~~
Looking at workarounds for Homebrew in https://github.com/Homebrew/homebrew-core/pull/117835
Commit 2883ced7533ece1d30ec404939a6cb097d54daba adds an option to disable SSE build (-DBUILD_SSE=OFF), although this seems like a really bad idea. SSE4.1 was released >10 years ago.
Currently, BEAGLE will always build with -march=native when the compiler supports it:
https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/CMakeLists.txt#L14
https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/CMakeLists.txt#L119-L123
This makes it trickier to distribute binaries, as users' CPUs may need to support at least the same SIMD extensions as the build host.
In Homebrew, this may be the reason why we are occasionally hitting segfaults when testing packages that use BEAGLE (e.g. BEAST and MrBayes).
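For illustration, one common way to ship a single binary that is safe on older CPUs is runtime dispatch: compile only the SIMD routine for SSE4.1 and select it after checking the CPU. The sketch below is not BEAGLE code; the names dot_scalar, dot_sse41 and dot are hypothetical, and it assumes GCC or Clang on x86, where __attribute__((target)) and __builtin_cpu_supports are available.

```cpp
#include <cstddef>

// Hypothetical scalar baseline that runs on any CPU.
static double dot_scalar(const double* x, const double* y, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i] * y[i];
    return sum;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>  // SSE4.1 intrinsics, including _mm_dp_pd
#define HAVE_X86_DISPATCH 1

// Only this function is compiled for SSE4.1, so the rest of the
// translation unit can be built without -march=native or -msse4.1.
__attribute__((target("sse4.1")))
static double dot_sse41(const double* x, const double* y, std::size_t n) {
    double sum = 0.0;
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        __m128d a = _mm_loadu_pd(x + i);
        __m128d b = _mm_loadu_pd(y + i);
        // Mask 0x31: multiply both lanes, write the sum to lane 0.
        sum += _mm_cvtsd_f64(_mm_dp_pd(a, b, 0x31));
    }
    for (; i < n; ++i) sum += x[i] * y[i];  // odd tail element
    return sum;
}
#endif

// Hypothetical entry point: use the SSE4.1 path only if the CPU has it.
double dot(const double* x, const double* y, std::size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.1")) return dot_sse41(x, y, n);
#endif
    return dot_scalar(x, y, n);
}
```

With this pattern the baseline flags (e.g. Core2) decide what the binary requires, and the SSE4.1 instructions are only reached on CPUs that report support for them.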
When experimenting with disabling BEAGLE_OPTIMIZE_FOR_NATIVE_ARCH, I saw the compilation would fail on _mm_dp_pd: https://github.com/beagle-dev/beagle-lib/blob/2af91163d48bed8edfbf64af46d5877305546fd1/libhmsbeagle/CPU/BeagleCPU4StateSSEImpl.hpp#L62
This appears to be an SSE4.1 intrinsic. So, when I removed -march=native, the build was missing the -msse4.1 flag needed to use the intrinsic.
In older BEAGLE releases, there was support for disabling SSE2: https://github.com/beagle-dev/beagle-lib/blob/35bbe781fd0faa265d94926264000118c234c772/configure.ac#L315
It would be nice to allow compiling without SSE4.1.
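As a rough sketch of what compiling without SSE4.1 could look like, the two-lane dot product can be computed with SSE2 instructions alone and the SSE4.1 intrinsic used only when the compiler targets it. The helper name dot2 is hypothetical, and the mask 0x31 is an assumption for illustration; it is not taken from the BEAGLE source.

```cpp
#include <emmintrin.h>  // SSE2
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1, only when enabled at compile time
#endif

// Hypothetical helper: returns a0*b0 + a1*b1 for two packed doubles.
static inline double dot2(__m128d a, __m128d b) {
#if defined(__SSE4_1__)
    // Mask 0x31: multiply both lanes, write the sum to lane 0.
    return _mm_cvtsd_f64(_mm_dp_pd(a, b, 0x31));
#else
    // SSE2-only fallback: multiply, then add the high lane to the low lane.
    __m128d prod = _mm_mul_pd(a, b);           // {a0*b0, a1*b1}
    __m128d hi = _mm_unpackhi_pd(prod, prod);  // {a1*b1, a1*b1}
    return _mm_cvtsd_f64(_mm_add_sd(prod, hi));
#endif
}
```

The __SSE4_1__ macro is defined by GCC and Clang when -msse4.1 (or an -march value that implies it) is in effect, so the same source builds for a Core2 target and for a Nehalem target.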
Specifically, in Homebrew, we want to build Linux binaries for Core2 (up to SSSE3) and macOS binaries for Nehalem (up to SSE4.2).