ridiculousfish / libdivide

Official git repository for libdivide: optimized integer division
http://libdivide.com

SSE2 is an (unnecessary?) build requirement #34

Closed norbertwenzel closed 6 years ago

norbertwenzel commented 6 years ago

I was testing libdivide on ARM where no SSE is available. Nevertheless I was still able to achieve a measurable speedup by using libdivide without any SIMD in my specific case.

I had to make the following changes to get the code to compile:

All tests run fine afterwards.

My question now is whether this SSE dependency is actually unnecessary, and whether there is any interest in a PR that enables libdivide to build on ARM devices (without any SIMD support).

kimwalisch commented 6 years ago

My question now is whether this SSE dependency is actually unnecessary, and whether there is any interest in a PR that enables libdivide to build on ARM devices (without any SIMD support).

Yes, the SSE2 dependency is optional. Even on x86 CPUs, SSE2 is not enabled by default (if you simply include libdivide.h in your program). SSE is already considered legacy, the newest vector instruction set for x86 CPUs being AVX512. I guess there are very few people out there who still use the SSE2 libdivide feature, but it is kept for backwards compatibility.
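To make the opt-in behaviour concrete, here is a minimal sketch of a macro-guarded SIMD path. It is not libdivide's actual code: `sum4` is a hypothetical helper, and only the macro name `LIBDIVIDE_USE_SSE2` is taken from this thread. The point is that a header written this way compiles on ARM as long as the SSE2 include and intrinsics stay behind the macro.

```cpp
#include <cstdint>

// Pull in the SSE2 intrinsics header only when the user opted in.
// LIBDIVIDE_USE_SSE2 is the macro named in this thread; everything else
// below is illustrative and not copied from libdivide.h.
#if defined(LIBDIVIDE_USE_SSE2)
#include <emmintrin.h>
#endif

// Hypothetical helper: adds four 32-bit integers.
inline std::int32_t sum4(const std::int32_t v[4]) {
#if defined(LIBDIVIDE_USE_SSE2)
    // SSE2 path: load four lanes, then fold them onto lane 0.
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(v));
    x = _mm_add_epi32(x, _mm_srli_si128(x, 8));  // lanes 2,3 onto lanes 0,1
    x = _mm_add_epi32(x, _mm_srli_si128(x, 4));  // lane 1 onto lane 0
    return _mm_cvtsi128_si32(x);
#else
    // Portable scalar fallback, used on ARM or whenever the macro is absent.
    return v[0] + v[1] + v[2] + v[3];
#endif
}
```

Without `-DLIBDIVIDE_USE_SSE2=1` on the command line, only the scalar branch is compiled, so no x86-specific header is ever touched.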

The Makefile will work fine for most people, as most developers have an x86 CPU. The problem I see is that adding functionality to the Makefile to detect whether the CPU is an x86 CPU will probably require some kind of dirty hack. I have already thought about using CMake instead of the current Makefile, since CPU detection should be much simpler there.

What's your suggestion for fixing the build system on ARM?

norbertwenzel commented 6 years ago

What's your suggestion for fixing the build system on ARM?

You are right, I was thinking about some detection inside the Makefile to be minimally invasive. But since you brought it up I'd rather prefer CMake. I'd need that anyway for another project so I'd be willing to give it a try if you don't mind.

ridiculousfish commented 6 years ago

Yes let's allow building on ARM by default. The build system is up to whoever wants to put in the work :)

kimwalisch commented 6 years ago

But since you brought it up I'd rather prefer CMake.

Great choice :-) The good thing about using CMake instead of a plain Makefile is that we can also add support for Microsoft's Visual C++ compiler.

As a starting point you can re-use the CMakeLists.txt I wrote for my libpopcnt project.

Then you actually don't need to check the CPU architecture, instead you can check whether the compiler supports -msse2 on the current CPU architecture. If the compiler supports -msse2 then you add -msse2 -DLIBDIVIDE_USE_SSE2=1 to the compiler flags.

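include(CheckCXXCompilerFlag)
include(CMakePushCheckState)

# Treat warnings as errors during the check so that a compiler which only
# warns about an unknown -msse2 flag is not reported as supporting it.
cmake_push_check_state()
set(CMAKE_REQUIRED_FLAGS -Werror)
check_cxx_compiler_flag(-msse2 msse2)
cmake_pop_check_state()

# Enable the SSE2 code path only if the compiler accepted -msse2.
if(msse2)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse2")
    add_definitions(-DLIBDIVIDE_USE_SSE2=1)
endif()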

This is more portable than checking the CPU architecture because e.g. Microsoft's Visual C++ compiler does not support -msse2 on x86 CPUs. The other option is to google for a CMake module for detecting CPU architecture instruction sets (i.e. SSE, AVX, AVX2, NEON, ...). Personally I would try to keep the build system as simple as possible (only one CMakeLists.txt with no other modules), hence I favour the first option.

norbertwenzel commented 6 years ago

As a starting point you can re-use the CMakeLists.txt I wrote for my libpopcnt project.

@kimwalisch Thanks for the hint, and sorry I did not read it earlier. Detecting SSE2 was the only thing I was still struggling with. I was thinking about CMake's try_compile() with the -msse2 option enabled, but since you already have a working script I'll gladly look into that. Thanks.

kimwalisch commented 6 years ago

But it does not detect SSE2 support for MSVC

We don't need that for now, it is just important that we don't use -msse when compiling using MSVC ;-)
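Should MSVC detection ever be wanted, one option that needs no build-system support is to test compiler-predefined macros in the source. The sketch below is an assumption about how that could look, not something this thread or libdivide does; `HAVE_SSE2_SKETCH` and `have_sse2()` are hypothetical names. GCC and Clang define `__SSE2__` when SSE2 code generation is enabled, while MSVC defines `_M_X64` for x64 targets and sets `_M_IX86_FP` to 2 on 32-bit x86 when `/arch:SSE2` or higher is in effect.

```cpp
// Hypothetical compile-time SSE2 detection based on predefined macros.
//   __SSE2__        : GCC/Clang, set when SSE2 code generation is enabled
//   _M_X64          : MSVC, set when targeting x64 (SSE2 is part of the baseline)
//   _M_IX86_FP >= 2 : MSVC on 32-bit x86 with /arch:SSE2 or higher
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define HAVE_SSE2_SKETCH 1
#else
#define HAVE_SSE2_SKETCH 0
#endif

// Convenience wrapper so ordinary code can branch on the result.
constexpr bool have_sse2() { return HAVE_SSE2_SKETCH != 0; }
```

This keeps `-msse2` out of the MSVC command line entirely, at the cost of moving the decision from CMake into the header.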

kimwalisch commented 6 years ago

Fixed by switching build system to CMake, see CMakeLists.txt#L29.