jeffdaily / parasail

Pairwise Sequence Alignment Library

Cross-compiling for arm64 on macOS (x86_64 host) #90

Closed rainbowgoblin closed 2 years ago

rainbowgoblin commented 2 years ago

I'm trying to build parasail as a universal binary for arm64/x86_64 macOS on an Intel Mac host system. The normal way to do this is to run cmake with -DCMAKE_OSX_ARCHITECTURES="x86_64;arm64". When I do this, cmake runs with no errors, but make produces the following:

/tmp/parasail-2.4.2/src/cpuid.c:31:25: error: invalid output constraint '+b' in asm
  __asm__ ( "cpuid" : "+b" (ebx),

I get the same error when I try to build for arm64 alone on my x86_64 host, so this seems to be a problem with cross-compiling for arm64 on macOS -- this isn't specifically a problem building a universal binary.

Interestingly, I am able to cross-compile x86_64 binaries on an M1 (arm64) Mac with -DCMAKE_OSX_ARCHITECTURES="x86_64;arm64".

You have quite detailed instructions for cross-compiling for arm64 on Linux, but nothing yet for macOS. Is cross-compiling supported on macOS?

I'm using clang compilers from Xcode 13.1 and cmake 3.21.2.

jeffdaily commented 2 years ago

I do not have access to any Mac systems, but we might still be able to work through this. The CMakeLists.txt has a section where it tests for the type of CPU.

https://github.com/jeffdaily/parasail/blob/e236a6feec7bc84fb81923d783acb345e538a281/CMakeLists.txt#L48-L55

If CMake correctly detects that you are cross-compiling for arm, then the source file that is causing us grief (src/cpuid.c) should be replaced with src/cpuid_arm.c instead. My README doesn't explain how to cross-compile with cmake because I've never done it myself.

I don't think I anticipated cross-compiling on an Intel Mac host. I think the cmake snippet above just checks the current host's CPU type unless you have specified a cmake toolchain file. The test was only ever intended to check the host's CPU type.

As a workaround you might be able to hack at the CMakeLists.txt file and just hard-code the IS_ARM_ISA value to TRUE for your arm64 build.

rainbowgoblin commented 2 years ago

Thanks, that was helpful. I've replaced that first if with:

IF( APPLE )
    # Quote the variable: it may be empty or a ;-separated list.
    IF( "${CMAKE_OSX_ARCHITECTURES}" MATCHES "x86_64" )
        SET( IS_ARM_ISA FALSE )
        MESSAGE( STATUS "Check if building for macOS ARM - FALSE" )
    ELSEIF( "${CMAKE_OSX_ARCHITECTURES}" MATCHES "arm64" OR "${CMAKE_SYSTEM_PROCESSOR}" MATCHES "aarch64.*" )
        SET( IS_ARM_ISA TRUE )
        MESSAGE( STATUS "Check if building for macOS ARM - TRUE" )
    ELSE( )
        SET( IS_ARM_ISA FALSE )
        MESSAGE( STATUS "Check if building for macOS ARM (uncertain) - FALSE" )
    ENDIF( )
ELSEIF( ${CMAKE_SYSTEM_PROCESSOR} MATCHES "arm.*" OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64.*" )

This is fine for my purpose: I can build both x86_64 and arm64 versions, then stick them together using the lipo tool.

My original approach, using -DCMAKE_OSX_ARCHITECTURES="x86_64;arm64", is supposed to build universal binaries for both architectures in one shot. When building with clang, the result is that the flags -arch x86_64 -arch arm64 are passed to the compiler together. I don't think this can work if the decision to build for arm64 is made when cmake runs, because a single configure step then has to serve both architectures. The decision has to be made at compile time instead. I think you can make this work by combining your two cpuid.c files into one and wrapping them in an ifdef/else, i.e.

#ifdef __aarch64__ 
[cpuid_arm.c content] 
#else 
[cpuid.c content]
#endif

I tested this quickly and it appears to work on my Intel Mac (I didn't test it very thoroughly, mind you). I haven't checked Linux at all, but I suspect this would work there as well. I'll fork your project and investigate this properly.
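
A minimal sketch of the compile-time split described above, assuming GCC or Clang. The names demo_arch_name and demo_has_sse2 are hypothetical stand-ins, not parasail's actual functions, and the x86 branch uses the compiler's <cpuid.h> helper rather than the inline asm from src/cpuid.c. Since each -arch slice is compiled separately, the preprocessor picks the matching branch for each slice:

```c
/* Sketch only: demo_* names are hypothetical, not parasail's API. */
#if defined(__aarch64__) || defined(__arm64__)

/* arm64 slice: there is no cpuid instruction, so no x86 asm here. */
const char *demo_arch_name(void) { return "arm64"; }
int demo_has_sse2(void) { return 0; }

#elif defined(__x86_64__) || defined(__i386__)

#include <cpuid.h> /* GCC/Clang wrapper around the cpuid instruction */

const char *demo_arch_name(void) { return "x86"; }

int demo_has_sse2(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid returns 0 if the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return (int)((edx >> 26) & 1u); /* CPUID leaf 1, EDX bit 26 = SSE2 */
}

#else

/* Any other target: report nothing rather than fail to compile. */
const char *demo_arch_name(void) { return "unknown"; }
int demo_has_sse2(void) { return 0; }

#endif
```

With this layout the same source file compiles for either architecture, so no decision about which cpuid file to use is needed at configure time.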

jeffdaily commented 2 years ago

Thank you for the helpful discussion here. I believe I have it fixed now if you're willing to use the develop branch. I'm not sure if I'm getting any sort of vectorized support on macOS now with the arm64 target. The x86_64 target will continue to use SSE or AVX.
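
One rough way to see which SIMD instruction set the compiler is targeting for a given slice is to inspect its predefined macros. The sketch below assumes GCC or Clang, and demo_compiled_simd is a hypothetical helper. It only reports what the compiler was allowed to emit for that translation unit. It says nothing about which vectorized code paths parasail itself selects.

```c
/* Sketch only: demo_compiled_simd is a hypothetical helper.
 * It reflects compile-time target macros, not runtime CPU features. */
const char *demo_compiled_simd(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return "neon";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "none";
#endif
}
```

Printing the result from a small test program built once per architecture would show which branch each slice took.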

This project has been woefully unmaintained. I'm in the process of migrating the old Travis CI scripts to GitHub Actions. A long-overdue release is also needed to announce support for position-specific scoring matrices.

rainbowgoblin commented 2 years ago

Cool, I've successfully built universal binaries from the develop branch on both Intel and M1 Macs.

I'll have to check with my team if we're ok with the develop branch. If not, the CMakeLists.txt hack will work for us.

I feel your neglected open source repo pain (I'm a former academic with zero time to maintain my own open source code since moving to the private sector).