f4exb / cm256cc

Fast GF(256) Cauchy MDS Block Erasure Codec in C++
GNU General Public License v3.0
23 stars 8 forks source link

cmake/clang fixes and improvements #9

Closed ra1nb0w closed 5 years ago

ra1nb0w commented 5 years ago

The main change is about abstract architecture and cpu flags detection. Now is not dependant on OS shell commands but only on c/c++ code.

Tested on different hardwares with:

In the future, I will send the patch to sdrangel too.

Maybe, this change require a new tagged version, like 1.0.5.

ra1nb0w commented 5 years ago

Probably, we need a way to set as options which kind of cpu flags we want to use and avoid discovery. Can be SSSE3 as default, to be conservative, and add -DNATIVE=on do use cpu flags discovery. What do you think?

f4exb commented 5 years ago

Issue on armv8 (ARM64). Cmake breaks with unsupported architecture:

-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Target architecture: ARM64
-- Unsupported architecture - Terminated

Edit: this is on Rpi3B with Arch for armv8 and GCC 8.2.1. It is strange since the ARM64 architecture appears to be correctly detected. Edit2: it is the neon test that breaks. Edit3: the error:

opt/build/cm256cc/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c: In function 'main':

/opt/build/cm256cc/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:7:19: error: '__arm__' undeclared (first use in this function); did you mean '__DATE__'?
   return ((int*)(&__arm__))[argc];
                   ^~~~~~~
                   __DATE__
/opt/build/cm256cc/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:7:19: note: each undeclared identifier is reported only once for each function it appears in
make[1]: *** [CMakeFiles/cmTC_bce09.dir/build.make:66: CMakeFiles/cmTC_bce09.dir/CheckSymbolExists.c.o] Error 1
ra1nb0w commented 5 years ago

I tested the code with raspberry pi3+ (architecture ARM - arm7l) and raspbian and it works. Which architecture and OS are you using? Have you the file arm_neon.h?

f4exb commented 5 years ago

I just stated it: Arch Linux arm 64 bit. Edit: here: https://archlinuxarm.org/platforms/armv8/broadcom/raspberry-pi-3

ra1nb0w commented 5 years ago

that error is CheckSymbolExists related and probably follow detect_architecture("__arm__" ARM) and it is correct because It is not the right architecture. The next text detect_architecture("__aarch64__" ARM64) run correctly and set the architecture. As you said correctly the problem is if(COMPILE_NEON AND RUN_NEON EQUAL 0) that is false. Now I search an micro-sd to install archlinux and verify why it fails.

f4exb commented 5 years ago

OK, I was mislead. The other architectures are reported as errors of course. BTW arm_neon.h has to be there since it is included in sse2neon.h and would not compile with the current version otherwise.

ra1nb0w commented 5 years ago

fixed Advanced SIMD (aka NEON) is mandatory for AArch64

f4exb commented 5 years ago

Yes! looks like it is working:

-- Target architecture: ARM64
-- Use NEON SIMD instructions
-- Architecture supports Neon - OK
ra1nb0w commented 5 years ago

about this: Probably, we need a way to set as options which kind of cpu flags we want to use and avoid discovery. Can be SSSE3 as default, to be conservative, and add -DNATIVE=on do use cpu flags discovery. What do you think?

f4exb commented 5 years ago

What is the issue with discovery? Maybe falling back to SSSE3 should be the option and discovery the default (the other way round).

ra1nb0w commented 5 years ago

ops. I did not explain the problem: distribution. If I want to build a package that will be runnable by many users I need to use the minimum cpu flags and probably alert the user at runtime that he doesn't support that flags; in this case SSSE3. Yes, setting discovery as default and use a directive for SSSE3 is valid too.

f4exb commented 5 years ago

I see. Most of the time this code is meant to be compiled on the machine it runs this is why I am suggesting discovery by default so you don't have to worry about the flags available on your hardware. On the other hand when building for distribution it may indeed be good to fall back to a mild option like SSSE3. If you build for distribution you know you are doing so and you are normally aware of this issue.

So it is fine with me if SSSE3 is optional and discovery the default.