Closed malaterre closed 2 years ago
Hi @malaterre,
HWY_CMAKE_ARM7:BOOL=ON imply that host system must support vfpv4 Currently hwy::SupportedTargets ony works on x86* arch
Yes, that's correct. I would be happy to integrate AT_HWCAP, but we have a compiler issue which must be resolved first. Clang still does not support pragma target for SVE and NEON, instead requiring -march flags. However, I recently learned that GCC does in fact support this now.
I believe you only require GCC, so we could enable runtime dispatch for you there. This will require setting HWY_TARGET_STR
and changing HWY_ATTAINABLE_TARGETS, plus checking AT_HWCAP (only on GCC+Linux).
We have a patch almost ready to go - can you help test?
We have a patch almost ready to go - can you help test?
sure ! where is it ?
Great, thanks :D
It just made it through our CI (which does not mean much, because this is effectively #if 0
in that environment), see above.
Second issue is on armhf:
/<<PKGBUILDDIR>>/hwy/ops/arm_sve-inl.h:19:10: fatal error: arm_sve.h: No such file or directory
19 | #include <arm_sve.h>
Do you know the function prototype you need from it ?
Here is the file list for arm64:
Here is the file list for armhf:
And the third one is on armel:
[2/90] /usr/bin/c++ -DHWY_SHARED_DEFINE -Dhwy_contrib_EXPORTS -I/<<PKGBUILDDIR>> -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -fmath-errno -fno-exceptions -Werror -MD -MT CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o -MF CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o.d -o CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o -c /<<PKGBUILDDIR>>/hwy/contrib/sort/vqsort_128a.cc
FAILED: CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o
/usr/bin/c++ -DHWY_SHARED_DEFINE -Dhwy_contrib_EXPORTS -I/<<PKGBUILDDIR>> -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -fmath-errno -fno-exceptions -Werror -MD -MT CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o -MF CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o.d -o CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_128a.cc.o -c /<<PKGBUILDDIR>>/hwy/contrib/sort/vqsort_128a.cc
In file included from /<<PKGBUILDDIR>>/hwy/ops/arm_neon-inl.h:22,
from /<<PKGBUILDDIR>>/hwy/highway.h:307,
from /<<PKGBUILDDIR>>/hwy/contrib/sort/traits128-inl.h:26,
from /<<PKGBUILDDIR>>/hwy/contrib/sort/vqsort_128a.cc:24,
from /<<PKGBUILDDIR>>/hwy/foreach_target.h:81,
from /<<PKGBUILDDIR>>/hwy/contrib/sort/vqsort_128a.cc:21:
/usr/lib/gcc/arm-linux-gnueabi/11/include/arm_neon.h:31:2: error: #error "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"
31 | #error "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"
| ^~~~~
build errors with armv7a-unknown-linux-gnueabihf with gcc-10.4.0, its latest git checkout plus the patch
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc: In function ‘uint32_t hwy::{anonymous}::DetectTargets()’:
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc:293:34: error: ‘HWCAP_ASIMD’ was not declared in this scope
293 | constexpr CapBits kGroupNEON = HWCAP_ASIMD | HWCAP_AES;
| ^~~~~~~~~~~
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc:293:48: error: ‘HWCAP_AES’ was not declared in this scope; did you mean ‘HWCAP2_AES’?
293 | constexpr CapBits kGroupNEON = HWCAP_ASIMD | HWCAP_AES;
| ^~~~~~~~~
| HWCAP2_AES
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc:294:33: error: ‘HWCAP_SVE’ was not declared in this scope; did you mean ‘HWCAP_SWP’?
294 | constexpr CapBits kGroupSVE = HWCAP_SVE;
| ^~~~~~~~~
| HWCAP_SWP
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc:295:34: error: ‘HWCAP2_SVE2’ was not declared in this scope; did you mean ‘HWCAP2_SHA2’?
295 | constexpr CapBits kGroupSVE2 = HWCAP2_SVE2 | HWCAP2_SVEAES;
| ^~~~~~~~~~~
| HWCAP2_SHA2
/var/tmp/portage/portage/dev-cpp/highway-9999/work/highway-9999/hwy/targets.cc:295:48: error: ‘HWCAP2_SVEAES’ was not declared in this scope; did you mean ‘HWCAP2_AES’?
295 | constexpr CapBits kGroupSVE2 = HWCAP2_SVE2 | HWCAP2_SVEAES;
| ^~~~~~~~~~~~~
| HWCAP2_AES
full build log: build.log.zip
missing arm_sve.h
header also reproduced
hope the results are helpfull :)
Thank you both! It is difficult to develop without the target platform, I appreciate you testing.
For the HWY_ATTAINABLE issue, looks like the patch might not have been (fully) applied? This is indeed missing the HWY_ATTAINABLE macros added after 0.17.0: https://salsa.debian.org/debian-phototools-team/highway/-/blob/master/hwy/detect_targets.h
Ah, makes sense that arm_sve is missing on armv7/armhf. I've revised the code to avoid attemping to use it there.
I've also updated the HWCAP to only check for SVE+ flags on aarch64, and only check for the HWCAP if those macros are defined.
I believe this is now done, at least for GCC. Please feel free to reopen if there are any related issues.
I fail to understand how to compile highway on linux/armhf. Since highway claim to support dynamic dispatch, I assumed I could build highway on linux/armhf with
HWY_CMAKE_ARM7:BOOL=ON
and let the dynamic dispatch does it work at runtime.If I understand the build system mechanism:
HWY_CMAKE_ARM7:BOOL=ON
imply that host system must supportvfpv4
(I see some other neon extension eg a vmov.i32 in the binary) otherwise the test suite will fail,hwy::SupportedTargets
ony works on x86 arch (trivial to fix for linux) ()Is the above correct ?
Followup to issue #495
ref:
(*) https://salsa.debian.org/debian-phototools-team/highway/-/blob/master/debian/patches/neon.patch