google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0

Compilation fails on aarch64 platform #2234

XiDianZuoYun closed this issue 4 weeks ago

XiDianZuoYun commented 4 weeks ago

The error is as follows. It seems that my system architecture does not support fp16. Is it possible to turn off fp16 through compilation options?

In file included from /work_space/highway/hwy/highway.h:19,
                 from /work_space/highway/hwy/per_target.cc:28,
                 from /work_space/highway/hwy/foreach_target.h:163,
                 from /work_space/highway/hwy/per_target.cc:27:
/work_space/highway/hwy/base.h:117:28: error: invalid feature modifier bf16 of value ("+crypto+bf16+dotprod+fp16") in ‘target()’ pragma or attribute
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/shared-inl.h:51:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   51 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: pragma or attribute ‘target("+crypto+bf16+dotprod+fp16")’ is not valid
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/shared-inl.h:51:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   51 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: invalid feature modifier bf16 of value ("+crypto+bf16+dotprod+fp16") in ‘target()’ pragma or attribute
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h:31:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   31 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: pragma or attribute ‘target("+crypto+bf16+dotprod+fp16")’ is not valid
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h:31:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   31 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
In file included from /work_space/highway/hwy/highway.h:585,
                 from /work_space/highway/hwy/per_target.cc:28,
                 from /work_space/highway/hwy/foreach_target.h:174,
                 from /work_space/highway/hwy/per_target.cc:27:
/work_space/highway/hwy/ops/arm_neon-inl.h:4816:53: warning: ‘always_inline’ attribute ignored [-Wattributes]
 4816 | static HWY_INLINE uint16x4_t BitCastFromRawNeonBF16(bfloat16x4_t raw) {
      | ^~~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h:4816:53: error: ‘bfloat16x4_t’ was not declared in this scope; did you mean ‘float16x4_t’?
 4816 | static HWY_INLINE uint16x4_t BitCastFromRawNeonBF16(bfloat16x4_t raw) {
      | ^~~~~~~~~~~~
      | float16x4_t
/work_space/highway/hwy/ops/arm_neon-inl.h:6948:19: error: ‘bfloat16x4_t’ does not name a type; did you mean ‘float16x4_t’?
 6948 | static HWY_INLINE bfloat16x4_t BitCastToRawNeonBF16(uint16x4_t raw) {
      | ^~~~~~~~~~~~
      | float16x4_t
/work_space/highway/hwy/ops/arm_neon-inl.h:6951:19: error: ‘bfloat16x8_t’ does not name a type; did you mean ‘float16x8_t’?
 6951 | static HWY_INLINE bfloat16x8_t BitCastToRawNeonBF16(uint16x8_t raw) {
      | ^~~~~~~~~~~~
      | float16x8_t
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::Vec128<float, 4> hwy::N_NEON_BF16::MulEvenAdd(D, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<float, 4>)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:6960:53: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6960 | return Vec128<float>(vbfmlalbq_f32(c.raw, detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6961:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6961 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6960:24: error: there are no arguments to ‘vbfmlalbq_f32’ that depend on a template parameter, so a declaration of ‘vbfmlalbq_f32’ must be available [-fpermissive]
 6960 | return Vec128<float>(vbfmlalbq_f32(c.raw, detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h:6960:24: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::Vec128<float, 4> hwy::N_NEON_BF16::MulOddAdd(D, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<float, 4>)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:6967:53: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6967 | return Vec128<float>(vbfmlaltq_f32(c.raw, detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6968:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6968 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6967:24: error: there are no arguments to ‘vbfmlaltq_f32’ that depend on a template parameter, so a declaration of ‘vbfmlaltq_f32’ must be available [-fpermissive]
 6967 | return Vec128<float>(vbfmlaltq_f32(c.raw, detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::Vec128<float, 4> hwy::N_NEON_BF16::ReorderWidenMulAccumulate(D, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<float, 4>, hwy::N_NEON_BF16::Vec128<float, 4>&)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:6977:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6977 | detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6978:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 6978 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:6976:24: error: there are no arguments to ‘vbfdotq_f32’ that depend on a template parameter, so a declaration of ‘vbfdotq_f32’ must be available [-fpermissive]
 6976 | return Vec128<float>(vbfdotq_f32(sum0.raw,
      | ^~~~~~~~~~~
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::VFromD<D> hwy::N_NEON_BF16::ReorderWidenMulAccumulate(D, hwy::N_NEON_BF16::VFromD<typename D::Repartition<hwy::bfloat16_t> >, hwy::N_NEON_BF16::VFromD<typename D::Repartition<hwy::bfloat16_t> >, hwy::N_NEON_BF16::VFromD<D>, hwy::N_NEON_BF16::VFromD<D>&)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:7009:49: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7009 | return VFromD<D>(vbfdot_f32(sum0.raw, detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:7010:39: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7010 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::Vec128<float, 4> hwy::N_NEON_BF16::WidenMulPairwiseAdd(DF, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>, hwy::N_NEON_BF16::Vec128<hwy::bfloat16_t>)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:7189:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7189 | detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:7190:44: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7190 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h: In function ‘hwy::N_NEON_BF16::VFromD<D> hwy::N_NEON_BF16::WidenMulPairwiseAdd(DF, hwy::N_NEON_BF16::VFromD<typename D::Repartition<hwy::bfloat16_t> >, hwy::N_NEON_BF16::VFromD<typename D::Repartition<hwy::bfloat16_t> >)’:
/work_space/highway/hwy/ops/arm_neon-inl.h:7198:40: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7198 | detail::BitCastToRawNeonBF16(a.raw),
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
/work_space/highway/hwy/ops/arm_neon-inl.h:7199:40: error: ‘BitCastToRawNeonBF16’ is not a member of ‘hwy::N_NEON_BF16::detail’; did you mean ‘BitCastFromRawNeonBF16’?
 7199 | detail::BitCastToRawNeonBF16(b.raw)));
      | ^~~~~~~~~~~~~~~~~~~~
      | BitCastFromRawNeonBF16
In file included from /work_space/highway/hwy/highway.h:19,
                 from /work_space/highway/hwy/per_target.cc:28,
                 from /work_space/highway/hwy/foreach_target.h:163,
                 from /work_space/highway/hwy/per_target.cc:27:
/work_space/highway/hwy/ops/generic_ops-inl.h: At global scope:
/work_space/highway/hwy/base.h:117:28: error: invalid feature modifier bf16 of value ("+crypto+bf16+dotprod+fp16") in ‘target()’ pragma or attribute
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/generic_ops-inl.h:34:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   34 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: pragma or attribute ‘target("+crypto+bf16+dotprod+fp16")’ is not valid
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/ops/generic_ops-inl.h:34:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   34 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: invalid feature modifier bf16 of value ("+crypto+bf16+dotprod+fp16") in ‘target()’ pragma or attribute
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/per_target.cc:30:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   30 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/base.h:117:28: error: pragma or attribute ‘target("+crypto+bf16+dotprod+fp16")’ is not valid
  117 | #define HWY_PRAGMA(tokens) _Pragma(#tokens)
      | ^~~~~~~
/work_space/highway/hwy/base.h:170:32: note: in expansion of macro ‘HWY_PRAGMA’
  170 | HWY_PRAGMA(GCC push_options) HWY_PRAGMA(GCC target targets_str)
      | ^~~~~~~~~~
/work_space/highway/hwy/ops/set_macros-inl.h:697:3: note: in expansion of macro ‘HWY_PUSH_ATTRIBUTES’
  697 | HWY_PUSH_ATTRIBUTES(HWY_TARGET_STR) \
      | ^~~~~~~~~~~~~~~~~~~
/work_space/highway/hwy/per_target.cc:30:1: note: in expansion of macro ‘HWY_BEFORE_NAMESPACE’
   30 | HWY_BEFORE_NAMESPACE();
      | ^~~~~~~~~~~~~~~~~~~~
make[2]: *** [CMakeFiles/hwy.dir/build.make:118: CMakeFiles/hwy.dir/hwy/per_target.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1593: CMakeFiles/hwy.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

My compiler version is: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

My system architecture is:

Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
Vendor ID: ARM
Model: 1
Model name: ARMv8 Processor rev 1 (v8l)
Stepping: r0p1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm

jan-wassenberg commented 4 weeks ago

Hi, there are indeed various possible workarounds, including setting HWY_DISABLED_TARGETS=HWY_NEON_BF16.
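
For readers unfamiliar with the mechanism, here is a rough sketch of why that setting helps. Highway represents targets as bits in a mask, and disabled bits are removed before the library decides which targets to compile. The code below only models that idea: the `kDemo*` constants and the `EnabledTargets` and `HasTarget` helpers are hypothetical names with made-up bit values, not Highway's real constants or API.

```cpp
#include <cstdint>

// Hypothetical bit values for illustration only; Highway's real constants differ.
constexpr std::int64_t kDemoNeon = std::int64_t{1} << 0;
constexpr std::int64_t kDemoNeonBF16 = std::int64_t{1} << 1;
constexpr std::int64_t kDemoSve = std::int64_t{1} << 2;

// Returns the candidate targets that remain after clearing the disabled bits.
constexpr std::int64_t EnabledTargets(std::int64_t candidates,
                                      std::int64_t disabled) {
  return candidates & ~disabled;
}

// Returns true if `target` is set in `mask`.
constexpr bool HasTarget(std::int64_t mask, std::int64_t target) {
  return (mask & target) != 0;
}

// Disabling the BF16 bit removes only that target; plain NEON stays enabled.
static_assert(
    !HasTarget(EnabledTargets(kDemoNeon | kDemoNeonBF16, kDemoNeonBF16),
               kDemoNeonBF16),
    "disabled target must not remain enabled");
static_assert(
    HasTarget(EnabledTargets(kDemoNeon | kDemoNeonBF16, kDemoNeonBF16),
              kDemoNeon),
    "other targets must be unaffected");
```

In a real build the macro would presumably be passed as a compiler definition, for example `-DHWY_DISABLED_TARGETS=HWY_NEON_BF16` in the C++ flags; that exact invocation is an assumption about the build setup and is not stated in this thread.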

I'm curious what the issue was? GCC 9.4 aarch64 seems to work. We don't enable HWY_NEON_BF16 prior to GCC 13.2.
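
The version gate mentioned above can be pictured as a compile-time comparison. This is only a sketch under assumptions: the `major * 100 + minor` encoding and the `EncodeGccVersion` and `DemoAllowsNeonBF16` helpers are illustrative names, not Highway's actual macros.

```cpp
// Encodes a GCC version as major * 100 + minor, e.g. 13.2 becomes 1302.
constexpr int EncodeGccVersion(int major, int minor) {
  return major * 100 + minor;
}

// Hypothetical gate modeling the statement that NEON_BF16 is not enabled
// for GCC versions older than 13.2.
constexpr bool DemoAllowsNeonBF16(int major, int minor) {
  return EncodeGccVersion(major, minor) >= EncodeGccVersion(13, 2);
}

// GCC 9.4, as reported in this issue, falls below the threshold.
static_assert(!DemoAllowsNeonBF16(9, 4), "GCC 9.4 should not enable BF16");
static_assert(DemoAllowsNeonBF16(13, 2), "GCC 13.2 meets the threshold");
```

Under such a gate, a GCC 9.4 build would not be expected to select the BF16 target at all, which is why the reported error is surprising.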