Relying on the compiler-defined macro __AVX__ leads to compile errors of the following form when the public heFFTe header is included in WarpX CUDA code:
/usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(38): error: vector_size attribute requires an arithmetic or enum type
typedef __half __v8hf __attribute__ ((__vector_size__ (16)));
^
...
/usr/lib64/gcc/x86_64-suse-linux/12/include/avx512fp16intrin.h(62): error: more than one conversion function from "__half" to "<error-type>" applies:
function "__half::operator __half_raw() const" (declared at line 309 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator float() const" (declared at line 337 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator signed char() const" (declared at line 436 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator unsigned char() const" (declared at line 443 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator char() const" (declared at line 451 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator short() const" (declared at line 476 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator unsigned short() const" (declared at line 483 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator int() const" (declared at line 490 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator unsigned int() const" (declared at line 497 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator long() const" (declared at line 504 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator unsigned long() const" (declared at line 528 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator long long() const" (declared at line 554 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator unsigned long long() const" (declared at line 561 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
function "__half::operator bool() const" (declared at line 593 of /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/include/cuda_fp16.hpp)
return __extension__ (__m128h)(__v8hf){ __A0, __A1, __A2, __A3,
^
...
Since heFFTe already provides explicit macros to control AVX support, the headers can rely on those instead of __AVX__. Most users building for GPU disable AVX support anyway.
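A minimal sketch of the idea, not heFFTe's actual code: the AVX intrinsics header is pulled in only when an explicit opt-in macro is defined, rather than whenever the compiler happens to define __AVX__. The macro name MYLIB_ENABLE_AVX and the functions avx_path_enabled and add_arrays are hypothetical and stand in for the library's own configuration macro and kernels. The AVX branch additionally assumes the translation unit is compiled with AVX enabled (e.g. -mavx).

```cpp
#include <cstddef>

// Hypothetical opt-in macro (assumption): set by the build system only when
// the user explicitly requests AVX, e.g. -DMYLIB_ENABLE_AVX for a host build.
// The compiler-defined __AVX__ is deliberately NOT consulted here, so a CUDA
// compile that defines __AVX__ never sees <immintrin.h> through this header.
#ifdef MYLIB_ENABLE_AVX
#include <immintrin.h>
#endif

// Reports which code path this translation unit was built with.
constexpr bool avx_path_enabled() {
#ifdef MYLIB_ENABLE_AVX
    return true;
#else
    return false;
#endif
}

// Element-wise sum out[i] = a[i] + b[i].
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef MYLIB_ENABLE_AVX
    // Vector path: 8 floats per iteration using unaligned loads and stores.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    // Scalar path: handles the remainder, or all elements when AVX is off.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With the macro left undefined, as would be typical for a GPU build, the header contains no reference to the x86 intrinsics headers and add_arrays falls back to the scalar loop.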
System details
Perlmutter (NERSC) HPE machine with:
Kudos
Thanks to @aeriforme and @Haavaan for reporting this. First seen in https://github.com/ECP-WarpX/WarpX/pull/4937