Open VictorEijkhout opened 3 months ago
Is this on a Sapphire Rapids machine by any chance? I wonder if the compiler is generating AVX512-FP16 instructions that the assembler doesn't know what to do with.
In any case, yes, apologies for the -march=native
difficulties. We're thinking about/working on a version with true multiple SIMD dispatch capabilities, and in the meantime, we've opted to use -march=native
by default and document that we're doing so.
You're right, I'm compiling on a Sapphire Rapids. (Stampede3 at TACC in case you're wondering: https://docs.tacc.utexas.edu/hpc/stampede3/#system-login )
Using -march=native
(and consequently disabling cross-compiling by default) is an unfortunate side-effect of making the typical user-experience smooth.
One option could be to add an "ISA_CFLAGS
" variable (within common.mk
that inherits from the env). System admins/cross-compilers can tweak the ISA setting by setting that ISA_CFLAGS
on the command-line and then compiling. (And if the ISA_CFLAGS
is undefined, then the usual -march=native
/-mcpu=apple-m1
/-xhost
etc are used, as relevant)
What ISA(s) would you be compiling for?
At the moment I have a bunch of Intels, AMD, and Grace/Grace.
Why do you do everything through the common.mk
? I'm surprised to find a package these days that doesn't use at least autotools or preferably cmake.
What are the ISAs for the Intels - Corrfunc uses at most the AVX512F + VL intrinsics? Separately, are the Grace ones AARCH64?
Regarding common.mk
- it's mostly historical really. Back in 2012 when I started writing the Corrfunc predecessor, I didn't know autotools (and I still don't) and so went with the common.mk
solution. Over the years, we have added a lot of checks into common.mk
and now it is not worth the effort to migrate to a different solution.
If you are comfortable with autotools
and would be willing/able to contribute a patch, that would be quite welcome!
By the way, I just encountered the same issue on my local cluster at American University. Removing -march=native
solved the problem.
@johannesulf If you remove the -march=native
, then you will only have the fallback
kernel and lose the performance boost from the SIMD kernels. Might be best to at least add the minimum ISA, say, -mavx2
(ideally the ISA of the runtime CPU) in place of the -march=native
flag.
This goes away if I remove the
arch=native
flag. (Totally bizarre. How can you generate instructions that don't exist if you compile for native arch?)Using the
native
flag makes it impossible to compile for supercomputer clusters where the build node is often different from the compute node.