Closed manodeep closed 11 months ago
I've tried making that line both explicitly true and explicitly false. If I set it to '#if 0' I get a different error:
../../utils/cpu_features.h:49:11: error: expected ‘(’ before ‘{’ token
49 | __asm {
| ^
| (
../../utils/cpu_features.h:50:9: error: unknown type name ‘mov’
50 | mov eax, functionnumber
| ^~~
../../utils/cpu_features.h:51:9: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘xor’
51 | xor ecx, ecx
| ^~~
../../utils/cpu_features.h:53:9: error: unknown type name ‘mov’
53 | mov esi, output
| ^~~
../../utils/cpu_features.h:54:9: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘mov’
54 | mov [esi], eax
| ^~~
../../utils/cpu_features.h:53:13: warning: unused variable ‘esi’ [-Wunused-variable]
53 | mov esi, output
| ^~~
../../utils/cpu_features.h:50:13: warning: unused variable ‘eax’ [-Wunused-variable]
50 | mov eax, functionnumber
| ^~~
../../utils/cpu_features.h:64:23: error: invalid storage class for function ‘xgetbv’
64 | static inline int64_t xgetbv (int ctr) {
| ^~~~~~
../../utils/cpu_features.h:86:12: warning: nested extern declaration of ‘runtime_instrset_detect’ [-Wnested-externs]
86 | extern int runtime_instrset_detect(void);
| ^~~~~~~~~~~~~~~~~~~~~~~
../../utils/cpu_features.h:87:12: warning: nested extern declaration of ‘get_max_usable_isa’ [-Wnested-externs]
87 | extern int get_max_usable_isa(void);
Is it just wrong C syntax that has not been seen due to not being invoked in many years?
@karlglazebrook Huh - now this is erroring in the next function. I wonder if the compilation works for only these lines?
It is also erroring at line 49 which is part of the first #ifdef #else branch. Yes it is also erroring in the next lot but one thing at a time?
For lines 38..60 The #else branch it seems to me is clearly intel only from the comments, however the first branch gives the error initially reported. It says it is ' inline assembly, Gnu/AT&T syntax'. I am guessing the problem here is that neither type of assembly is correct for ARM.
Ahh yes - thanks for clarifying! I got thrown off by the line numbers. You are quite right - the assembly syntax is different for ARM. Let me try to come up with a solution...
Documenting what I have found so far. One solution that works for clang
, is to add the following line everywhere:
#if defined (__ARM_NEON__)
return 0/FALLBACK /* only compiles the "FALLBACK" kernels */
#elif ...
Looks like these registers might have the appropriate values. Relevant SO entry. ARM docs for writing inline assembly
@karlglazebrook Do you mind copy-pasting the output of:
#!/bin/bash
declare -a compilers=("/usr/bin/clang" "/usr/local/bin/gcc -fopenmp")
for cc in "${compilers[@]}"
do
echo "*** $cc ***"
$cc -std=c99 -march=native -O3 -dM -E - < /dev/null
echo "*** $cc done ***"
done
This will give a hint as to what compiler flags are being defined for the OS + instruction set.
Sure, noting I had to change the line to
eval "$cc -std=c99 -march=native -O3 -dM -E - < /dev/null"
otherwise I got the error
test.sh:6: no such file or directory: /usr/local/bin/gcc -fopenmp
1) as is
/usr/bin/clang clang: error: the clang compiler does not support '-march=native' /usr/bin/clang done /usr/local/bin/gcc -fopenmp
/usr/local/bin/gcc -fopenmp done
2) removing -march=native :
/usr/bin/clang
/usr/bin/clang done /usr/local/bin/gcc -fopenmp
/usr/local/bin/gcc -fopenmp done
On 16 Jan 2021, at 9:32 am, Manodeep Sinha notifications@github.com wrote:
@karlglazebrook https://github.com/karlglazebrook Do you mind copy-pasting the output of:
!/bin/bash
declare -a compilers=("/usr/bin/clang" "/usr/local/bin/gcc -fopenmp") for cc in "${compilers[@]}" do echo " $cc " $cc -std=c99 -march=native -O3 -dM -E - < /dev/null echo " $cc done " done This will give a hint as to what compiler flags are being defined for the OS + instruction set.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/manodeep/Corrfunc/issues/241#issuecomment-761232174, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADU7FGW2FPXPM5QZ4GEOYFTS2C7BPANCNFSM4WDRL27Q.
This is great - thanks @karlglazebrook!
Looks like __aarch64__
, __arm64
, and __arm64__
are all defined (equal to 1) in these three compiler flags. For the ISA, looks like __ARM_NEON (=1)
is defined for gcc (with or without -march=native
) while __ARM_NEON (=1)
, __ARM_NEON__ (=1)
, __ARM_NEON_FP (=0xE)
So the platform can be detected with any of __aarch64__
, __aarch64__
,__aarch64__
(or to be on the safe side, an ||
between all three) and then returning FALLBACK
ISA before running the cpuid
call.
If we add any NEON
kernels in the future, then those will have to be protected by the #ifdef __ARM_NEON
conditions, and the corresponding cpuid
check will have to updated to the actual assembly call necessary. (The compile time check is necessary but the runtime cpu may be different)
@lgarrison What do you think?
I will also note that there might be "undocumented" vectorised calls - someone dug these instructions out. Here's my fork of their secret gist. Found the gist through here
That's if any of __aarch64__
, __arm64
, and __arm64__
are detected? Sounds good to me!
Would be a fun project to try to get those undocumented vector calls to work!
Here are the (untested) updates to the cpu_features.[ch]
files.
@lgarrison Will you please see if the updates and returns make sense?
The updates from the previous comment do help to get the code built and installed. I wonder if a bit more systematic fix will be possible?
@misharash You might be interested in the initial implementation for the M1 architecture within the arm64
branch on this repo
Right, the arm64
branch (PR here) works on the M1 with both fallback and NEON kernels. So you can just use that branch instead of patching your source with the previous code.
@manodeep can confirm, but the fallback kernel from that branch should definitely be safe to use. It sounds like the NEON kernel is working too, but is less tested.
Thank you! Initially it wasn't clear that the NEON pull request was related to Apple Silicon support.
Solved by #295. Now Corrfunc master
branch should compile and run fine on Apple laptops with M1/M2 cpus
Sidenote: Optimised kernels being (slowly) implemented under the arm64
branch.
Is your feature request related to a problem? Please describe. Ability to compile and run Corrfunc on the (late-2020) Apple M1 laptops. Does not compile
Describe the solution you'd like Run Corrfunc on the new M1 laptops (preferably optimised kernels - which needs kernels with
Neon
ISA)Describe alternatives you've considered N/A
Additional context The new M1 chip supports ARMv8 (128 bits) instruction set. Most of the codebase was written assuming
__DARWIN__
impliesx86_64
but that is no longer the case (unsure if the new platform defines__aarch64__
,__arm64__
or both)@karlglazebrook has very kindly debugged the install from a source tarball (
pip install
failed). The current error occurs when compilingcpu_features.c
One solution could be to protect this line :
to something like