archspec / archspec-json


Improve a64fx microarchitecture #44

Closed giordano closed 2 years ago

giordano commented 2 years ago

Fix #23.

giordano commented 2 years ago

Similar to https://github.com/archspec/archspec-json/pull/43#issuecomment-1097001013, I pushed a commit to keep only -mcpu=a64fx where available.

alalazo commented 2 years ago

Two questions:

  1. Would you be able to update the armclang flags too? (Pinging @OliverPerks in case he has the information we need at hand.)
  2. Are we sure that -mcpu=a64fx doesn't turn on aes instructions on GCC?

giordano commented 2 years ago

  1. Would you be able to update the armclang flags too?

Unfortunately, I don't think I have access to armclang, so I can't help with that.

  2. Are we sure that -mcpu=a64fx doesn't turn on aes instructions on GCC?

I don't know how to check it easily with GCC, but I'm no longer sure that LLVM doesn't enable AES:

sandbox:${WORKSPACE} # echo 'int main()'|clang -x c - -mcpu=a64fx -###
clang version 12.0.0 (/home/mose/.julia/dev/BinaryBuilderBase/deps/downloads/llvm-project.git d28af7c654d8db0b68c175db5ce212d74fb5e9bc)
Target: arm64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/x86_64-linux-musl/bin
 "/opt/x86_64-linux-musl/bin/clang-12" "-cc1" "-triple" "arm64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "--mrelax-relocations" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "static" "-mframe-pointer=non-leaf" "-fmath-errno" "-fno-rounding-math" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "a64fx" "-target-feature" "+v8.2a" "-target-feature" "+fp-armv8" "-target-feature" "+neon" "-target-feature" "+crc" "-target-feature" "+crypto" "-target-feature" "+fullfp16" "-target-feature" "+ras" "-target-feature" "+lse" "-target-feature" "+rdm" "-target-feature" "+sve" "-target-feature" "+sha2" "-target-feature" "+aes" "-target-abi" "aapcs" "-fallow-half-arguments-and-returns" "-fno-split-dwarf-inlining" "-debugger-tuning=gdb" "-resource-dir" "/opt/x86_64-linux-musl/lib/clang/12.0.0" "-isysroot" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root" "-internal-isystem" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/local/include" "-internal-isystem" "/opt/x86_64-linux-musl/lib/clang/12.0.0/include" "-internal-externc-isystem" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/include" "-internal-externc-isystem" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/include" "-fdebug-compilation-dir" "/workspace" "-ferror-limit" "19" "-fno-signed-char" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-o" "/tmp/--f56940.o" "-x" "c" "-"
 "/opt/bin/aarch64-linux-gnu-libgfortran5-cxx11/ld.aarch64-linux-gnu" "--sysroot=/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root" "-EL" "-z" "now" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "aarch64linux" "-dynamic-linker" "/lib/ld-linux-aarch64.so.1" "-o" "a.out" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/lib/../lib64/crt1.o" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/lib/../lib64/crti.o" "/opt/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/11.1.0/crtbegin.o" "-L/opt/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/11.1.0" "-L/opt/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/11.1.0/../../../../aarch64-linux-gnu/lib/../lib64" "-L/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/lib/../lib64" "-L/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/lib/../lib64" "-L/opt/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/11.1.0/../../../../aarch64-linux-gnu/lib" "-L/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/lib" "-L/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/lib" "/tmp/--f56940.o" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/opt/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/11.1.0/crtend.o" "/opt/aarch64-linux-gnu/aarch64-linux-gnu/sys-root/usr/lib/../lib64/crtn.o"

Note "-target-feature" "+aes" (and also "-target-feature" "+crypto", which implies AES). The Fujitsu compiler on Isambard in clang mode (fcc -Nclang ...) also shows "-target-feature" "+crypto".
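The feature flags are easy to miss in a driver line that long. A minimal sketch of a filter that pulls them out is below; `list_target_features` is a hypothetical helper name, not part of clang or archspec, and it assumes the quoted `"-target-feature" "+name"` layout shown in the output above.

```shell
# Hypothetical helper: read `clang -###` output on stdin and print each
# -target-feature value (for example +aes) once, one per line.
list_target_features() {
    grep -o '"-target-feature" "[+-][^"]*"' \
        | sed 's/^"-target-feature" "\(.*\)"$/\1/' \
        | sort -u
}

# Usage sketch (needs a clang that can target AArch64; -### prints to stderr,
# hence the 2>&1):
#   echo 'int main()' | clang -x c - -mcpu=a64fx -### 2>&1 | list_target_features
```

Piping the result through `grep -x -e '+aes' -e '+crypto'` would then narrow it down to the two features in question.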

giordano commented 2 years ago

I don't know how to check it easily with GCC

Oh, I forgot I could check the macros:

sandbox:${WORKSPACE} # gcc --version
aarch64-linux-gnu-gcc (GCC) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

sandbox:${WORKSPACE} # gcc -dM -E - -mcpu=a64fx < /dev/null | grep __ARM_FEATURE_
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_SVE_VECTOR_OPERATORS 1
#define __ARM_FEATURE_SVE 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_SVE_BITS 0
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
sandbox:${WORKSPACE} # gcc -dM -E - -march=armv8.2-a < /dev/null | grep __ARM_FEATURE_
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
sandbox:${WORKSPACE} # gcc -dM -E - -march=armv8.2-a+aes < /dev/null | grep __ARM_FEATURE_
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_AES 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_NUMERIC_MAXMIN 1

GCC doesn't define __ARM_FEATURE_AES with either -mcpu=a64fx or -march=armv8.2-a, while it does with -march=armv8.2-a+aes. So it does look like -mcpu=a64fx does not enable AES on GCC.

For reference, clang doesn't seem to ever define __ARM_FEATURE_AES, only __ARM_FEATURE_CRYPTO (which does appear with -mcpu=a64fx):

sandbox:${WORKSPACE} # clang -dM -E - -march=armv8.2-a+aes < /dev/null | grep __ARM_FEATURE_
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_UNALIGNED 1
sandbox:${WORKSPACE} # clang -dM -E - -mcpu=a64fx < /dev/null | grep __ARM_FEATURE_
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_CRYPTO 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_SVE 1
#define __ARM_FEATURE_UNALIGNED 1

Edit: __ARM_FEATURE_AES was introduced in LLVM 13: https://github.com/llvm/llvm-project/commit/b8baa2a9132498ea286dbb0d03f005760ecc6fdb (which I'm compiling for BinaryBuilder right now). The clang runs above used LLVM 12.
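The macro check used in this comment can be wrapped in a small predicate so that it works in scripts. This is only a sketch: `has_arm_feature` is a hypothetical name, and it tests nothing more than whether the macro is defined in the dump. As noted above, clang before LLVM 13 never defines __ARM_FEATURE_AES, so a negative answer there says nothing about AES.

```shell
# Hypothetical helper: read a `-dM -E` macro dump on stdin and succeed if
# __ARM_FEATURE_<NAME> is defined. The trailing space in the pattern keeps
# SVE from matching SVE_BITS.
has_arm_feature() {
    grep -q "^#define __ARM_FEATURE_$1 "
}

# Usage sketch (needs an AArch64 GCC, as in the transcript above):
#   if gcc -dM -E - -mcpu=a64fx < /dev/null | has_arm_feature AES; then
#       echo "AES enabled"
#   else
#       echo "AES not enabled"
#   fi
```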

OliverPerks commented 2 years ago

Yeah, can confirm +AES is not getting triggered on A64FX with GCC 11:

-mcpu=native => -march=armv8.2-a+crypto+sve
-mcpu=a64fx => -march=armv8.2-a+sve

giordano commented 2 years ago

-mcpu=native => -march=armv8.2-a+crypto+sve

On Isambard:

$ gcc --version
gcc (GCC) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -dM -E - -mcpu=native < /dev/null | grep __ARM_FEATURE
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_SVE_VECTOR_OPERATORS 1
#define __ARM_FEATURE_SVE 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FEATURE_SHA2 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_SVE_BITS 0
#define __ARM_FEATURE_NUMERIC_MAXMIN 1

So -mcpu=native does not enable AES either: of the crypto-related macros, only __ARM_FEATURE_SHA2 is defined.
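Comparing two dumps by eye is error-prone, so here is a rough sketch that reports the macros one dump has and the other lacks. Both function names are hypothetical, and the sketch assumes `mktemp` and `comm` are available.

```shell
# Hypothetical helper: print the sorted __ARM_FEATURE_* macro names found in
# a `-dM -E` dump on stdin.
arm_feature_names() {
    sed -n 's/^#define \(__ARM_FEATURE_[A-Za-z0-9_]*\) .*/\1/p' | sort -u
}

# Hypothetical helper: print macro names defined in dump file $2 but not in
# dump file $1.
only_in_second() {
    a=$(mktemp) && b=$(mktemp) || return 1
    arm_feature_names < "$1" > "$a"
    arm_feature_names < "$2" > "$b"
    comm -13 "$a" "$b"   # keep only the lines unique to the second file
    rm -f "$a" "$b"
}

# Usage sketch (needs an AArch64 GCC; the file names are placeholders):
#   gcc -dM -E - -mcpu=a64fx  < /dev/null > a64fx.txt
#   gcc -dM -E - -mcpu=native < /dev/null > native.txt
#   only_in_second a64fx.txt native.txt
```

Fed the two GCC 11.1.0 dumps quoted in this thread (-mcpu=a64fx from the sandbox, -mcpu=native from Isambard), the only name reported would be __ARM_FEATURE_SHA2.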