archspec / archspec-json

A64FX processor in Isambard 2 not recognized by archspec #23

Closed boegel closed 2 years ago

boegel commented 3 years ago

I'm seeing this on the Isambard 2 A64FX partition

$ archspec --version
archspec, version 0.1.2
$ archspec cpu
aarch64

Detailed CPU info:

$ head -8 /proc/cpuinfo
processor   : 0
BogoMIPS    : 200.00
Features    : fp asimd evtstrm sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma dcpop sve
CPU implementer : 0x46
CPU architecture: 8
CPU variant : 0x1
CPU part    : 0x001
CPU revision    : 0

The following hack fixes the problem:

$ diff -U 7 cpu/microarchitectures.json.orig cpu/microarchitectures.json
--- cpu/microarchitectures.json.orig    2021-02-27 18:00:11.000000000 +0000
+++ cpu/microarchitectures.json 2021-02-27 18:05:58.000000000 +0000
@@ -1411,16 +1411,14 @@
     "a64fx": {
       "from": ["aarch64"],
       "vendor": "Fujitsu",
       "features": [
         "fp",
         "asimd",
         "evtstrm",
-        "aes",
-        "pmull",
         "sha1",
         "sha2",
         "crc32",
         "atomics",
         "cpuid",
         "asimdrdm",
         "fphp",

Would it make sense to just drop aes and pmull from the a64fx entry? I'm not sure why these features are not being reported on Isambard...

Perhaps related to OS?

$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.2 (Ootpa)

@christopheredsall Any input on this?
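
A minimal Python sketch of why the hack above changes the result. It assumes (this is not confirmed in this thread) that a microarchitecture entry only matches when every feature it lists is present on the host. The a64fx set below holds only the features visible in the diff context, so it is an abbreviated stand-in for the real entry, and `is_compatible` and `missing` are hypothetical helpers, not archspec functions.

```python
# Features reported by /proc/cpuinfo on Isambard 2 (copied from this issue).
ISAMBARD_FEATURES = set(
    "fp asimd evtstrm sha1 sha2 crc32 atomics fphp asimdhp cpuid "
    "asimdrdm fcma dcpop sve".split()
)

# Abbreviated stand-in for the a64fx entry: only the features visible in the
# diff context above. The real entry in microarchitectures.json lists more.
A64FX_ORIGINAL = {
    "fp", "asimd", "evtstrm", "aes", "pmull", "sha1", "sha2",
    "crc32", "atomics", "cpuid", "asimdrdm", "fphp",
}
A64FX_PATCHED = A64FX_ORIGINAL - {"aes", "pmull"}


def is_compatible(required, host):
    """Assumed rule: every required feature must be present on the host."""
    return required <= host


def missing(required, host):
    """Return the required features that the host does not report."""
    return sorted(required - host)


print(is_compatible(A64FX_ORIGINAL, ISAMBARD_FEATURES))  # False
print(missing(A64FX_ORIGINAL, ISAMBARD_FEATURES))        # ['aes', 'pmull']
print(is_compatible(A64FX_PATCHED, ISAMBARD_FEATURES))   # True
```

Under that assumption, two absent flags are enough to reject the a64fx entry, which would explain the fallback to generic aarch64.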

h-murai commented 3 years ago

FYI, this is the result on Fugaku (A64FX).

$ head -8 /proc/cpuinfo
processor   : 0
BogoMIPS    : 200.00
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma dcpop sve
CPU implementer : 0x46
CPU architecture: 8
CPU variant : 0x1
CPU part    : 0x001
CPU revision    : 0

There are both aes and pmull in the Features line.
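
To make the comparison explicit, here is a small Python sketch that diffs the two Features lines quoted in this thread (Isambard 2 and Fugaku). `parse_features` is a hypothetical helper written for this illustration; it only splits the text after the colon.

```python
def parse_features(line):
    """Return the flags of a 'Features : ...' line from /proc/cpuinfo as a set."""
    _, _, flags = line.partition(":")
    return set(flags.split())


# Both lines are copied verbatim from the /proc/cpuinfo output in this issue.
ISAMBARD = (
    "Features    : fp asimd evtstrm sha1 sha2 crc32 atomics fphp asimdhp "
    "cpuid asimdrdm fcma dcpop sve"
)
FUGAKU = (
    "Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp "
    "asimdhp cpuid asimdrdm fcma dcpop sve"
)

only_fugaku = sorted(parse_features(FUGAKU) - parse_features(ISAMBARD))
only_isambard = sorted(parse_features(ISAMBARD) - parse_features(FUGAKU))

print(only_fugaku)    # ['aes', 'pmull']
print(only_isambard)  # []
```

The same helper can be pointed at the Features line of any other A64FX system to check whether it reports aes and pmull.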

boegel commented 3 years ago

Could this difference be due to Fugaku being an FX1000 system, while Isambard 2 is an FX700? Cfr. https://www.fujitsu.com/global/products/computing/servers/supercomputer/specifications/

h-murai commented 3 years ago

And this is the result for the FX700 at our site. It is exactly the same as the one for Fugaku.

$ head -8 /proc/cpuinfo
processor   : 0
BogoMIPS    : 200.00
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma dcpop sve
CPU implementer : 0x46
CPU architecture: 8
CPU variant : 0x1
CPU part    : 0x001
CPU revision    : 0

christopheredsall commented 3 years ago

Might be a difference between Fujitsu FX700 and HPE Apollo 80

tomgreen66 commented 3 years ago

I help support Isambard. I have seen some references suggesting these are optional in the ARM spec. Will try to get a definitive answer.

giordano commented 2 years ago

I also noticed the same difference between Isambard and Fugaku when checking feature detection in Julia: https://github.com/JuliaLang/julia/pull/44194. For what it's worth, LLVM doesn't seem to enable AES for A64FX: https://github.com/llvm/llvm-project/blob/5c116d50e42f93ef4fa9a9719121378ec6b99b69/llvm/lib/Target/AArch64/AArch64.td#L997-L999

giordano commented 1 year ago

Uhm, on Isambard I'm still seeing

$ archspec cpu
aarch64
$ python3 -c "import archspec; print(archspec.__version__)"
0.1.4

:confused:

boegel commented 1 year ago

@alalazo Any idea why this wasn't resolved via #44?

giordano commented 1 year ago

> @alalazo Any idea why this wasn't resolved via https://github.com/archspec/archspec-json/pull/44?

I wonder if it is because I didn't remove pmull. For the record, I have the same problem also on Ookami, which also has the ~FX700 system~ HPE Apollo 80. Feels like only FX1000 has aes+pmull.

tomgreen66 commented 1 year ago

Sounds like this has finally been fixed. I can look at any Spack issues you experience on Isambard; particular modules are expected for Cray support. I have been using a local copy with aes and pmull removed. We are now looking towards the next iteration of Isambard next year.

giordano commented 1 year ago

@tomgreen66 as mentioned in https://github.com/archspec/archspec-json/pull/59#issuecomment-1339404329, the problem with archspec misdetecting the CPU seems to be solved by removing pmull, but Spack is still getting it wrong. Does your spack detect the system as a64fx or generic aarch64?

tomgreen66 commented 1 year ago

Hi - just checked, after editing archspec in the latest pull of Spack to remove pmull (Spack includes its own revision of archspec). It seems Spack supports this Cray backend and frontend feature. Just running spack arch detects aarch64, e.g.

$ . spack/share/spack/setup-env.sh
$ spack arch
cray-rhel8-aarch64

However specifying frontend I get:

$ spack arch -f
cray-rhel8-a64fx
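
The two outputs above follow Spack's platform-os-target naming. A small Python sketch, with `ArchTriplet` and `parse_arch` as hypothetical names invented for this illustration (they are not Spack APIs), shows that only the target component differs between the default and the frontend result:

```python
from typing import NamedTuple


class ArchTriplet(NamedTuple):
    platform: str
    os: str
    target: str


def parse_arch(triplet):
    """Split a 'platform-os-target' string as printed by `spack arch`."""
    platform, os_name, target = triplet.split("-", 2)
    return ArchTriplet(platform, os_name, target)


# Both strings are copied from the `spack arch` output in this comment.
default = parse_arch("cray-rhel8-aarch64")
frontend = parse_arch("cray-rhel8-a64fx")

# Collect the names of the fields whose values differ.
differing = [
    field for field in ArchTriplet._fields
    if getattr(default, field) != getattr(frontend, field)
]
print(differing)  # ['target']
```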

I believe the backend logic in Spack relies on expectations about the version of the Cray software, the modules available, etc. We plan to upgrade the Cray environment on A64fx next week to 22.06 which might get Spack to detect backend better.

However you can just force the target - I used to do it like here: https://github.com/gw4-isambard/spack-configs/blob/1_isambard_basic/ISAMBARD/cray/packages.yaml

e.g.

spack install zlib arch=cray-rhel8-a64fx

However clingo bootstrapping in Spack seems to have issues on a64fx. Happy to help further or open a Spack issue to cover this.

antoine-morvan commented 1 year ago

> @alalazo Any idea why this wasn't resolved via #44?
>
> I wonder if it is because I didn't remove pmull. For the record, I have the same problem also on Ookami, which also has the ~FX700 system~ HPE Apollo 80. Feels like only FX1000 has aes+pmull.

Hello,

Just to add some details: we have some FX700 nodes at Atos, and they have neither AES nor PMull. After confirming with Fujitsu, it is related to the revision of the processor. The revision we have (revision C) does not have AES or PMull. Some other revisions of the FX700 have them (at least Fujitsu has one revision with them). I am still looking for an official statement from Fujitsu about these revisions.

Latest versions of archspec/spack do not detect the CPU properly.

$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              48
On-line CPU(s) list: 0-47
Thread(s) per core:  1
Core(s) per socket:  12
Socket(s):           4
NUMA node(s):        4
Vendor ID:           0x46
Model:               0
Stepping:            0x1
BogoMIPS:            200.00
NUMA node0 CPU(s):   0-11
NUMA node1 CPU(s):   12-23
NUMA node2 CPU(s):   24-35
NUMA node3 CPU(s):   36-47
Flags:               fp asimd evtstrm sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma dcpop sve

$ archspec --version
archspec, version 0.2.0
$ archspec cpu
aarch64

$ spack arch
linux-rhel8-aarch64
$ spack arch -f
linux-rhel8-aarch64

But next release including latest merges should fix that :)

Cheers.

giordano commented 1 year ago

@tomgreen66

> We plan to upgrade the Cray environment on A64fx next week to 22.06 which might get Spack to detect backend better.

Any update on this? I still don't see the new Cray environment on the A64FX partition, nor in any of the other Isambard partitions, where it'd also be great to have Spack play nice with the Cray environments.

tomgreen66 commented 1 year ago

Hi - maybe off-topic, but just a quick update: it was a bit trickier to update the A64fx image than I had hoped, so I had to call in external support. Fingers crossed it will be updated in the coming days!