pytorch / cpuinfo

CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)
BSD 2-Clause "Simplified" License

Add detection for Intel Advanced Matrix Extensions (AMX) instructions #231

Closed mingfeima closed 3 months ago

mingfeima commented 3 months ago

Tested using the Intel Software Development Emulator (SDE): https://www.intel.com/content/www/us/en/download/684897/intel-software-development-emulator.html

Test scripts:

bash scripts/local-build.sh

ISAS=()
OPTIONS=()
PLATFORMS=()

OPTIONS+=(-quark); PLATFORMS+=("Quark")
OPTIONS+=(-p4); PLATFORMS+=("Pentium4")
OPTIONS+=(-p4p); PLATFORMS+=("Pentium4 Prescott")
OPTIONS+=(-mrm); PLATFORMS+=("Merom")
OPTIONS+=(-pnr); PLATFORMS+=("Penryn")
OPTIONS+=(-nhm); PLATFORMS+=("Nehalem")
OPTIONS+=(-wsm); PLATFORMS+=("Westmere")
OPTIONS+=(-snb); PLATFORMS+=("Sandy Bridge")
OPTIONS+=(-ivb); PLATFORMS+=("Ivy Bridge")
OPTIONS+=(-hsw); PLATFORMS+=("Haswell")
OPTIONS+=(-bdw); PLATFORMS+=("Broadwell")
OPTIONS+=(-slt); PLATFORMS+=("Saltwell")
OPTIONS+=(-slm); PLATFORMS+=("Silvermont")
OPTIONS+=(-glm); PLATFORMS+=("Goldmont")
OPTIONS+=(-glp); PLATFORMS+=("Goldmont Plus")
OPTIONS+=(-tnt); PLATFORMS+=("Tremont")
OPTIONS+=(-snr); PLATFORMS+=("Snow Ridge")
OPTIONS+=(-skl); PLATFORMS+=("Skylake")
OPTIONS+=(-cnl); PLATFORMS+=("Cannon Lake")
OPTIONS+=(-icl); PLATFORMS+=("Ice Lake")
OPTIONS+=(-skx); PLATFORMS+=("Skylake server")
OPTIONS+=(-clx); PLATFORMS+=("Cascade Lake")
OPTIONS+=(-cpx); PLATFORMS+=("Cooper Lake")
OPTIONS+=(-icx); PLATFORMS+=("Ice Lake server")
OPTIONS+=(-knl); PLATFORMS+=("Knights landing")
OPTIONS+=(-knm); PLATFORMS+=("Knights mill")
OPTIONS+=(-tgl); PLATFORMS+=("Tiger Lake")
OPTIONS+=(-adl); PLATFORMS+=("Alder Lake")
OPTIONS+=(-mtl); PLATFORMS+=("Meteor Lake")
OPTIONS+=(-rpl); PLATFORMS+=("Raptor Lake")
OPTIONS+=(-spr); PLATFORMS+=("Sapphire Rapids")
OPTIONS+=(-gnr); PLATFORMS+=("Granite Rapids")
OPTIONS+=(-gnr256); PLATFORMS+=("Granite Rapids (AVX10.1 / 256VL)")
OPTIONS+=(-srf); PLATFORMS+=("Sierra Forest")
OPTIONS+=(-arl); PLATFORMS+=("Arrow Lake")
OPTIONS+=(-lnl); PLATFORMS+=("Lunar Lake")
OPTIONS+=(-future); PLATFORMS+=("Future chip")

ISAS+=("AMXBF16")
ISAS+=("AMXTILE")
ISAS+=("AMXINT8")
ISAS+=("AMXFP16")

SDE_BIN="/home/mingfeim/packages/sde-external-9.33.0-2024-01-07-lin/sde"

# For each emulated platform, run isa-info under SDE and print only the AMX lines.
for I in "${!PLATFORMS[@]}"; do
  echo "${PLATFORMS[$I]}"
  for J in "${!ISAS[@]}"; do
    "${SDE_BIN}" "${OPTIONS[$I]}" -- ./build/local/isa-info | grep "${ISAS[$J]}"
  done
done

Results:

Quark
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM or by the input cpuid definition file
Pentium4
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM4 or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM4 or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM4 or by the input cpuid definition file
SDE-ERROR: 64 bits applications are not supported by input chip: PENTIUM4 or by the input cpuid definition file
Pentium4 Prescott
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Merom
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Penryn
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Nehalem
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Westmere
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Sandy Bridge
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Ivy Bridge
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Haswell
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Broadwell
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Saltwell
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Silvermont
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Goldmont
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Goldmont Plus
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Tremont
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Snow Ridge
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Skylake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Cannon Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Ice Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Skylake server
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Cascade Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Cooper Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Ice Lake server
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Knights landing
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Knights mill
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Tiger Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Alder Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Meteor Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Raptor Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Sapphire Rapids
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: no
Granite Rapids
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: yes
Granite Rapids (AVX10.1 / 256VL)
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: yes
Sierra Forest
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Arrow Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Lunar Lake
        AMXBF16: no
        AMXTILE: no
        AMXINT8: no
        AMXFP16: no
Future chip
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: yes
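
The results above can also be condensed into a small table in C, which makes it easy to check properties such as "no platform reports amx-fp16 without amx-int8". This is only a sketch: the `AMX_*` flag names, `struct sde_result`, and both helper functions are hypothetical and are not cpuinfo identifiers. It also assumes that a compute extension reported without AMX-TILE should count as inconsistent. Only the rows that reported at least one "yes" are recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names for this sketch; they are not cpuinfo identifiers. */
enum {
    AMX_BF16 = 1 << 0,
    AMX_TILE = 1 << 1,
    AMX_INT8 = 1 << 2,
    AMX_FP16 = 1 << 3
};

struct sde_result {
    const char *platform; /* name as echoed by the test script */
    unsigned flags;       /* OR of the AMX_* values reported as "yes" */
};

/* The only rows in the SDE results that report any "yes". */
static const struct sde_result amx_results[] = {
    {"Sapphire Rapids", AMX_BF16 | AMX_TILE | AMX_INT8},
    {"Granite Rapids", AMX_BF16 | AMX_TILE | AMX_INT8 | AMX_FP16},
    {"Granite Rapids (AVX10.1 / 256VL)", AMX_BF16 | AMX_TILE | AMX_INT8 | AMX_FP16},
    {"Future chip", AMX_BF16 | AMX_TILE | AMX_INT8 | AMX_FP16},
};

static const size_t amx_results_count = sizeof(amx_results) / sizeof(amx_results[0]);

/* Assumption: a compute extension without AMX-TILE is treated as inconsistent. */
static bool amx_flags_consistent(unsigned flags) {
    const unsigned compute = AMX_BF16 | AMX_INT8 | AMX_FP16;
    return (flags & compute) == 0 || (flags & AMX_TILE) != 0;
}

/* Count recorded platforms that report amx-fp16 without amx-int8. */
static size_t count_fp16_without_int8(const struct sde_result *rows, size_t n) {
    assert(rows != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if ((rows[i].flags & AMX_FP16) && !(rows[i].flags & AMX_INT8)) {
            count++;
        }
    }
    return count;
}
```

Calling `count_fp16_without_int8(amx_results, amx_results_count)` walks the recorded rows, and `amx_flags_consistent` can be applied to each row's `flags` as a sanity check on new SDE runs.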
mingfeima commented 3 months ago

@malfet @xuhancn @jgong5 could you please help review this one? Thx!

mingfeima commented 3 months ago

> The indentation looks problematic (probably tabs vs. spaces). Otherwise LGTM.

Fixed!

mingfeima commented 3 months ago

> Overall LGTM, but let's add a separator to those names to match how it's done for AVX512.

> Another question: are there CPUs on the market that claim AMX fp16 support but not AMX int8 support?

> It would be good to add a much more detailed description, with links back to the docs explaining what those extensions do and which CPUs support them.

Currently there are no platforms that support amx-fp16 but not amx-int8. I put a note before the AMX detection functions:

/* [NOTE] Intel Advanced Matrix Extensions (AMX) detection
 *
 * I.  AMX is a new extension to the x86 ISA for working on matrices. It consists of
 *   1) 2-dimensional registers (tiles), which hold sub-matrices from larger matrices in memory
 *   2) An accelerator called Tile Matrix Multiply (TMUL), which contains instructions operating on tiles
 *
 * II. Platforms that support AMX:
 * +-----------------+-----+----------+----------+----------+----------+
 * |    Platforms    | Gen | amx-bf16 | amx-tile | amx-int8 | amx-fp16 |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Sapphire Rapids | 4th |   YES    |   YES    |   YES    |    NO    |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Emerald Rapids  | 5th |   YES    |   YES    |   YES    |    NO    |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Granite Rapids  | 6th |   YES    |   YES    |   YES    |   YES    |
 * +-----------------+-----+----------+----------+----------+----------+
 *
 * Reference: https://www.intel.com/content/www/us/en/products/docs
 *    /accelerator-engines/advanced-matrix-extensions/overview.html
 */

@malfet If you find a better place to put this note, please let me know!
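
For readers who want to see what the detection boils down to, here is a minimal sketch that decodes the four AMX feature bits from raw CPUID register values. The bit positions follow the Intel SDM: CPUID leaf 7, subleaf 0 reports AMX-BF16, AMX-TILE and AMX-INT8 in EDX bits 22, 24 and 25, and leaf 7, subleaf 1 reports AMX-FP16 in EAX bit 21. The `struct amx_features` type and the `decode_amx` and `amx_decode_selfcheck` functions are hypothetical names for illustration; this thread does not show how the PR itself structures the code. The sketch also leaves out the check that the OS has enabled tile state (XCR0 bits 17 and 18), which real code needs before executing AMX instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; cpuinfo's own structures differ. */
struct amx_features {
    bool tile;
    bool bf16;
    bool int8;
    bool fp16;
};

/*
 * Decode AMX feature bits from raw CPUID register values.
 *   leaf7_sub0_edx: EDX returned by CPUID with EAX=7, ECX=0
 *   leaf7_sub1_eax: EAX returned by CPUID with EAX=7, ECX=1
 *                   (pass 0 if subleaf 1 is not available)
 */
static struct amx_features decode_amx(uint32_t leaf7_sub0_edx, uint32_t leaf7_sub1_eax) {
    struct amx_features f;
    f.bf16 = ((leaf7_sub0_edx >> 22) & 1u) != 0; /* AMX-BF16 */
    f.tile = ((leaf7_sub0_edx >> 24) & 1u) != 0; /* AMX-TILE */
    f.int8 = ((leaf7_sub0_edx >> 25) & 1u) != 0; /* AMX-INT8 */
    f.fp16 = ((leaf7_sub1_eax >> 21) & 1u) != 0; /* AMX-FP16 */
    return f;
}

/* Self-check against the Sapphire Rapids row of the table in the note. */
static void amx_decode_selfcheck(void) {
    const uint32_t edx = (UINT32_C(1) << 22) | (UINT32_C(1) << 24) | (UINT32_C(1) << 25);
    const struct amx_features spr = decode_amx(edx, 0);
    assert(spr.bf16 && spr.tile && spr.int8 && !spr.fp16);
}
```

Keeping the decoder a pure function of register values means it can be exercised with synthetic inputs on any machine, in the same spirit as running `isa-info` under SDE for platforms that are not physically available.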