BLAKE3-team / BLAKE3

the official Rust and C implementations of the BLAKE3 cryptographic hash function
Apache License 2.0

CPU feature detection not detecting AVX512 for "Intel Xeon W" processor #232

Open akyrtzi opened 2 years ago

akyrtzi commented 2 years ago

It seems to me that BLAKE3's CPU feature detection is not working as expected. I have an "Intel Xeon W" processor (on macOS), and sysctl reports that AVX512 is supported:

$ sysctl -a | grep machdep.cpu.leaf7_features 
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 FDPEO SMEP BMI2 ERMS INVPCID RTM PQM FPU_CSDS MPX PQE AVX512F AVX512DQ RDSEED ADX SMAP CLFSOPT CLWB IPT AVX512CD AVX512BW AVX512VL MDCLEAR TSXFA IBRS STIBP L1DF ACAPMSR SSBD

However, the checks for AVX512F and AVX512VL in the cpu_feature function fail (it reports that these features don't exist), so BLAKE3 dispatches to the AVX2 implementation instead of the AVX512 one. When I modify the cpu_feature function to force it to report that AVX512 is available, the AVX512 implementation gets used and gives a further boost in performance.
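
For readers following along, here is a minimal sketch of how a check like this can come out negative even though the CPU advertises the feature. It is not BLAKE3's actual code. It assumes (this is not stated in the thread) that the dispatcher requires both the CPUID leaf 7 feature bits and the AVX-512 state bits in XCR0, and that macOS leaves those XCR0 bits clear until a thread first uses AVX-512. The helper `avx512_usable` and its `os_enables_on_demand` flag are hypothetical names; only the bit positions come from the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 7 (subleaf 0) EBX feature bits, per the Intel SDM. */
#define LEAF7_EBX_AVX512F  (UINT32_C(1) << 16)
#define LEAF7_EBX_AVX512VL (UINT32_C(1) << 31)

/* XCR0 state-component bits: SSE (bit 1) and AVX (bit 2), plus
   opmask (bit 5), ZMM_Hi256 (bit 6) and Hi16_ZMM (bit 7) for AVX-512. */
#define XCR0_AVX_STATE    UINT64_C(0x06)
#define XCR0_AVX512_STATE UINT64_C(0xE6)

/* Hypothetical helper, not part of BLAKE3: decides whether an AVX-512
   code path may be selected, given values the caller has already read
   with CPUID (leaf 7 EBX) and XGETBV (XCR0).
   os_enables_on_demand models the ASSUMED macOS behaviour of enabling
   the AVX-512 state lazily, so XCR0 cannot be trusted for it there. */
static bool avx512_usable(uint32_t leaf7_ebx, uint64_t xcr0,
                          bool os_enables_on_demand)
{
    const uint32_t need = LEAF7_EBX_AVX512F | LEAF7_EBX_AVX512VL;

    /* The CPU itself must advertise both features. */
    if ((leaf7_ebx & need) != need) {
        return false;
    }
    /* The OS must at least save and restore SSE/AVX register state. */
    if ((xcr0 & XCR0_AVX_STATE) != XCR0_AVX_STATE) {
        return false;
    }
    /* Assumed workaround: skip the AVX-512 XCR0 check on such an OS. */
    if (os_enables_on_demand) {
        return true;
    }
    /* Strict check: all AVX-512 state components enabled in XCR0. */
    return (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
}
```

Under these assumptions, a CPU with both feature bits set but an XCR0 value of 0x07 fails the strict check and passes only when `os_enables_on_demand` is true, which would match the behaviour described above. Whether skipping the check is safe depends on the OS correctly preserving AVX-512 state afterwards.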

oconnor663 commented 2 years ago

@sneves what do you think?

sneves commented 2 years ago

MacOS issue.

akyrtzi commented 2 years ago

NumPy fixed the same issue here: https://github.com/numpy/numpy/pull/19362

sneves commented 2 years ago

This other MacOS bug makes me think that AVX-512 being effectively disabled there is probably for the best.

akyrtzi commented 2 years ago

> This other MacOS bug makes me think that AVX-512 being effectively disabled there is probably for the best.

This has been fixed in macOS 12.2 (21D49), according to https://github.com/golang/go/issues/49233