Open valassi opened 3 years ago
Note: Juwels Booster also has Zen2 with only AVX2 (actually a 7402 and not a 7302, but still Zen2); see PR #381
A summary of results at Juwels Booster is here (thanks to @roiser for getting access!) https://github.com/madgraph5/madgraph4gpu/blob/a69d7f9ea37dd6445cd375e6b29a33f6a884e681/epochX/cudacpp/tput/summaryTable_juwels.txt#L50
*** FPTYPE=d ******************************************************************
+++ REVISION df441ad +++
On jwb0085.juwels [CPU: AMD EPYC 7402 24-Core Processor] [GPU: 4x NVIDIA A100-SXM4-40GB]:
[nvcc 11.5.50 (gcc 11.2.0)]
HELINL=0 HRDCOD=0
eemumu ggtt ggttg ggttgg ggttggg
CUD/none 1.57e+09 1.69e+08 2.37e+07 9.45e+05 2.04e+04
CPP/none 2.26e+06 2.63e+05 2.89e+04 2.17e+03 9.11e+01
CPP/sse4 4.34e+06 3.94e+05 5.41e+04 4.15e+03 1.72e+02
CPP/avx2 8.58e+06 8.69e+05 1.26e+05 9.92e+03 3.68e+02
*** FPTYPE=f ******************************************************************
+++ REVISION df441ad +++
On jwb0085.juwels [CPU: AMD EPYC 7402 24-Core Processor] [GPU: 4x NVIDIA A100-SXM4-40GB]:
[nvcc 11.5.50 (gcc 11.2.0)]
HELINL=0 HRDCOD=0
eemumu ggtt ggttg ggttgg ggttggg
CUD/none 3.80e+09 4.78e+08 5.73e+07 1.80e+06 3.74e+04
CPP/none 2.36e+06 2.74e+05 3.09e+04 2.30e+03 9.95e+01
CPP/sse4 8.72e+06 6.15e+05 1.08e+05 9.14e+03 3.84e+02
CPP/avx2 1.74e+07 1.50e+06 2.50e+05 1.98e+04 7.36e+02
There are nice speedups from AVX2 over the non-vectorized build: roughly x4 for FPTYPE=d and x8 for FPTYPE=f (there is no AVX512 on this CPU).
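As a quick cross-check of those factors, here is a minimal Python sketch that recomputes the CPP speedups from the numbers pasted above. The throughput values are copied from the two tables; the names `PROCESSES`, `THROUGHPUT` and `speedups` are made up for this illustration and are not part of the repository.

```python
# Throughputs copied from the Juwels Booster tables above (AMD EPYC 7402).
# All names here are hypothetical helpers for this illustration only.
PROCESSES = ["eemumu", "ggtt", "ggttg", "ggttgg", "ggttggg"]

THROUGHPUT = {
    "d": {
        "none": [2.26e+06, 2.63e+05, 2.89e+04, 2.17e+03, 9.11e+01],
        "sse4": [4.34e+06, 3.94e+05, 5.41e+04, 4.15e+03, 1.72e+02],
        "avx2": [8.58e+06, 8.69e+05, 1.26e+05, 9.92e+03, 3.68e+02],
    },
    "f": {
        "none": [2.36e+06, 2.74e+05, 3.09e+04, 2.30e+03, 9.95e+01],
        "sse4": [8.72e+06, 6.15e+05, 1.08e+05, 9.14e+03, 3.84e+02],
        "avx2": [1.74e+07, 1.50e+06, 2.50e+05, 1.98e+04, 7.36e+02],
    },
}


def speedups(fptype, simd):
    """Return the per-process ratio of CPP/<simd> to CPP/none throughput."""
    base = THROUGHPUT[fptype]["none"]
    vec = THROUGHPUT[fptype][simd]
    return {proc: v / b for proc, v, b in zip(PROCESSES, vec, base)}


if __name__ == "__main__":
    for fptype in ("d", "f"):
        for simd in ("sse4", "avx2"):
            ratios = speedups(fptype, simd)
            row = "  ".join(f"{proc}={r:.2f}" for proc, r in ratios.items())
            print(f"FPTYPE={fptype} {simd}/none: {row}")
```

The ratios vary somewhat from process to process, so the x4 and x8 figures should be read as approximate.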
Note that the non-vectorized C++ performance in this single-threaded test is halfway between that of an Intel Silver and an Intel Gold, for instance for ggttggg CPP/none double:
It would be useful to build and test on AMD x86 CPUs and not only Intel.
Thanks to @lfield and his colleagues I have had access to an AMD EPYC at CERN. The results are in PR #238.
Note that that node does not yet support AVX512. It is an AMD EPYC 7302, so apparently a Zen2 https://en.wikichip.org/wiki/amd/epyc/7302
Instead AVX512 will be supported in 2021 by Zen4 https://www.techpowerup.com/279129/amd-zen-4-microarchitecture-to-support-avx-512
Anyway, the results on this older Zen2 are already quite interesting. Thanks again Laurence!
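For anyone who wants to check what a given node supports before building, here is a minimal Python sketch that reads the CPU flags on Linux. It assumes `/proc/cpuinfo` is available and uses the kernel flag names `sse4_2`, `avx2` and `avx512f`; mapping those flags to the `sse4`, `avx2` and AVX512 builds of this project is my assumption, and the helper names `simd_levels` and `read_cpu_flags` are invented for the example.

```python
def simd_levels(flags_line):
    """Report which SIMD levels appear in a space-separated cpuinfo flags string.

    Assumption: 'sse4_2', 'avx2' and 'avx512f' are the relevant Linux flag
    names for the sse4, avx2 and AVX512 builds discussed in this issue.
    """
    flags = set(flags_line.split())
    return {
        "sse4": "sse4_2" in flags,
        "avx2": "avx2" in flags,
        "avx512": "avx512f" in flags,
    }


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flags of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                # Line looks like "flags\t\t: fpu vme ... avx2 ..."
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    print(simd_levels(read_cpu_flags()))
```

On a Zen2 node such as the EPYC 7302 or 7402, one would expect this to report AVX2 but not AVX512, in line with the observations above.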