Fixing language surrounding device peak performance

olcf / olcf-user-docs

Sources for the Oak Ridge Leadership Computing Facility User Documentation

https://docs.olcf.ornl.gov

Fixing language surrounding device peak performance #868

Closed hagertnl closed 3 months ago

hagertnl commented 3 months ago

The 26.5 TFLOPS per GCD figure is outdated. Back in 2022 we lowered the GPU's compute frequency to 1700 MHz, which brought the peak down from 26.5 to 23.9 TFLOPS per GCD. I've also updated the language to specify that the matrix cores have a peak FLOP rate 2x higher than the vector units. This is documented in the CU diagram and in the roofline profiling section.
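
For anyone checking the arithmetic, here is a rough sketch of how the 23.9 figure falls out of the 1700 MHz clock. The 1700 MHz frequency and the 2x matrix-to-vector ratio come from the comment above; the 110 compute units per GCD, the 64 FP64 lanes per compute unit, and the 2 FLOPs per fused multiply-add are assumptions not stated in this PR, and the function names are hypothetical.

```python
# Back-of-the-envelope peak FP64 throughput for one GCD.
# Assumptions (not stated in the PR): 110 CUs per GCD, 64 FP64 lanes per CU,
# and 2 FLOPs per lane per cycle (one fused multiply-add).
CUS_PER_GCD = 110
FP64_LANES_PER_CU = 64
FLOPS_PER_FMA = 2


def peak_vector_tflops(clock_mhz: float) -> float:
    """Peak vector FP64 TFLOPS for one GCD at the given compute clock."""
    flops_per_cycle = CUS_PER_GCD * FP64_LANES_PER_CU * FLOPS_PER_FMA
    return flops_per_cycle * clock_mhz * 1e6 / 1e12


def peak_matrix_tflops(clock_mhz: float) -> float:
    """Peak matrix-core TFLOPS, using the 2x ratio stated in the PR."""
    return 2 * peak_vector_tflops(clock_mhz)


if __name__ == "__main__":
    vector = peak_vector_tflops(1700)
    matrix = peak_matrix_tflops(1700)
    print(f"vector: {vector:.1f} TFLOPS, matrix: {matrix:.1f} TFLOPS")
```

Under those assumptions the vector peak at 1700 MHz rounds to 23.9 TFLOPS, matching the updated number, and the matrix-core peak is twice that.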