Open jeffhammond opened 2 years ago
Hi Jeff - I worked on this benchmark back in the day (2014?) and a large part of our efforts were focused on keeping the compiler from optimizing out computation, sometimes to the point of adding redundant and indirect references. It turns out that compilers are actually really good at optimizing "useless" code, so I'm not surprised that it is reporting erroneous results for the M1.
Realistically, some of the feeds and speeds (i.e., the bandwidth tests) are still useful, but SHOC has not been fully updated in some time. As such, I'd be very wary of any results from the MaxFlops test unless/until a new version of the code is released.
The MaxFlops test reports >100 petaflop/s FP32 for Apple M1, which falsely suggests that my M1 Air laptop is the second most powerful supercomputer in Europe.
Disabling compiler optimizations (below) makes the problem smaller but does not fix it: the results are still 4-17 teraflop/s FP32, which is greater than the performance claimed by Apple or reported by third parties.
It would seem that the benchmark should be modified to prevent compilers from removing large portions of the computation.
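For illustration only, here is a minimal host-side C++ sketch of the kind of guard I have in mind. It is not SHOC's code: `fma_chain`, `g_sink`, `Measurement`, `measure`, and `plausible` are all hypothetical names. The idea is that the arithmetic feeds a value the compiler must actually produce (a store to a volatile object), and the reported rate is compared against a peak that the caller supplies from vendor documentation. It assumes `a` and `b` come from run-time input such as the command line, and that fast-math style flags are off.

```cpp
#include <chrono>

// Hypothetical kernel, not SHOC's: one dependent multiply-add per iteration.
// With a and b known only at run time, the loop cannot be folded to a constant.
float fma_chain(float x, float a, float b, long iters) {
    for (long i = 0; i < iters; ++i) {
        x = x * a + b;  // counted as 2 floating-point operations
    }
    return x;
}

// A store to a volatile object is an observable side effect, so the value
// being stored has to be computed rather than discarded as dead code.
volatile float g_sink = 0.0f;

struct Measurement {
    double seconds;     // wall-clock time of the timed region
    double flop_count;  // operations we claim to have executed
    double gflops;      // flop_count / seconds / 1e9, or 0 if seconds == 0
};

Measurement measure(float a, float b, long iters) {
    auto t0 = std::chrono::steady_clock::now();
    g_sink = fma_chain(1.0f, a, b, iters);  // result is consumed, not dropped
    auto t1 = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(t1 - t0).count();
    double flop_count = 2.0 * static_cast<double>(iters);
    double gflops = seconds > 0.0 ? flop_count / seconds / 1e9 : 0.0;
    return {seconds, flop_count, gflops};
}

// Sanity check: a rate above the hardware peak (supplied by the caller)
// means computation was removed, not that the machine is fast.
bool plausible(double gflops, double peak_gflops) {
    return gflops <= peak_gflops;
}
```

A single dependent chain like this measures latency rather than peak throughput, so it is only meant to show the two guards (consume the result, check against a known peak), not how MaxFlops should be structured. A real fix would need the equivalent inside the OpenCL/CUDA kernels, for example by writing each work-item's final value to an output buffer that the host reads back.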