coduin / epiphany-bsp

BSP implementation for the Parallella; the world's smallest supercomputer
https://jwbuurlage.github.io/epiphany-bsp/
GNU General Public License v3.0

Explain low r value from bsp-bench #2

Closed by usewits 9 years ago

usewits commented 9 years ago

Measurements of the computing rate r with bsp-bench are too low. This is probably due to an error in bsp-bench.

Tombana commented 9 years ago

This was caused by using the wrong timer, which measured CPU time instead of wall time. With the correct timer we found 11.9 Mflop/s per core, which seems reasonable enough. Furthermore, when comparing the wall timer with the clock-cycle timer on the Epiphany chip, the results corresponded to a clock speed of 600 MHz.
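To illustrate the difference between the two kinds of timer, here is a minimal host-side sketch in C. It is not the bsp-bench code: the helper names `cpu_seconds`, `wall_seconds` and `sleep_ms` are made up for this example, and it assumes a POSIX system that provides `clock_gettime` and `nanosleep`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime and nanosleep */
#endif
#include <time.h>

/* CPU time consumed by this process so far, in seconds. */
double cpu_seconds(void) {
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Wall-clock time read from a monotonic clock, in seconds. */
double wall_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Sleep for ms milliseconds. Sleeping advances wall time
   but consumes almost no CPU time. */
void sleep_ms(long ms) {
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&req, NULL);
}
```

Taking both readings before and after `sleep_ms(200)` should show the wall-clock difference at about 0.2 s while the CPU-time difference stays close to zero, so a rate computed from the wrong timer can be far off.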

usewits commented 9 years ago

11.9 Mflop/s per core is still a lot lower than expected at 600 MHz; it corresponds to roughly 50 clock cycles per flop.
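The cycles-per-flop estimate is just the clock rate divided by the measured rate. A small sketch of that arithmetic in C, where the function name `cycles_per_flop` is made up for illustration:

```c
/* Clock cycles spent per floating-point operation, given the
   clock rate in MHz and the measured rate in Mflop/s.
   The mega prefixes cancel, so no unit conversion is needed. */
double cycles_per_flop(double clock_mhz, double mflops) {
    return clock_mhz / mflops;
}
```

Calling `cycles_per_flop(600.0, 11.9)` gives a little over 50, which matches the figure quoted above.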

usewits commented 9 years ago

Using the -O3 compiler flag turned out to matter more than expected for performance in a simple flops measurement loop! We now measure 152 Mflop/s per core, which corresponds to roughly 4 clock cycles per flop, a very realistic figure.
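For anyone who wants to see the effect of the optimization level themselves, below is a rough sketch of a simple flops measurement loop in C. It is not the loop used in bsp-bench and it runs on the host rather than on the Epiphany cores. The names `flops_sample` and `measure_flops` are made up for this example, and it assumes a POSIX `clock_gettime`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime */
#endif
#include <time.h>

typedef struct {
    double seconds;  /* elapsed wall-clock time */
    double mflops;   /* measured rate in Mflop/s */
    float  result;   /* final value of the computation */
} flops_sample;

flops_sample measure_flops(long iterations) {
    struct timespec t0, t1;
    volatile float seed = 1.0f;  /* volatile read: value unknown at compile time */
    volatile float sink;         /* volatile write: the result must be computed */
    const float a = 0.999999f, b = 1e-6f;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    float x = seed;
    for (long i = 0; i < iterations; i++) {
        x = x * a + b;           /* one multiply and one add: 2 flops */
    }
    sink = x;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    flops_sample s;
    s.seconds = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    s.mflops  = s.seconds > 0.0
              ? (2.0 * (double)iterations) / (s.seconds * 1e6)
              : 0.0;
    s.result  = sink;
    return s;
}
```

Building the same file once without optimization and once with `-O3`, then comparing `mflops` for the same iteration count, gives a feel for how much the flag matters. The absolute numbers on a host CPU will of course differ from those on the Epiphany.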