ANL-CESAR / RSBench

A mini-app to represent the multipole resonance representation lookup cross section algorithm.
MIT License

SYCL "simulation only" runtime statistics misleading #8

Open jtramm opened 5 years ago

jtramm commented 5 years ago

For the SYCL version, we currently report two sets of runtime statistics: one covering kernel initialization / JIT compilation plus execution, and one meant to cover only the actual execution. On certain systems the second set can come out misleading, e.g.:

Total Time Statistics (SYCL+OpenCL Init / JIT Compilation + Simulation Kernel)
Runtime:                XXXXXXX seconds
Lookups:               XXXXXXXXXX
Lookups/s:            XXXXXXXXXX
Simulation Kernel Only Statistics
Runtime:               0.00001 seconds
Lookups/s:             1,000,000,000,000,000
Verification checksum: (Valid)

Timing things the way we do now relies on assumptions about the asynchronous behavior of SYCL that do not appear to hold in all cases, with all compilers, on all machines. Instead, we should time only the total runtime.
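
To illustrate the kind of assumption that can break, here is a minimal sketch in plain C++ (no SYCL involved). `std::async` is used as a stand-in for an asynchronous kernel submission, and `submit_fake_kernel` and `time_fake_kernel` are hypothetical names, not RSBench code. If the timer is stopped right after submission returns, it may capture only the launch cost; the work is only known to be finished after an explicit wait.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for an asynchronous kernel launch (not RSBench code).
// It returns a handle right away while the "kernel" sleeps in the background.
inline std::future<void> submit_fake_kernel(std::chrono::milliseconds work) {
    return std::async(std::launch::async,
                      [work] { std::this_thread::sleep_for(work); });
}

struct Timings {
    double submit_only_s;  // elapsed time when submission returned
    double with_wait_s;    // elapsed time once the work had finished
};

inline Timings time_fake_kernel(std::chrono::milliseconds work) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    std::future<void> done = submit_fake_kernel(work);
    const auto t1 = clock::now();  // stopping here measures only the launch
    done.wait();                   // analogous to waiting on a queue or event
    const auto t2 = clock::now();
    return {std::chrono::duration<double>(t1 - t0).count(),
            std::chrono::duration<double>(t2 - t0).count()};
}
```

Calling `time_fake_kernel(std::chrono::milliseconds(100))` should give a `with_wait_s` of roughly 0.1 s or more, while `submit_only_s` is typically far smaller. How a real SYCL runtime behaves here depends on the implementation, which is the point of the issue.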

illuhad commented 1 year ago

I'm currently having issues with JIT time being included in the measurement in hipSYCL, which skews the performance results. Might I suggest running the kernel several times and ignoring the first run (and potentially averaging over the remaining runs to get a more stable result)? The difference between the first run and the others can then be a reliable measure of JIT overhead plus data transfer overhead.

EDIT: Here's a prototype https://github.com/illuhad/RSBench/commit/1fb1bf511e0940173462be344852d6112dc76447
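
For readers who want the idea without opening the commit, here is a rough sketch of the "run several times, ignore the first" approach in plain C++. It is not taken from the linked prototype. `RunStats`, `summarize_runs`, and `time_repeated` are hypothetical names, and `run_and_wait` is assumed to be a callable that launches the kernel and blocks until it has completed.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <vector>

struct RunStats {
    double first_run_s;        // first run: may include JIT and data transfer
    double steady_mean_s;      // mean over all runs after the first
    double warmup_overhead_s;  // first_run_s - steady_mean_s
};

// Split a list of per-run times into the first run and the mean of the rest.
inline RunStats summarize_runs(const std::vector<double>& run_times_s) {
    if (run_times_s.size() < 2) {
        throw std::invalid_argument("need at least two runs");
    }
    const double first = run_times_s.front();
    const double sum =
        std::accumulate(run_times_s.begin() + 1, run_times_s.end(), 0.0);
    const double mean = sum / static_cast<double>(run_times_s.size() - 1);
    return {first, mean, first - mean};
}

// Time `repeats` calls of run_and_wait. The callable is assumed to block
// until the kernel has actually finished (hypothetical interface).
inline RunStats time_repeated(const std::function<void()>& run_and_wait,
                              std::size_t repeats) {
    using clock = std::chrono::steady_clock;
    std::vector<double> times;
    times.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto t0 = clock::now();
        run_and_wait();
        const auto t1 = clock::now();
        times.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return summarize_runs(times);
}
```

With run times of 5.0, 1.0, 1.0 and 1.0 seconds, `summarize_runs` reports a steady-state mean of 1.0 s and a warm-up overhead of 4.0 s. Whether that overhead is dominated by JIT or by data transfer cannot be told apart from this measurement alone.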