Open jtramm opened 5 years ago
I'm currently having issues with the reported timings including JIT time in hipSYCL, which skews the performance results. Might I suggest running the kernel several times and ignoring the first run (and potentially averaging over the remaining runs to get a more stable result)? The difference between the first run and the others can then serve as a measure of JIT overhead plus data transfer overhead.
EDIT: Here's a prototype https://github.com/illuhad/RSBench/commit/1fb1bf511e0940173462be344852d6112dc76447
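To make the suggestion concrete, here is a minimal sketch of the "run several times, ignore the first run" idea in plain C++. It is not the code from the linked prototype; `RunTimings` and `time_with_warmup` are hypothetical names, and the sketch assumes the callable passed in blocks until the kernel has actually finished (in SYCL, for example, by waiting on the queue).

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical result type; the names are illustrative, not taken from RSBench.
struct RunTimings {
    double first_run_s;       // first run: includes JIT and initial data transfer
    double steady_mean_s;     // mean over all runs after the first
    double warmup_overhead_s; // first_run_s - steady_mean_s
};

// Calls `kernel` total_runs times and measures each call with a wall clock.
// Assumption: `kernel` blocks until the work is complete (e.g. it calls
// queue.wait() in SYCL); otherwise only the submission would be timed.
inline RunTimings time_with_warmup(const std::function<void()>& kernel,
                                   int total_runs) {
    assert(total_runs >= 2 && "need one warm-up run plus at least one timed run");

    std::vector<double> seconds;
    seconds.reserve(total_runs);
    for (int i = 0; i < total_runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        kernel();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }

    // Average every run except the first one.
    double steady_sum = std::accumulate(seconds.begin() + 1, seconds.end(), 0.0);
    double steady_mean = steady_sum / (total_runs - 1);

    return {seconds.front(), steady_mean, seconds.front() - steady_mean};
}
```

A caller would wrap the kernel submission and the blocking wait in a lambda, pass it together with a run count such as 5, and report `steady_mean_s` as the kernel time and `warmup_overhead_s` as an estimate of the JIT plus transfer cost. That estimate is only as good as the assumption that later runs do not repeat the compilation or the transfers.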
For the SYCL version, we currently report runtime statistics for both the kernel initialization / JIT compilation and the actual execution. This may cause issues on certain systems, e.g.:
Timing these phases separately, as we do now, relies on assumptions about the asynchronous behavior of SYCL that do not appear to hold for all compilers on all machines. Instead, we should time only the total runtime.