When writing a tracer tool for NVBit, if the instrumentation point is set to `IPOINT_AFTER`, the reported result misses the last instruction of every warp.
# Using provided vecadd program
# Using `IPOINT_BEFORE`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98, kernel instructions 50077, total instructions 50077
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 50077
# Using `IPOINT_AFTER`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98, kernel instructions 46941, total instructions 46941
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 46941
The vecadd program adds 100000 elements with a block size of 1024, which creates $98 \times 1024/32 = 3136$ warps. That is exactly the difference between the two runs ($50077 - 46941 = 3136$). The Accel-Sim tracing tool shows that the missing instruction is the last instruction of every warp.
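The arithmetic above can be checked with a small host-side sketch. It is not part of NVBit or the vecadd sample. The helper names (`blocks_for`, `warps_launched`, `missing_per_warp`) are made up for illustration, and it assumes a warp size of 32 and that every launched block contributes a full `block_size / 32` warps, including the partially used last block.

```cpp
#include <cassert>

// Thread blocks needed to cover n elements (ceiling division).
constexpr int blocks_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Warps launched by the grid. Assumption: every block launches all of its
// warps, even when the last block has more threads than remaining elements.
constexpr int warps_launched(int n, int block_size, int warp_size = 32) {
    return blocks_for(n, block_size) * (block_size / warp_size);
}

// Instructions missing per warp, given the two reported totals.
inline int missing_per_warp(int count_before, int count_after, int warps) {
    assert(warps > 0 && (count_before - count_after) % warps == 0);
    return (count_before - count_after) / warps;
}

// Numbers taken from the two runs reported above.
static_assert(blocks_for(100000, 1024) == 98, "98 thread blocks");
static_assert(warps_launched(100000, 1024) == 3136, "3136 warps");
static_assert(50077 - 46941 == warps_launched(100000, 1024),
              "count difference equals the number of warps");
```

With the reported totals, `missing_per_warp(50077, 46941, warps_launched(100000, 1024))` evaluates to 1, which is consistent with exactly one instruction per warp going uncounted.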
How to recreate
Modify the `instr_count` tool to instrument at `IPOINT_AFTER` instead of `IPOINT_BEFORE`.
Test environment