NVlabs / NVBit


Last instruction not instrumented when setting instrumenting position to be `IPOINT_AFTER` #125

Open William-An opened 5 months ago

William-An commented 5 months ago

Problem

When writing a tracer tool with NVBit, if the instrumentation point is set to IPOINT_AFTER, the reported result misses the last instruction of every warp.

# Using provided vecadd program
# Using `IPOINT_BEFORE`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98,  kernel instructions 50077, total instructions 50077
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 50077

# Using `IPOINT_AFTER`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98,  kernel instructions 46941, total instructions 46941
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 46941

The vecadd program adds 100000 elements with a block size of 1024, which launches $98 \times 1024 / 32 = 3136$ warps. That is exactly the difference between the two runs ($50077 - 46941 = 3136$). Tracing the same program with the Accel-Sim tracing tool shows that the missing instruction is the last instruction of every warp.
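The arithmetic behind that observation can be checked with a small sketch. The helper names (`warps_launched`, `missing_instructions`) are hypothetical and not part of NVBit. The warp size of 32 and a one-dimensional launch are assumptions taken from the numbers reported above.

```cpp
#include <cstdint>

// Assumed warp size (32 threads), matching the division used in the report.
constexpr std::uint64_t kWarpSize = 32;

// Hypothetical helper: warps launched by a 1D grid. Each block contributes
// ceil(threads_per_block / kWarpSize) warps.
constexpr std::uint64_t warps_launched(std::uint64_t blocks,
                                       std::uint64_t threads_per_block) {
    return blocks * ((threads_per_block + kWarpSize - 1) / kWarpSize);
}

// Hypothetical helper: instructions counted with IPOINT_BEFORE but not
// with IPOINT_AFTER.
constexpr std::uint64_t missing_instructions(std::uint64_t count_before,
                                             std::uint64_t count_after) {
    return count_before - count_after;
}

// Numbers taken from the two runs above: 98 thread blocks of 1024 threads.
static_assert(warps_launched(98, 1024) == 3136,
              "98 blocks of 1024 threads give 3136 warps");
static_assert(missing_instructions(50077, 46941) == 3136,
              "the gap between the two runs equals the warp count");
```

Both checks hold at compile time, which is consistent with exactly one warp-level instruction going uncounted per warp.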

How to recreate

Modify the instr_count tool to instrument at IPOINT_AFTER instead of IPOINT_BEFORE.

Test environment

  1. CUDA: 11.0
  2. CUDA Driver: 530.41.03
  3. GCC: 7.5.0
  4. OS: Ubuntu 18.04.6 LTS
William-An commented 4 months ago

It looks like taken BRA instructions are also not handled properly when the instrumentation point is set to IPOINT_AFTER.