NVlabs / NVBit

CUDA 12.0 / Driver > 510 - Unsupported? #115

Open Floruaaa666 opened 1 year ago

Floruaaa666 commented 1 year ago

I'm trying to run the instr_count tool with the vectoradd example, and it appears to hang. We're on Linux x86_64 with a GV100, CUDA 12.0, and driver 525. The README lists the supported configuration as CUDA version >= 8.0 && <= 11.x and driver version < 510. Are there plans to update NVBit to support these newer CUDA and driver versions?

In case it helps, here's the stack trace GDB shows when I interrupt the hung program:

#0  __futex_abstimed_wait_common (cancel=false, private=0, abstime=0x0, clockid=0, expected=2, futex_word=0x555555638eb8)
    at ./nptl/futex-internal.c:103
#1  __GI___futex_abstimed_wait64 (futex_word=futex_word@entry=0x555555638eb8, expected=expected@entry=2, 
    clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at ./nptl/futex-internal.c:128
#2  0x00007ffff7bbb395 in __pthread_rwlock_wrlock_full64 (abstime=0x0, clockid=0, rwlock=0x555555638eb0)
    at ./nptl/pthread_rwlock_common.c:829
#3  ___pthread_rwlock_wrlock (rwlock=0x555555638eb0) at ./nptl/pthread_rwlock_wrlock.c:26
#4  0x00007ffff6097687 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#5  0x00007ffff60f16d1 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#6  0x00007ffff7ed6a4c in __cudart667 () from ../../tools/instr_count/instr_count.so
#7  0x00007ffff7f2bbcd in __cudart679 () from ../../tools/instr_count/instr_count.so
#8  0x00007ffff7ed9f0e in __cudart515 () from ../../tools/instr_count/instr_count.so
#9  0x00007ffff7ee6104 in __cudart1329 () from ../../tools/instr_count/instr_count.so
#10 0x00007ffff7bb9f68 in __pthread_once_slow (once_control=0x7ffff7fb9408 <__cudart2818>, 
    init_routine=0x7ffff7ee5e10 <__cudart1329>) at ./nptl/pthread_once.c:116
#11 0x00007ffff7f31979 in __cudart1608 () from ../../tools/instr_count/instr_count.so
#12 0x00007ffff7eda267 in __cudart513 () from ../../tools/instr_count/instr_count.so
#13 0x00007ffff7f2516f in cudaLaunchKernel () from ../../tools/instr_count/instr_count.so
#14 0x00007ffff7e78110 in __device_stub__Z24load_module_nvbit_kerneli(int) () from ../../tools/instr_count/instr_count.so
#15 0x00007ffff7e78197 in nvbit_at_context_init_hook () from ../../tools/instr_count/instr_count.so
#16 0x00007ffff7e846fa in Nvbit::create_ctx(CUctx_st*) () from ../../tools/instr_count/instr_count.so
#17 0x00007ffff7e8d9d8 in nvbitToolsCallbackFunc(void*, CUtools_cb_domain_enum, unsigned int, void const*) ()
   from ../../tools/instr_count/instr_count.so

agalup commented 1 year ago

+1