Open dbeckwith opened 2 years ago
CUDA debug info has been... weird, really weird. This is what I have figured out so far:
- `-generate-line-info` also yields a segfault.
- `-generate-line-info` works (it makes debug info in the PTX), but it causes libnvvm to not optimize the program (what???)
Moreover, cuda-gdb does not work on Windows. I tried using CUDA's VS debugger, but it seems to be CUDA C++ only; I found no way of running it on an arbitrary executable (someone might be able to figure this out).
So overall, it's not well supported currently, since debug info is kind of low on the priority list. One thing I need to finish first is DCE (dead code elimination), because debug info generates a glorious amount of PTX code, and it gets really big really quickly.
Update: I was able to fix the segfaults during debug info generation. Now the only issue is that, for some reason, kernels are much slower with debug info enabled (even though the code is optimized), 600 ms vs 30 ms for example. For some reason CUDA uses 200 registers with debug info but 96 without, which is probably what makes it so slow. I will open a forum post or ask someone at NVIDIA about this.
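To get a feel for why 200 vs 96 registers per thread could matter, here is a rough back-of-the-envelope sketch of how many warps fit on one SM when registers are the only limit. The numbers are assumptions, not measurements from this project: 65,536 32-bit registers per SM and a cap of 64 resident warps are typical for many NVIDIA GPUs, but both vary by architecture. The function name is made up for illustration, and the model ignores register allocation granularity, shared memory, and block-size limits.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def max_resident_warps(regs_per_thread, regs_per_sm=65536, warp_limit=64):
    """Upper bound on resident warps per SM imposed by register usage alone.

    regs_per_sm and warp_limit are assumed defaults; check your device.
    """
    regs_per_warp = regs_per_thread * WARP_SIZE
    return min(warp_limit, regs_per_sm // regs_per_warp)


if __name__ == "__main__":
    for regs in (96, 200):
        print(f"{regs} regs/thread -> {max_resident_warps(regs)} warps/SM")
```

Under these assumptions the debug build fits roughly half as many warps per SM, which leaves the scheduler less room to hide latency. Whether that accounts for the whole slowdown is a separate question.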
Just wanted to say thank you for working on the debugger! Whenever I learn anything new related to programming, the debugger is like my safety blanket lol. Because no matter how magical/crazy a piece of code looks, I know that as long as I can step through the code and see the values of my variables change in real-time, I can figure anything out.
Yea, I currently have a kernel which is exhibiting some serious warp stalling, and I'm not really sure where this is in the actual kernel source.
Any tips/tricks folks might recommend to try to pin this down pre-debugger support?
UPDATE: Disregard, I found the issue by way of deduction. Some foolish memory access patterns.
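The post above does not say which access pattern was at fault. One common culprit for this kind of stall is uncoalesced global memory access, where neighbouring threads in a warp read addresses that are far apart. The sketch below is a simplified model, assuming 4-byte elements, a 32-thread warp, an aligned base address, and 32-byte memory sectors. It counts how many sectors a single warp-wide load touches for a given stride. The function name is hypothetical.

```python
def sectors_touched(stride_elems, elem_bytes=4, warp_size=32, sector_bytes=32):
    """Count distinct memory sectors hit when lane i reads element i * stride_elems.

    Simplified model: base address 0, one load per lane, no caching effects.
    """
    addrs = (lane * stride_elems * elem_bytes for lane in range(warp_size))
    return len({addr // sector_bytes for addr in addrs})


if __name__ == "__main__":
    for stride in (1, 2, 8, 32):
        print(f"stride {stride}: {sectors_touched(stride)} sectors per warp load")
```

In this model a unit stride needs the fewest sectors for a full warp, and the count grows with the stride until every lane lands in its own sector, so the same amount of useful data costs far more memory traffic.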
Super excited to try out this project, I've just been reading through the Guide so far. I couldn't find any pages on debugging GPU kernels at runtime (there's a page on debugging the codegen but not the live kernel itself). I think it would be great if debugging was mentioned somewhere, at least to say if it isn't well-supported yet or maybe link to some external resources on CUDA debugging.