Closed HongLouyemeng closed 7 months ago
Hi! I've mostly been sticking to debugging the C++ code with GDB and the CUDA debugger. Writing tests has also been helpful to make sure everything is working properly. Are you running into issues trying to run/install?
Yes. I built PyTorch with debug=1 and am using the Python C++ Debugger to step from torch.distributed.allreduce into the allreduce in ProcessGroupGloo.cpp, but the debugger can't get past dist.allreduce() to reach the breakpoints in ProcessGroupGloo.cpp. If you could give some advice, that would be great! And thanks for the confirmation and the info on the CUDA debugger!
Does it work if you try running from the command line with gdb? Something like
gdb --args python3 script.py
and then set the breakpoint normally. I don't have too much experience with using VSCode, unfortunately.
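For anyone following along, here is a minimal sketch of how that gdb invocation could be assembled with the breakpoint set up front. It is only an illustration: the helper name build_gdb_argv is hypothetical, and the breakpoint location c10d::ProcessGroupGloo::allreduce is an assumption about the symbol name in a debug build of PyTorch. "set breakpoint pending on" is included because the shared library containing the C++ code is usually not loaded yet when gdb starts, so the breakpoint has to stay pending until Python imports it.

```python
import shlex
import sys


def build_gdb_argv(script, breakpoints, python=sys.executable):
    """Build a gdb command line that runs a Python script under gdb.

    Hypothetical helper for illustration; it only assembles arguments
    and does not launch gdb itself.
    """
    # Allow breakpoints on symbols from libraries that are loaded later.
    argv = ["gdb", "-ex", "set breakpoint pending on"]
    for location in breakpoints:
        argv += ["-ex", f"break {location}"]
    # Start the program once breakpoints are registered; everything
    # after --args is the program to debug and its arguments.
    argv += ["-ex", "run", "--args", python, script]
    return argv


if __name__ == "__main__":
    # Assumed symbol name; adjust to a file:line or function in your build.
    argv = build_gdb_argv("script.py", ["c10d::ProcessGroupGloo::allreduce"])
    print(shlex.join(argv))
```

Running the printed command in a terminal should stop at the breakpoint once the script reaches the C++ call, provided the build actually contains debug symbols for that function.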
I haven't tried this before, thank you for sharing your experience OVO
Sure -- I'm closing this issue but feel free to continue following up in a discussion (https://github.com/nicknytko/numml/discussions/landing)
How did you debug the .cpp and .py files? With VSCode, gdb, or some other tool? QAQ