nicknytko / numml

What tool are you using to debug pytorch #14

Closed HongLouyemeng closed 7 months ago

HongLouyemeng commented 7 months ago

How did you debug the .cpp and .py files? With VSCode, GDB, or some other tool? QAQ

nicknytko commented 7 months ago

Hi! I've mostly been sticking to debugging the C++ code with GDB and the CUDA debugger. Writing tests has also been helpful to make sure everything is working properly. Are you running into issues trying to run/install?

HongLouyemeng commented 7 months ago

Yes, I built PyTorch with debug=1 and am using the Python C++ Debugger to debug torch.distributed.allreduce and the allreduce in ProcessGroupGloo.cpp, but the debugger can't get past dist.allreduce() into the breakpoints in ProcessGroupGloo.cpp. If you could give some advice, that would be great! And thanks for the confirmation and the info on the CUDA debugger!

nicknytko commented 7 months ago

Does it work if you try running from the command line with gdb? Something like gdb --args python3 script.py, then set the breakpoint normally. I don't have too much experience with using VSCode, unfortunately.
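
For anyone wanting to try the command-line route, here is a minimal sketch of how that gdb invocation could be assembled. The helper name build_gdb_command and the breakpoint location c10d::ProcessGroupGloo::allreduce are assumptions for illustration, not something confirmed in this thread; substitute whatever file:line or symbol you actually want to stop at. It adds "set breakpoint pending on" because the PyTorch shared libraries are only loaded once Python imports torch, so gdb may not know the symbol when the breakpoint is first set.

```python
import shlex


def build_gdb_command(script, breakpoint, python="python3"):
    """Build a gdb command line that runs a Python script under gdb.

    `breakpoint` is any location gdb's `break` accepts, for example
    "ProcessGroupGloo.cpp:123" or a function name. (Hypothetical helper,
    not part of numml or PyTorch.)
    """
    return [
        "gdb",
        # Keep the breakpoint pending until the shared library
        # that defines it has been loaded.
        "-ex", "set breakpoint pending on",
        "-ex", f"break {breakpoint}",
        # Start the program right away.
        "-ex", "run",
        # Everything after --args is the program and its arguments.
        "--args", python, script,
    ]


# Assumed breakpoint location; adjust to match your build.
cmd = build_gdb_command("script.py", "c10d::ProcessGroupGloo::allreduce")

# Print a shell-ready command line to copy into a terminal.
print(shlex.join(cmd))
```

Running the printed command in a terminal should start the script under gdb and stop at the breakpoint if the debug symbols are present in the build.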

HongLouyemeng commented 7 months ago

I haven't tried this before, thank you for sharing your experience OVO

nicknytko commented 7 months ago

Sure -- I'm closing this issue but feel free to continue following up in a discussion (https://github.com/nicknytko/numml/discussions/landing)