NVlabs / NVBit


Tool disabled by Python multiprocessing #77

Open damtharvey opened 2 years ago

damtharvey commented 2 years ago

I have been trying to use NVBit on CUDA kernels launched from Python, but the tool does not seem to instrument anything when Python uses multiprocessing. I also tried CUDA_INJECTION64_PATH instead of LD_PRELOAD, and the tool still appears to be disabled. Is there a way around this?

ovilla commented 2 years ago

Hi, can you please provide a minimal example that reproduces this? I am thinking of a single Python file that uses multiprocessing and calls a CUDA kernel containing just a printf hello world (a few hundred lines at most). I am not sure we will find a solution, but a small example would let us see what we can do. We have heard about similar issues before, but no one was able to provide a minimal example; the reports always pointed to large ML frameworks, which are very complex to set up and debug for issues like this. Thanks!
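As a possible first step toward such a reproducer, before any CUDA code is involved, the sketch below checks whether a multiprocessing worker actually sees the two variables mentioned above (LD_PRELOAD and CUDA_INJECTION64_PATH). It is only a diagnostic sketch, not part of NVBit: the helper names `report_env` and `check_worker_env` are made up for illustration, and it assumes the tool is loaded through one of those environment variables.

```python
import multiprocessing as mp
import os
from queue import Empty

# Environment variables mentioned in this thread for loading the tool.
INJECTION_VARS = ("LD_PRELOAD", "CUDA_INJECTION64_PATH")


def report_env(result_queue):
    """Runs in the worker: report its pid and the injection variables it sees."""
    seen = {name: os.environ.get(name) for name in INJECTION_VARS}
    result_queue.put((os.getpid(), seen))


def check_worker_env(start_method, timeout=10):
    """Start one worker with the given start method and return its report.

    Returns None if the worker does not report within `timeout` seconds.
    """
    ctx = mp.get_context(start_method)
    result_queue = ctx.Queue()
    proc = ctx.Process(target=report_env, args=(result_queue,))
    proc.start()
    try:
        result = result_queue.get(timeout=timeout)
    except Empty:
        result = None
    proc.join()
    return result


if __name__ == "__main__":
    print(f"parent pid={os.getpid()}")
    # Try every start method the platform offers (fork, spawn, forkserver).
    for method in mp.get_all_start_methods():
        print(f"{method}: {check_worker_env(method)}")
```

Saved under a hypothetical name such as `check_env.py`, it could be run as `LD_PRELOAD=<path to the tool .so> python check_env.py`. If a worker reports `None` for the variable that was set in the parent, something in the launch chain dropped it before the worker started; if every worker reports the path, the environment is probably not the cause and a kernel launch can be added to the worker next.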

damtharvey commented 2 years ago

Actually, I think it works. I'm not sure what happened before. I'll update you if I find out.