crozhon closed this issue 3 years ago
We need a bit more information to see what is going on. Which GPU is it? Can you point us to the exact version of the application you are running? Thanks.
Thanks for responding so quickly. This is with an RTX2080Ti, SM7.5.
The application is pbrt-v4 from the latest master (ea9e5fdef6), which is available here on GitHub and is pretty easy to build. All you need to build it is OptiX, and it uses CMake. It's a ray tracer that's been adapted from CPU code, so some of the kernels look a bit nasty.
I was able to isolate a specific set of kernels as the issue. When you comment out the contents of EvaluateMaterialAndBSDF in src/pbrt/gpu/surfscatter.cpp (https://github.com/mmp/pbrt-v4/blob/master/src/pbrt/gpu/surfscatter.cpp), the problem goes away and I'm able to instrument the application as expected. So it seems related to the lambda defined in that function. I can try to come up with a smaller self-contained example if this isn't enough to go on.
I was able to reproduce on my side and I will try to work on it this week. Thanks for pointing this out!
The issue should be resolved in NVBit version 1.5 (just released). Please let us know if it works for you. Thanks again for reporting.
Forgot to comment, but this worked perfectly. Thanks so much for your effort on it.
I'm trying to use NVBit to profile an application. I get a segmentation fault after the first call to cudaMemcpyToSymbol. It seems that nvbit_at_init() and nvbit_at_cuda_event() are being called before the crash. I also tried CUDA_INJECTION64_PATH instead of LD_PRELOAD.
Here's a stack trace from cuda-gdb. It seems there's a recursive loop of sorts in compute_max_stack_size? Any ideas why this might be?
System configuration: nvcc release 11.0, V11.0.194; driver version 450.57; CUDA version 11.0.
I confirmed everything works properly with the vectoradd example, so I don't think it's an issue with my system configuration. Does anyone have any insight into what's going on here?
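For anyone reproducing this, the two injection methods mentioned above (LD_PRELOAD and CUDA_INJECTION64_PATH) can be sketched roughly as follows. This is only a sketch: the tool and application paths are assumptions based on the layout of a typical NVBit checkout (the instr_count tool and the vectoradd test app), and the helper function names are hypothetical, so adjust them to your setup.

```shell
# Assumed paths: point these at your own NVBit tool and target application.
NVBIT_TOOL="${NVBIT_TOOL:-./tools/instr_count/instr_count.so}"
APP="${APP:-./test-apps/vectoradd/vectoradd}"

# Hypothetical helper: inject the tool through the dynamic loader.
run_with_preload() {
    LD_PRELOAD="$NVBIT_TOOL" "$@"
}

# Hypothetical helper: inject the tool through the CUDA driver's injection hook.
run_with_injection() {
    CUDA_INJECTION64_PATH="$NVBIT_TOOL" "$@"
}

# Usage (uncomment one):
# run_with_preload "$APP"
# run_with_injection "$APP"
```

Running the known-good vectoradd example with both helpers first, and then swapping in the failing application, helps separate a system configuration problem from an application-specific one.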