daniilidis-group / neural_renderer

A PyTorch port of the Neural 3D Mesh Renderer

Cannot step into the CUDA kernel. Asking for help on how to debug. #65

Open maphysart opened 5 years ago

maphysart commented 5 years ago

Thanks for your library. It is cool. I saw in https://github.com/daniilidis-group/neural_renderer/issues/36 that the library can work with PyTorch 1.1: "Latest release (1.1.3) works ok for me with torch 1.1.0 and CUDA 10.0 on ubuntu 16.04."

I tried to test it on my PC but failed at a simple test. Any suggestions are much appreciated.

Following this blog post, I first tried to debug a simple CUDA C++ extension:
https://chrischoy.github.io/research/pytorch-extension-with-makefile/ The source code is here: https://github.com/chrischoy/MakePytorchPlusPlus/

But I cannot step into the CUDA kernel. My setup is gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 and PyTorch 1.1 with CUDA 10.0 in an Anaconda environment.

The error output is:

    Reading symbols from /usr/lib/x86_64-linux-gnu/libcuda.so.1...(no debugging symbols found)...done.
    Reading symbols from /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.410.104...(no debugging symbols found)...done.
    0x00007ffcb77d9b62 in clock_gettime ()
    $1 = 86834896
    cuda-gdb/7.12/gdb/block.c:456: internal-error: set_block_compunit_symtab: Assertion `gb->compunit_symtab == NULL' failed.
    A problem internal to GDB has been detected, further debugging may prove unreliable.
    Quit this debugging session? (y or n) y

I wonder how you debug the neural renderer code. I want to add a per-pixel normal shading feature, which is why I need to debug the code. Thanks a lot.
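For context, stepping into a kernel with cuda-gdb generally needs device debug information, which nvcc emits with `-G` (and `-g` for host code). The sketch below only assembles such a command line; the helper name `nvcc_debug_command` and the file names are hypothetical, and it is not the build used by neural_renderer or by the blog post. When building through `torch.utils.cpp_extension.load`, the same flags would presumably go into its `extra_cuda_cflags` argument.

```python
import shlex


def nvcc_debug_command(source, output):
    """Return an nvcc command line with host (-g) and device (-G) debug info.

    Hypothetical helper for illustration; it only builds the argument list
    and does not run the compiler.
    """
    return ["nvcc", "-g", "-G", "-o", output, source]


# Example with placeholder file names (assumptions, not from the thread).
cmd = nvcc_debug_command("kernel.cu", "kernel_debug")
print(" ".join(shlex.quote(part) for part in cmd))
# The resulting binary could then be opened with: cuda-gdb ./kernel_debug
```

Note that debug flags alone would not avoid the internal assertion shown above, which comes from cuda-gdb itself.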

andyljones commented 5 years ago

I don't think this problem is specific to this library.

(Got here from Googling that error - I don't have anything to do with the dev team for this project)

andyljones commented 5 years ago

This is fixed in CUDA toolkit 10.1.243.
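A quick way to check whether an installed toolkit is at least that version is to compare the version string reported by `nvcc --version`. This is a minimal sketch: the helper names `parse_cuda_version` and `has_fix` are made up for illustration, and the only fact taken from this thread is the 10.1.243 threshold.

```python
import re

# Version named in the comment above as containing the fix.
FIXED_IN = (10, 1, 243)


def parse_cuda_version(text):
    """Extract the first three-part version (e.g. V10.0.130) from text."""
    match = re.search(r"V?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no three-part version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def has_fix(text):
    """Return True if the version in text is at least FIXED_IN."""
    return parse_cuda_version(text) >= FIXED_IN


# Typical last line of `nvcc --version` for a CUDA 10.0 install.
line = "Cuda compilation tools, release 10.0, V10.0.130"
print(parse_cuda_version(line), has_fix(line))
```

Tuples compare element by element, so no manual major/minor/patch logic is needed.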