Open seanzhuh opened 4 months ago
I just managed to set breakpoints by using the following setup.py file and running DEBUG=1 python setup.py build develop.
I'm posting it in case it is helpful to anyone.
from setuptools import setup
from torch.utils.cpp_extension import CUDAExtension, BuildExtension
import os
os.path.dirname(os.path.abspath(__file__))

setup(
    name="diff_gaussian_rasterization",
    packages=['diff_gaussian_rasterization'],
    ext_modules=[
        CUDAExtension(
            name="diff_gaussian_rasterization._C",
            sources=[
                "cuda_rasterizer/rasterizer_impl.cu",
                "cuda_rasterizer/forward.cu",
                "cuda_rasterizer/backward.cu",
                "rasterize_points.cu",
                "ext.cpp"],
            # -O0 turns off optimisation; -G and -g emit device and host debug info.
            extra_compile_args={"nvcc": ["-O0", "-Xcompiler", "-fPIC", "-G", "-g",
                                         "-I" + os.path.join(os.path.dirname(os.path.abspath(__file__)), "third_party/glm/")],
                                'cxx': ["-g"]},
            extra_link_args=["-shared"]
        )
    ],
    cmdclass={
        'build_ext': BuildExtension
    }
)
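If you want one setup.py for both debug and normal builds, the flag lists above could be chosen from an environment variable. This is only a sketch: the thread does not say whether DEBUG=1 changes the build by itself, so here setup.py reads it explicitly. The helper names (is_debug_build, nvcc_args, cxx_args) and the non-debug flag list are my own assumptions, not part of the repository.

```python
import os


def is_debug_build(environ=os.environ):
    # Assumed convention: DEBUG=1 in the environment requests a debug build.
    return environ.get("DEBUG", "0") == "1"


def nvcc_args(glm_dir, debug):
    """Return the nvcc flag list for a debug or a normal build."""
    include = "-I" + glm_dir
    if debug:
        # Same flags as in the setup.py above: no optimisation,
        # device (-G) and host (-g) debug information.
        return ["-O0", "-Xcompiler", "-fPIC", "-G", "-g", include]
    # Hypothetical non-debug flags: only what is needed to compile.
    return ["-Xcompiler", "-fPIC", include]


def cxx_args(debug):
    """Return the host compiler flag list."""
    return ["-g"] if debug else []


if __name__ == "__main__":
    debug = is_debug_build()
    print(nvcc_args("third_party/glm/", debug), cxx_args(debug))
```

In setup.py the two lists would then be passed as extra_compile_args={"nvcc": nvcc_args(...), "cxx": cxx_args(...)}.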
Although setting breakpoints now succeeds, when I run os.system('python train.py') it seems to create a subprocess that cuda-gdb is not attached to. cuda-gdb shows 'detaching after vfork from child process 295097', whereas my python process is 294177.
In addition to this, in shell 1 you need to explicitly import diff_gaussian_rasterization so that cuda-gdb loads the generated .so files. Otherwise, cuda-gdb still cannot find them!
Do not use os.system() to execute the train.py script in shell 1. Instead, you should use exec(open("./train.py").read()).
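To see why this matters for an attached debugger, here is a small sketch comparing the PID a script runs under in each case. The script is a hypothetical stand-in for train.py that only records its PID, and the function names are made up for illustration. subprocess.run is used in place of os.system so the child's output can be captured; both start a separate process.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for train.py: it only records the PID it runs under.
SCRIPT = "import os\nseen_pid = os.getpid()\nprint(seen_pid)\n"


def pid_via_child(script_path):
    """Run the script in a new interpreter, i.e. in a separate process."""
    out = subprocess.run([sys.executable, script_path],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())


def pid_via_exec(script_path):
    """Run the script inside the current interpreter with exec()."""
    namespace = {}
    with open(script_path) as f:
        exec(f.read(), namespace)
    return namespace["seen_pid"]


def compare_pids():
    """Return (current PID, PID seen by child run, PID seen by exec run)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake_train.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        return os.getpid(), pid_via_child(path), pid_via_exec(path)
```

The exec() run reports the same PID as the shell that cuda-gdb is attached to, while the child run reports a different one, which matches the 'detaching after vfork' message above.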
Hello, thanks for the useful information. I'm struggling to debug CUDA submodules too and finally ended up here. My questions are:
- How can I pass command-line arguments if I use exec(open("./render.py").read()) in shell 1, where the Python shell is open?
- How can I set a breakpoint in the Python process running in shell 1 so that it pauses while I set breakpoints in cuda-gdb in shell 2?
1) I directly modify the default argument values in render.py instead of passing arguments through exec(open("./render.py").read()).
2) You don't need to set breakpoints in the shell that runs the Python script; it pauses automatically as long as you set the breakpoints in cuda-gdb beforehand. If you intend to debug the Python code, I suggest using pdb. Debugging both Python and CUDA together may take some time to explore, I guess.
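As an alternative to editing the defaults, one option is to replace sys.argv temporarily before calling exec(), since argparse reads sys.argv when parse_args() is called without arguments. A minimal sketch, with the assumptions spelled out: run_with_args is a helper name made up here, and FAKE_RENDER is a stand-in script with an illustrative --model_path option, not the real render.py.

```python
import sys


def run_with_args(script_source, argv, script_name="render.py"):
    """exec() script_source in this process with sys.argv temporarily replaced."""
    saved = sys.argv
    sys.argv = [script_name] + list(argv)
    namespace = {"__name__": "__main__"}
    try:
        exec(script_source, namespace)
    finally:
        # Restore the interactive shell's own argv whatever happens.
        sys.argv = saved
    return namespace


# Hypothetical stand-in for render.py that parses a single option.
FAKE_RENDER = '''
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--model_path", default="unset")
args = parser.parse_args()
'''
```

In shell 1 this would be called as run_with_args(open("./render.py").read(), [...]) with whatever arguments the script expects; the script still runs in the process cuda-gdb is attached to.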
Hey @seanzhuh, thanks for this detailed explanation. I tried your instructions and noticed I couldn't attach cuda-gdb to python:
cuda-gdb -p
Hello! I followed your commands, but my cuda-gdb cannot find the generated .so.
Specifically, I wrote a simple debug.py script that contains only the render step, with all the needed arguments saved beforehand for debugging.
Here is my debug.py:
from gaussian_renderer import render_debug
import diff_gaussian_rasterization

path = './debug.pth'
while True:
    results = render_debug(path)
    print('x')
render_debug is a rewritten function in gaussian_renderer:
import torch
import math
from typing import Union
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer

def render_debug(path):
    debug_dict = torch.load(path)
    # Create zero tensor. We will use it to make pytorch return gradients of the 2D (screen-space) means
    screenspace_points = torch.zeros_like(debug_dict['means3D'], dtype=debug_dict['means3D'].dtype, requires_grad=True, device="cuda") + 0
    try:
        screenspace_points.retain_grad()
    except:
        pass

    raster_settings = GaussianRasterizationSettings(
        image_height=debug_dict['image_height'],
        image_width=debug_dict['image_width'],
        tanfovx=debug_dict['tanfovx'],
        tanfovy=debug_dict['tanfovy'],
        bg=debug_dict['bg'],
        scale_modifier=debug_dict['scale_modifier'],
        viewmatrix=debug_dict['viewmatrix'],
        projmatrix=debug_dict['projmatrix'],
        sh_degree=debug_dict['sh_degree'],
        campos=debug_dict['campos'],
        prefiltered=False,
        debug=False
    )

    rasterizer = GaussianRasterizer(raster_settings=raster_settings)
    means2D = screenspace_points

    # Rasterize visible Gaussians to image, obtain their radii (on screen).
    rendered_image, radii, rendered_depth, rendered_alpha = rasterizer(
        means3D = debug_dict['means3D'],
        means2D = means2D,
        shs = debug_dict['shs'],
        colors_precomp = debug_dict['colors_precomp'],
        opacities = debug_dict['opacities'],
        scales = debug_dict['scales'],
        rotations = debug_dict['rotations'],
        cov3D_precomp = debug_dict['cov3D_precomp'])

    # Those Gaussians that were frustum culled or had a radius of 0 were not visible.
    # They will be excluded from value updates used in the splitting criteria.
    return {"render": rendered_image,
            "viewspace_points": screenspace_points,
            "visibility_filter" : radii > 0,
            "radii": radii}
Here is what I do in the two shells:
1. Run import diff_gaussian_rasterization and exec(open("./debug.py").read()) in shell 1.
2. Run ps -aux | grep python to find the PID, then cuda-gdb -p <pid>, in shell 2.
3. Run break forward.cu:497 in shell 2, and still get:
(cuda-gdb) break forward.cu:497
No symbol table is loaded. Use the "file" command.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (forward.cu:497) pending.
It seems that my cuda-gdb session is not aligned with the Python process. What is wrong? How can I check this?
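One way to check, on Linux, whether the PID passed to cuda-gdb belongs to a process that has actually loaded the extension is to look at /proc/<pid>/maps for the compiled module. The sketch below assumes the extension's file name contains the string you search for (for example "diff_gaussian_rasterization"); the function names are hypothetical. Inside the debugger, gdb's info sharedlibrary command lists similar information.

```python
def shared_objects_in_maps(maps_text, needle):
    """Return the distinct mapped file paths in maps_text that contain needle."""
    found = []
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in path and path not in found:
                found.append(path)
    return found


def shared_objects_for_pid(pid, needle):
    """Linux only: list files mapped into process pid whose path contains needle."""
    with open(f"/proc/{pid}/maps") as f:
        return shared_objects_in_maps(f.read(), needle)
```

If shared_objects_for_pid(<pid>, "diff_gaussian_rasterization") returns an empty list for the PID you attached to, that process has not loaded the extension, so it is probably not the shell in which you ran the import.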
Hi, thanks for your wonderful work and neat cuda implementation with clear file structures.
A good starting point for learning 3D Gaussian splatting is to step into every piece of code and see what it actually does. Unfortunately, Python's pdb does not support debugging CUDA/C++ files. I've searched the internet and found a seemingly viable solution.
The solution is to open two terminals, shell 1 and shell 2. In shell 1, enter the Python shell; then in shell 2, run ps -aux | grep python to find the PID and run cuda-gdb -p PID to attach to the Python process in shell 1. Breakpoints can then be set via break forward.cu:429. After setting breakpoints, run continue in shell 2. Then in shell 1, run import os and os.system('python train.py').
The above pipeline works for projects written in pure CUDA, where the whole project can be compiled using, for instance,
nvcc -Xcompiler -fPIC -std=c++11 -shared -arch=sm_60 -G -g -o t383.so t383.cu -DFIX
However, 3DGS is a mix of Python, C++, and CUDA, and I didn't know how to specify these arguments in the setup.py or CMakeLists.txt files. As a result, when I set breakpoints using break forward.cu:497, cuda-gdb complains that it cannot find the source files, which I guess is a problem in the linking stage? Can someone help with this?
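For the attach step, ps -aux | grep python can list several Python processes, so it may be easier to print the PID from inside the Python shell itself. A tiny sketch; attach_command is a made-up helper that only formats the cuda-gdb -p command mentioned above.

```python
import os


def attach_command(pid=None):
    """Return the cuda-gdb attach command for pid (default: this process)."""
    pid = os.getpid() if pid is None else pid
    return f"cuda-gdb -p {pid}"


# Run this in the Python shell, then paste the printed command into the other shell.
print(attach_command())
```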