Closed SYSUykLin closed 7 months ago
As far as I understand, in the rasterization process they use shared memory for accumulating the collected features/colors and for the gradient calculation. Shared memory is limited per GPU. In this paper, they dynamically allocate a CUDA array as a cache for the collected features to avoid the shared-memory limit (of course, this trades the shared-memory ceiling for the higher feature dimension). You can see the implementation here: https://github.com/ShijieZhou-UCLA/feature-3dgs/blob/9e714ff841dd90b2d3e9d29bf3d1a74b90d414a7/submodules/diff-gaussian-rasterization/cuda_rasterizer/rasterizer_impl.cu#L398 If I misunderstand, please correct me.
https://github.com/graphdeco-inria/gaussian-splatting/issues/41#issuecomment-1805179311 You can try this: adding the "-Xcompiler -fno-gnu-unique" option at line 29 of submodules/diff-gaussian-rasterization/setup.py resolves the illegal memory access error during training:
extra_compile_args={"nvcc": ["-Xcompiler", "-fno-gnu-unique", "-I" + os.path.join(os.path.dirname(os.path.abspath(__file__)), "third_party/glm/")]})
Thanks
Hello: I previously attempted to render 256-dimensional features, but CUDA reported insufficient shared memory; at most about 40 dimensions could be rendered. May I ask what changes you made to enable rendering 256 dimensions?