minghanqin / LangSplat

Official implementation of the paper "LangSplat: 3D Language Gaussian Splatting" [CVPR2024 Highlight]
https://langsplat.github.io/

Setting include_feature=False leads to an illegal memory access #17

Closed: KzZheng closed this issue 5 months ago

KzZheng commented 7 months ago

Hi,

I tried setting include_feature=False to run the original 3DGS training, but I hit an error after a few hundred iterations (around iteration 700 in the log below):

Training progress:   2%|█▋                                                                      | 700/30000 [00:46<29:56, 16.31it/s, Loss=0.0869420]
[CUDA ERROR] in cuda_rasterizer/rasterizer_impl.cu
Line 415: an illegal memory access was encountered [14/02 23:44:58]
An error occured in backward. Writing snapshot_bw.dump for debugging. [14/02 23:44:58]
 [14/02 23:44:58]
Traceback (most recent call last):
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/kaizhi/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "train.py", line 231, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "train.py", line 104, in training
    loss.backward()
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 147, in backward
    raise ex
  File "/home/kaizhi/.conda/envs/langsplat/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 143, in backward
    grad_means2D, grad_colors_precomp, grad_language_feature_precomp, grad_opacities, grad_means3D, grad_cov3Ds_precomp, grad_sh, grad_scales, grad_rotations = _C.rasterize_gaussians_backward(*args)
RuntimeError: an illegal memory access was encountered

I can successfully train with the original 3DGS code.

zqh0253 commented 7 months ago

I also encountered this issue. Have you solved it now?

xuqinwang commented 7 months ago

I suspect the problem is in the modified cuda_rasterizer/backward.cu:495. When include_feature=False, language_feature has shape (1, 0), but this line accesses it as if it had the full language_feature shape (N_gaussian, dim_feature). A (1, 0) buffer holds no elements, so the access reads out of bounds.
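
The shape mismatch described above can be illustrated with a small host-side check. This is only a sketch under stated assumptions: `check_feature_buffer` and its parameters are hypothetical names that do not exist in LangSplat or the rasterizer, and the sizes (1000 Gaussians, feature dimension 3) are made up for illustration. It compares the number of elements a kernel indexing (N_gaussian, dim_feature) would read against what a buffer of a given shape actually holds.

```python
from math import prod


def check_feature_buffer(shape, num_gaussians, feature_dim):
    """Hypothetical guard: return True if a buffer with `shape` holds enough
    elements for a kernel that reads num_gaussians * feature_dim values."""
    available = prod(shape)  # a (1, 0) buffer holds 0 elements
    needed = num_gaussians * feature_dim
    return available >= needed


# Illustrative sizes only; not taken from the issue.
n, d = 1000, 3
full_ok = check_feature_buffer((n, d), n, d)   # features enabled
empty_ok = check_feature_buffer((1, 0), n, d)  # the include_feature=False case
print(full_ok, empty_ok)
```

A check of this kind would fail for the (1, 0) placeholder, which is consistent with the backward kernel reading memory it does not own.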

KzZheng commented 7 months ago

I ended up just using the original 3DGS code to get the original point cloud.

nhat-vo commented 6 months ago

@KzZheng @zqh0253 @xuqinwang. Have a look at https://github.com/minghanqin/langsplat-rasterization/pull/1 to see if it fixes your issue.

Another hacky workaround I found while looking for the bug is to override https://github.com/minghanqin/LangSplat/blob/850e8b94bddacb0ad1d8173cc43793a0f0338112/gaussian_renderer/__init__.py#L50 to include_feature=True, and https://github.com/minghanqin/LangSplat/blob/850e8b94bddacb0ad1d8173cc43793a0f0338112/gaussian_renderer/__init__.py#L91 to language_feature_precomp = colors_precomp * 0.0 (with the --convert_cov3D_python flag).
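
The idea behind this workaround, as I understand it, is to hand the rasterizer a zero-filled feature buffer with the same shape as the colors, so the kernel always has valid memory to read. The sketch below shows only that shape-preserving zeroing. Nested Python lists stand in for the real tensors so it runs without dependencies, `zeros_like_rows` is a hypothetical helper, and the color values are invented. In the renderer itself the expression is simply `colors_precomp * 0.0`.

```python
def zeros_like_rows(rows):
    """Hypothetical stand-in for `tensor * 0.0`: same shape, all zeros."""
    return [[value * 0.0 for value in row] for row in rows]


# Made-up per-Gaussian RGB values, one row per Gaussian.
colors_precomp = [[0.2, 0.5, 0.9], [0.1, 0.1, 0.3]]

# Same number of rows and columns as the colors, but every entry is zero.
language_feature_precomp = zeros_like_rows(colors_precomp)
print(language_feature_precomp)
```

Because the placeholder has one row per Gaussian, indexing it by Gaussian index stays in bounds, unlike the (1, 0) buffer discussed earlier in the thread.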

minghanqin commented 5 months ago

Thanks a lot! @nhat-vo I have merged your modifications into the original code.

beautifulchoi commented 5 months ago

> @KzZheng @zqh0253 @xuqinwang. Have a look at minghanqin/langsplat-rasterization#1 to see if it fixes your issue.
>
> Another hacky workaround I found while looking for the bug is to override https://github.com/minghanqin/LangSplat/blob/850e8b94bddacb0ad1d8173cc43793a0f0338112/gaussian_renderer/__init__.py#L50 to include_feature=True, and https://github.com/minghanqin/LangSplat/blob/850e8b94bddacb0ad1d8173cc43793a0f0338112/gaussian_renderer/__init__.py#L91 to language_feature_precomp = colors_precomp * 0.0 (with the --convert_cov3D_python flag).

Thank you so much!

garrisonz commented 4 months ago

> Thanks a lot! @nhat-vo I have merged your modifications into the original code.

LangSplat/submodules/langsplat-rasterization seems to be pinned to 329f8e8. It would be nice if git clone --recursive fetched the latest submodule commit, which includes this fix.