CUDA error when training

camtrik commented 6 months ago

Thank you for your nice work!

I got a CUDA error when trying to train this model. Training vanilla Gaussian Splatting and some other Gaussian Splatting models works fine for me, so I wonder what is going on here:

 python train.py  --source_path dataset/nerf_llff_data/horns --model_path output/horns --eval  --use_color --n_views 3
Using cache found in /home/haitian/.cache/torch/hub/intel-isl_MiDaS_master
/home/haitian/anaconda3/envs/FSGS/lib/python3.8/site-packages/timm/models/_factory.py:117: UserWarning: Mapping deprecated model name vit_base_resnet50_384 to current vit_base_r50_s16_384.orig_in21k_ft_in1k.
  model = create_fn(
Using cache found in /home/haitian/.cache/torch/hub/intel-isl_MiDaS_master
[500, 1000, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000]
Optimizing output/horns
Output folder: output/horns [12/12 15:37:55]
Tensorboard not available: not logging progress [12/12 15:37:55]
Reading camera 62/62 [12/12 15:37:56]
6.323975610733033 cameras_extent [12/12 15:37:56]
Loading Training Cameras [12/12 15:37:56]
3it [00:02,  1.16it/s]
Loading Test Cameras [12/12 15:37:58]
8it [00:00,  9.50it/s]
Number of points at initialisation :  0 [12/12 15:38:17]
Training progress:   0%|                                                                                                                                              | 0/10000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 281, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args)
  File "train.py", line 90, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background)
  File "/home/haitian/work/NeRF/FSGS/gaussian_renderer/__init__.py", line 94, in render
    rendered_image, radii, depth, alpha = rasterizer(
  File "/home/haitian/anaconda3/envs/FSGS/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/haitian/anaconda3/envs/FSGS/lib/python3.8/site-packages/diff_gaussian_rasterization/__init__.py", line 215, in forward
    return rasterize_gaussians(
  File "/home/haitian/anaconda3/envs/FSGS/lib/python3.8/site-packages/diff_gaussian_rasterization/__init__.py", line 32, in rasterize_gaussians
    return _RasterizeGaussians.apply(
  File "/home/haitian/anaconda3/envs/FSGS/lib/python3.8/site-packages/diff_gaussian_rasterization/__init__.py", line 92, in forward
    num_rendered, color, depth, alpha, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
RuntimeError: CUDA error: invalid configuration argument

zehaozhu commented 6 months ago

Hi, thanks for your interest in our work.

You need to install our Gaussian rasterizer, available at https://github.com/VITA-Group/FSGS/tree/main/submodules/diff-gaussian-rasterization-confidence, which differs from the original rasterizer.

camtrik commented 6 months ago

@zehaozhu Thank you! I believe I have already installed it using the script provided here (the original one doesn't work for me), but the problem still occurs.

zehaozhu commented 6 months ago

The reason is that your initial point cloud is empty (your log shows "Number of points at initialisation :  0"). I have uploaded a new version of the code; please try again with it.
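
For context, here is a minimal sketch of why an empty point cloud can surface as "invalid configuration argument". It assumes the rasterizer launches one thread per Gaussian in fixed-size blocks; the block size of 256 and both helper names are illustrative assumptions, not code from this repository. With zero points the computed grid has zero blocks, and CUDA rejects a kernel launch whose grid dimension is zero.

```python
BLOCK_SIZE = 256  # assumed threads per block; illustrative only


def num_blocks(num_points: int, block_size: int = BLOCK_SIZE) -> int:
    """Ceiling division: blocks needed to give each point one thread."""
    return (num_points + block_size - 1) // block_size


def check_point_count(num_points: int) -> int:
    """Hypothetical guard: fail early with a readable message
    instead of letting the CUDA launch fail later."""
    if num_points <= 0:
        raise ValueError(
            "Initial point cloud is empty; check that the input .ply file was generated."
        )
    return num_blocks(num_points)


print(num_blocks(0))     # 0 blocks: an empty launch grid
print(num_blocks(1))     # 1
print(num_blocks(1000))  # 4
```

A guard like check_point_count, called right after the point cloud is loaded, would turn the opaque CUDA error into a message that points at the missing input.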

zehaozhu commented 6 months ago

You can check whether this file exists. If it does not, you may preprocess the dataset following this.
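
One way to run that check is sketched below, using only the Python standard library. It reads the vertex count declared in the PLY header, so it also catches a file that exists but contains no points. The function name and the path in the usage note are hypothetical placeholders; substitute the point cloud path your dataset actually uses.

```python
import os


def ply_vertex_count(path: str) -> int:
    """Return the vertex count declared in a PLY header.

    Returns 0 if the file is missing or declares no vertex element.
    """
    if not os.path.isfile(path):
        return 0
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                return int(line.split()[2])
            if line == b"end_header":
                break  # stop before the (possibly binary) body
    return 0
```

For example, ply_vertex_count("path/to/your/point_cloud.ply") returning 0 would mean the file is missing or empty, and the dataset needs to be preprocessed again before training.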

camtrik commented 6 months ago

Thanks for your help! I think something may be wrong with my system or CUDA setup, because I cannot generate a proper ply file using that file. I will try to fix it later.

ERROR: Dense stereo reconstruction requires CUDA, which is not available on your system.

zehaozhu commented 6 months ago

Hi @camtrik

We have released both the sparse and dense point clouds for the Mip-NeRF 360 and LLFF datasets at this link. You may use the dense point cloud for training.

If you still want to run COLMAP on your own data, you may follow the updated instructions. We provide a Docker option so that you do not need to install COLMAP yourself (CUDA is still required).

Zehao

VITA-Group / FSGS

CUDA error when training #9