Running training.py with mipnerf360/garden

simonobi commented 3 months ago

Hello, I am new to photogrammetry and am studying the relevant research papers along with the associated source code. With the latest version of your code, I tried the small mipnerf360/garden dataset generated using COLMAP. The dataset files trip a check in dataset_readers.py, line 138: the intr.model parameter I have is "OPENCV". From what I read, this corresponds to the PINHOLE model, so I changed the if statement to accept this model type and could proceed. Unfortunately, the output of the command was:

train.py -s DATASET/garden_images_8/colmap/ -m DATASET/garden_images_8/output

Optimizing DATASET/garden_images_8/output
Output folder: DATASET/garden_images_8/output [03/08 22:31:02]
Reading camera 1/185
Backend TkAgg is interactive backend. Turning interactive mode on. [03/08 22:31:03]
Reading camera 185/185 [03/08 22:31:11]
cameras extent: 4.935197496414185 [03/08 22:31:11]
Converting point3d.bin to .ply, will happen only the first time you open the scene. [03/08 22:31:11]
Loading Training Cameras: 185 . [03/08 22:31:14]
Loading Test Cameras: 0 . [03/08 22:31:14]
Number of points at initialisation : 42926 [03/08 22:31:14]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
You are running using the stub version of nvrtc .
You are running using the stub version of nvrtc.

Any pointers? Can you maybe provide a basic working training dataset?

Thank you.
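For readers making the same camera-model change described above, here is a minimal sketch of how the field of view might be derived per COLMAP model. In COLMAP, OPENCV parameters are ordered fx, fy, cx, cy, k1, k2, p1, p2, so treating OPENCV as PINHOLE means ignoring the four distortion terms. The function name fov_from_colmap is hypothetical and is not the code at dataset_readers.py line 138; focal2fov is defined locally with the usual pinhole formula.

```python
import math
from typing import Sequence, Tuple


def focal2fov(focal: float, pixels: int) -> float:
    """Field of view in radians for a focal length and image size in pixels."""
    return 2.0 * math.atan(pixels / (2.0 * focal))


def fov_from_colmap(model: str, params: Sequence[float],
                    width: int, height: int) -> Tuple[float, float]:
    """Return (FovX, FovY) in radians for a few COLMAP camera models.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if model == "SIMPLE_PINHOLE":
        # params: f, cx, cy (one shared focal length)
        fx = fy = params[0]
    elif model == "PINHOLE":
        # params: fx, fy, cx, cy
        fx, fy = params[0], params[1]
    elif model == "OPENCV":
        # params: fx, fy, cx, cy, k1, k2, p1, p2
        # ASSUMPTION: the distortion terms (k1, k2, p1, p2) are ignored,
        # which is only accurate if the images are already undistorted.
        fx, fy = params[0], params[1]
    else:
        raise ValueError(f"Unsupported COLMAP camera model: {model}")
    return focal2fov(fx, width), focal2fov(fy, height)
```

If the images still carry noticeable lens distortion, undistorting them with COLMAP first is likely a safer route than ignoring the extra parameters.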

simonobi commented 3 months ago

I obtained a working dataset, but I still encounter the following error on the latest code:

train.py -s DATASET/garden/ -m DATASET/garden/output

Optimizing DATASET/garden/output
Output folder: DATASET/garden/output [06/08 17:27:51]
Reading camera 185/185 [06/08 17:27:53]
cameras extent: 4.935335969924927 [06/08 17:27:53]
[ INFO ] Encountered quite large input images (>1.6K pixels width), rescaling to 1.6K. If this is not desired, please explicitly specify '--resolution/-r' as 1 [06/08 17:27:53]
Loading Training Cameras: 185 . [06/08 17:29:02]
Loading Test Cameras: 0 . [06/08 17:29:02]
Number of points at initialisation : 138766 [06/08 17:29:02]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
You are running using the stub version of nvrtc .
You are running using the stub version of nvrtc

Any ideas?

Thank you

BaowenZ commented 3 months ago

Hi! I'm not sure if the issue is coming from my code. Could you test the original Gaussian Splatting? If it also fails, it would be better to reinstall CUDA.

simonobi commented 3 months ago

Hello. I switched the dataset to mipnerf360/Truck and I encounter the following error:

Exception has occurred: TypeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
__new__() got an unexpected keyword argument 'kernel_size'
  File "/mnt/d/shared/RaDe-GS/gaussian_renderer/__init__.py", line 35, in render
    raster_settings = GaussianRasterizationSettings(
  File "/mnt/d/shared/RaDe-GS/train.py", line 126, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background, kernel_size, require_coord = require_coord and reg_kick_on, require_depth = require_depth and reg_kick_on)
  File "/mnt/d/shared/RaDe-GS/train.py", line 306, in
    training(dataset=lp.extract(args),
  File "/home/simonobi/anaconda3/envs/nerfstudio/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/simonobi/anaconda3/envs/nerfstudio/lib/python3.8/runpy.py", line 194, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
TypeError: __new__() got an unexpected keyword argument 'kernel_size'

I checked the same dataset with the latest gaussian-splatting code and it works perfectly. I use exactly the same python environment.

simonobi commented 3 months ago

Hello. I have a small update: I reinstalled diff-gaussian-rasterization from submodules and the above error disappeared, but I again encounter the problem I initially mentioned:

Optimizing ../DATASET/Truck/output
Output folder: ../DATASET/Truck/output [07/08 18:59:53]
Reading camera 251/251 [07/08 18:59:55]
cameras extent: 5.849451017379761 [07/08 18:59:55]
Loading Training Cameras: 219 . [07/08 19:00:04]
Loading Test Cameras: 32 . [07/08 19:00:05]
Number of points at initialisation : 120832 [07/08 19:00:05]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
Backend TkAgg is interactive backend. Turning interactive mode on. [07/08 19:00:06]
You are running using the stub version of nvrtc .
You are running using the stub version of nvrtc

To reiterate: in the same environment, the latest gaussian-splatting code runs without any problems. I discovered that gaussian-splatting uses a different implementation of GaussianRasterizationSettings (your code adds kernel_size, require_depth and require_coord).

Thank you.
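The kernel_size TypeError above is the kind of failure that shows up when the installed rasterizer package was built from a different repository than the one being run. A small sketch of a pre-flight check follows. It assumes the settings class is a NamedTuple (as the `__new__()` error suggests) and uses stand-in classes, so none of the class definitions below come from either repository; missing_fields is a hypothetical helper.

```python
from typing import Iterable, List, NamedTuple


def missing_fields(settings_cls: type, required: Iterable[str]) -> List[str]:
    """Names in `required` that `settings_cls` does not declare as fields."""
    # NamedTuple classes expose their field names via `_fields`.
    declared = set(getattr(settings_cls, "_fields", ()))
    return [name for name in required if name not in declared]


# Stand-in classes for illustration only (ASSUMPTION: simplified field lists).
class StockSettings(NamedTuple):
    image_height: int
    image_width: int


class ExtendedSettings(NamedTuple):
    image_height: int
    image_width: int
    kernel_size: float
    require_depth: bool
    require_coord: bool


REQUIRED = ("kernel_size", "require_depth", "require_coord")
print(missing_fields(StockSettings, REQUIRED))     # ['kernel_size', 'require_depth', 'require_coord']
print(missing_fields(ExtendedSettings, REQUIRED))  # []
```

In a real environment, the class to pass would presumably be the GaussianRasterizationSettings imported from the installed diff_gaussian_rasterization package. A non-empty result would suggest reinstalling the rasterizer from this repository's submodules.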

BaowenZ commented 3 months ago

Hi! I reinstalled and tested the code, and it works well on my computer. It looks like a CUDA version issue. My CUDA version is 12.1, and the CUDA driver's version is higher.

simonobi commented 3 months ago

I can confirm that it was a CUDA problem. Please consider this issue solved. Thank you for your help. For anyone who needs it, I used the CUDA 11.8 install instructions from https://gist.github.com/garg-aayush/156ec6ddda3d62e2c0ddad00b7e66956

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
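After an install like the one above, one way to sanity-check the environment is to compare the toolkit release reported by `nvcc --version` with the CUDA version PyTorch was built against. The thread does not say exactly which mismatch caused the nvrtc stub message, so this is only a sketch of one plausible diagnostic. The helper names are hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_nvcc_release(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` output, or None."""
    # nvcc prints a line such as: "Cuda compilation tools, release 11.8, V11.8.89"
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def same_cuda_release(nvcc_text: str, torch_cuda: Optional[str]) -> bool:
    """True when nvcc and the PyTorch build report the same major.minor."""
    toolkit = parse_nvcc_release(nvcc_text)
    if toolkit is None or not torch_cuda:
        return False
    parts = torch_cuda.split(".")
    if len(parts) < 2:
        return False
    try:
        built = (int(parts[0]), int(parts[1]))
    except ValueError:
        return False
    return toolkit == built
```

In practice, nvcc_text could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` and torch_cuda from `torch.version.cuda`, which is a string such as "11.8", or None on CPU-only builds.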

BaowenZ / RaDe-GS
