google-research / multinerf

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
Apache License 2.0

About training time #17

Closed: StarsTesla closed this issue 1 year ago

StarsTesla commented 1 year ago

Hi there, I downloaded the code, installed the dependencies, and started training on the 360_v2/room dataset, and the training time is extremely long. I use the original train_360 shell script from the repo, changing only the data_dir and the results dir. I recall the paper saying Mip-NeRF 360 should be about 2x slower than the original NeRF or Mip-NeRF. I previously trained mip-nerf_pl (the PyTorch Lightning version), which took almost 20-30 hours on a single 3090 GPU (24 GB). With this Mip-NeRF 360 code on 4x 3090 GPUs (24 GB), it takes 3-4 minutes per 100 steps, so 250k steps would be over 120 hours, which is around a week. Why is the training speed that slow? I believe the JAX version should be faster than the PyTorch one.
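
For reference, a quick back-of-the-envelope check of that estimate (plain Python; the numbers are the ones quoted in the comment above, including the 250k-step schedule):

```python
# Rough wall-clock estimate from the observed step rate
# (3-4 minutes per 100 steps on 4x RTX 3090, 250k total steps).
minutes_per_100_steps = (3.0, 4.0)
total_steps = 250_000

for m in minutes_per_100_steps:
    hours = total_steps / 100 * m / 60
    print(f"{m:.0f} min / 100 steps -> ~{hours:.0f} h total")

# 3 min / 100 steps -> ~125 h total
# 4 min / 100 steps -> ~167 h total
```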

StarsTesla commented 1 year ago

Sorry about that. I just asked the server provider to fix the problem; the speed is now up to 53552 r/s from the earlier 8400 r/s, which should take about 20 hours for the full training run. Thanks for the research you have made available to us!
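
For anyone reproducing this arithmetic: converting a rays/sec figure into total training time requires the batch size. A minimal sketch, assuming the 250k-step schedule mentioned above and a batch size of 16384 (which I believe is the repo default in configs.py; adjust if your gin config overrides it):

```python
# Convert a rays/sec throughput into total training hours.
# Assumes batch_size = 16384 (believed repo default) and 250k steps.
def training_hours(rays_per_sec, batch_size=16_384, total_steps=250_000):
    total_rays = batch_size * total_steps
    return total_rays / rays_per_sec / 3600.0

print(f"{training_hours(8400):.0f} h")    # ~135 h at the earlier 8400 r/s
print(f"{training_hours(53552):.0f} h")   # ~21 h at 53552 r/s
```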

theFilipko commented 1 year ago

Hello. What are the ways to speed up the training? I am running the latest version of the repo with the 'train_360.sh' script on a single RTX 3090 GPU (24 GB), batch size 4096. It does 6300 r/s, in other words about 100 steps per minute. Thanks.
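
As a sanity check on those single-GPU numbers (a quick sketch; assumes the same 250k-step schedule discussed above):

```python
# Reported single-GPU throughput: 6300 rays/sec at batch size 4096.
rays_per_sec = 6300
batch_size = 4096
total_steps = 250_000

steps_per_min = rays_per_sec * 60 / batch_size
hours_total = total_steps * batch_size / rays_per_sec / 3600
print(f"~{steps_per_min:.0f} steps/min, ~{hours_total:.0f} h for {total_steps} steps")
# -> ~92 steps/min, ~45 h for 250000 steps
```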