Lakonik / SSDNeRF

[ICCV 2023] Single-Stage Diffusion NeRF
https://lakonik.github.io/ssdnerf/
MIT License

Training on the custom dataset #24

Open FrozenSilent opened 1 year ago

FrozenSilent commented 1 year ago

Hi! Thanks for releasing the code of your wonderful work, including both training and inference. I want to train SSDNeRF on my own dataset and have prepared the data in the format of the ShapeNet SRN dataset. However, after two days of training, the results are still not good. Here are the visualization results of one tri-plane representation. scene_0000 It can be seen that there is a lot of noise in the feature planes, which is also reflected in the final geometry reconstruction. image The test rendering result is not good either. scene_0000_011 I used the config file ssdnerf_cars_uncond for training. What should I pay attention to during the training process? Could you give some suggestions based on the current training results? Thank you very much!

Lakonik commented 1 year ago

Hi! I think this could be caused by imbalanced diffusion and rendering losses. It is very important in SSDNeRF to tune the relative weights of the diffusion and rendering losses (for dense views, the weighted losses, as printed in the log, should be roughly 1:1). Also, I would recommend starting from the newer models (e.g., ssdnerf_cars_uncond_16bit), which are more stable to train.
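To illustrate the 1:1 rule above, here is a minimal sketch of how one could rebalance a loss weight from the printed (already weighted) loss values. The function and variable names are hypothetical for illustration, not identifiers from the SSDNeRF configs.

```python
def rebalance_weight(current_weight: float,
                     weighted_diffusion_loss: float,
                     weighted_render_loss: float) -> float:
    """Rescale the diffusion loss weight so that, at the observed loss
    values, the weighted diffusion loss roughly matches the weighted
    rendering loss (the 1:1 target for dense views)."""
    return current_weight * weighted_render_loss / weighted_diffusion_loss

# Example: if the log shows weighted diffusion loss 4.0 vs rendering loss 1.0,
# the diffusion weight should be scaled down by 4x.
print(rebalance_weight(1.0, 4.0, 1.0))  # 0.25
```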

If you still encounter problems, you may need to check your camera pose format. In this codebase we use the OpenCV camera coordinate system, which is very different from the cameras of other rendering software (e.g. Blender) and thus requires conversion.
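For reference, the standard way to convert a Blender camera-to-world matrix to the OpenCV convention is to flip the camera's local Y and Z axes (Blender cameras look down local -Z with +Y up; OpenCV cameras look down +Z with +Y down). This is a general-purpose sketch, not code from this repo:

```python
import numpy as np

# Flipping the camera's local Y and Z axes converts between Blender's
# (-Z forward, +Y up) and OpenCV's (+Z forward, +Y down) conventions.
BLENDER_TO_OPENCV = np.diag([1.0, -1.0, -1.0, 1.0])

def blender_to_opencv_c2w(c2w_blender: np.ndarray) -> np.ndarray:
    """Convert a 4x4 camera-to-world matrix from Blender to OpenCV convention."""
    return c2w_blender @ BLENDER_TO_OPENCV

# Example: an identity Blender pose. In the converted matrix, the OpenCV
# viewing direction (third column of the rotation) points along world -Z,
# which is exactly where the Blender camera was looking.
c2w_cv = blender_to_opencv_c2w(np.eye(4))
print(c2w_cv[:3, 2])  # [ 0.  0. -1.]
```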

Lakonik commented 1 year ago

Also, I noticed another minor issue: the object seems too small in the triplane space. It is recommended to normalize the objects' scale to fit them in the triplanes. The scale can be altered through the dataloader's radius parameter.
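As a rough guide for choosing that scale, one can estimate the object's bounding radius from its geometry and normalize accordingly. The helper below is a hypothetical sketch (not part of the SSDNeRF dataloader), assuming the object is available as a point or vertex array:

```python
import numpy as np

def estimate_object_radius(vertices: np.ndarray) -> float:
    """Return the bounding radius of an (N, 3) vertex array, i.e. the
    largest distance from the centroid. Dividing by this value (times a
    small margin) normalizes the object to roughly unit scale."""
    center = vertices.mean(axis=0)
    return float(np.linalg.norm(vertices - center, axis=1).max())

# Example: the 8 corners of a unit cube have bounding radius sqrt(3)/2.
corners = np.array([[x, y, z] for x in (-0.5, 0.5)
                              for y in (-0.5, 0.5)
                              for z in (-0.5, 0.5)])
print(round(estimate_object_radius(corners), 4))  # 0.866
```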

https://github.com/Lakonik/SSDNeRF/blob/5150270b37f691fd4fcb6614fcd55caa0bf3940b/lib/datasets/shapenet_srn.py#L44

FrozenSilent commented 1 year ago

Thanks for your advice! I did use Blender to generate the dataset and modified the code that generates the rays accordingly. I will try the config file you specified for training. Also thank you very much for your suggestion on the data processing part!

FrozenSilent commented 1 year ago

With your suggestions, I tried training again, and this time there was much less noise in the tri-plane. At least I can clearly see the approximate shape of the model, but the surface is still not smooth. scene_0001 image May I ask which parameters you would suggest adjusting to improve the training results? In addition, I am also trying to fine-tune the pre-trained ShapeNet table checkpoint on my data, and those results are also gradually developing a lot of noise. scene_000000

Lakonik commented 1 year ago

What's the size of your training set? Ideally the batch size should scale linearly with the number of training objects. If you are using a much larger training set without increasing the batch size due to limited GPUs, you can try reducing the code learning rate (in train_cfg) to stabilize the optimization, although this could result in slower convergence.
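The heuristic above can be sketched as a simple linear scaling rule: when the batch size cannot grow with the dataset, scale the code learning rate down by the same ratio instead. The base values below are illustrative only, not defaults from the repo's configs:

```python
def scaled_code_lr(base_lr: float, base_batch: int, actual_batch: int) -> float:
    """Linearly scale a learning rate with the ratio of the actual batch
    size to the batch size the base learning rate was tuned for."""
    return base_lr * actual_batch / base_batch

# Example: a rate tuned for batch size 16, run at batch size 8, is halved.
print(scaled_code_lr(0.04, 16, 8))  # 0.02
```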

harshD42 commented 4 months ago

@FrozenSilent could you please explain in detail how you generated your custom ShapeNet SRN data? I can't find any resources for this. I have a few computationally generated meshes that I would like to convert into the specified data format, and I've written a Blender script for it. Training on this data works, but testing fails. It would be a great help if you could break down your process.