google / nerfactor

Neural Factorization of Shape and Reflectance Under an Unknown Illumination
https://xiuming.info/projects/nerfactor/
Apache License 2.0

Crash at shape pre-training #28

Closed Woolseyyy closed 1 year ago

Woolseyyy commented 1 year ago

I am trying to reproduce the results but have run into a problem at "I. Shape Pre-Training": the script crashes during the validation step. It looks like an out-of-memory (OOM) issue, because the log just says "Killed", and the crash goes away if I set shuffle_buffer_size=False in shape.ini. Any suggestions would help!

I am using a machine with 4 RTX 3090 GPUs, 12 CPU cores, and 60 GB of RAM. My dataset has 100 training views and 7 validation views. In the surf_root directory there are 120 test, 99 train, and 99 val entries.
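For context on why a shuffle buffer can exhaust host RAM: tf.data's shuffle keeps `buffer_size` whole dataset elements in CPU memory, so memory grows linearly with the buffer size. A back-of-envelope estimate (every number below is an illustrative assumption, not a NeRFactor default):

```python
# Rough estimate of host RAM held by a tf.data-style shuffle buffer.
# All numbers below are illustrative assumptions, not NeRFactor defaults.
bytes_per_float = 4
floats_per_ray = 3 + 3 + 3        # e.g. origin, direction, RGB target (assumed layout)
rays_per_element = 512 * 512      # one full-resolution image per element (assumed)
shuffle_buffer_size = 64          # elements the buffer keeps resident in host RAM

buffer_bytes = (shuffle_buffer_size * rays_per_element
                * floats_per_ray * bytes_per_float)
print(f"{buffer_bytes / 1e9:.1f} GB")  # the buffer alone, before other allocations
```

Scale `rays_per_element` or `shuffle_buffer_size` up and the buffer alone can eat a large fraction of a 60 GB machine, on top of any dataset caching.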

xiumingzhang commented 1 year ago

Try reducing the MLP chunk size? That should solve your OOM problem.

Also, consider trying out https://github.com/nerfstudio-project/nerfstudio as a drop-in replacement for NeRFactor's "Shape Pre-Training." nerfstudio will be much faster, and NeRFactor doesn't care who generated the surface points.
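The chunk-size idea can be sketched like this: evaluate the network on fixed-size slices of the ray batch so peak memory is bounded by the chunk, not the full batch. This is a minimal stand-in, not NeRFactor's actual code; `run_mlp` is a placeholder for the real forward pass:

```python
def run_mlp(rays):
    # Stand-in for a network forward pass; maps each ray to a scalar.
    return [sum(r) for r in rays]

def run_in_chunks(rays, chunk_size=1024):
    """Evaluate run_mlp over `rays` in fixed-size chunks.

    Peak memory scales with chunk_size instead of len(rays), which is
    why shrinking the chunk size helps with OOM crashes.
    """
    out = []
    for i in range(0, len(rays), chunk_size):
        out.extend(run_mlp(rays[i:i + chunk_size]))
    return out

rays = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
assert run_in_chunks(rays, chunk_size=2) == run_mlp(rays)
```

The chunked result is identical to the full-batch result; only the memory profile changes.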

Woolseyyy commented 1 year ago

Why dose MLP chunk size affect cpu memory? It seems that mlp chunk size is only about GPU memory.

wangmingyang4 commented 1 year ago

hi! I have encountered the same problem. How did you solve it? @Woolseyyy

Woolseyyy commented 1 year ago

I set cache=False for validation and set shuffle_buffer_size=False
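A minimal sketch of that workaround, assuming a pipeline-builder function (the names `build_dataset`, `is_train`, etc. are hypothetical, not NeRFactor's actual code): shuffling and caching stay on for training but are skipped for validation, so the validation pass never holds a large buffer or a cached dataset copy in host RAM.

```python
import random

def build_dataset(elements, is_train, shuffle_buffer_size=512, use_cache=True):
    """Toy stand-in for a tf.data pipeline builder (hypothetical API).

    The workaround: keep shuffling/caching for training, but disable
    both for validation to reduce host-RAM pressure there.
    """
    data = list(elements)
    if is_train and shuffle_buffer_size:
        # Shuffle order only matters for training batches.
        random.shuffle(data)
    if is_train and use_cache:
        # Caching trades host memory for speed; skipped at validation.
        pass  # e.g. dataset = dataset.cache() in a real tf.data pipeline
    return data

train = build_dataset(range(10), is_train=True)
val = build_dataset(range(10), is_train=False)  # no shuffle, no cache
```

Validation results are unaffected by skipping the shuffle, since evaluation order does not matter.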

wangmingyang4 commented 1 year ago

Does this setting affect the overall experiment?

Woolseyyy commented 1 year ago

> Does this setting affect the overall experiment?

I modified the code so that it only affects validation.

wangmingyang4 commented 1 year ago

I could not find shuffle_buffer_size in shape.ini. Do I need to add shuffle_buffer_size = 0, or set no_shuffle = True by modifying base.py? @Woolseyyy When I set shuffle_buffer_size = 0, I got the error: buffer_size must be greater than 0.
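That error is expected: tf.data's `Dataset.shuffle` requires a positive `buffer_size`, so passing 0 raises immediately. One way around it is to skip the shuffle call entirely when the configured size is zero. A hedged sketch (`maybe_shuffle` is a hypothetical helper, not code from base.py):

```python
import random

def maybe_shuffle(data, shuffle_buffer_size):
    """Shuffle only when a positive buffer size is configured.

    Mirrors the guard you would put around dataset.shuffle(buffer_size)
    in a real tf.data pipeline: shuffle(0) raises
    "buffer_size must be greater than 0", so we skip the call instead.
    """
    if shuffle_buffer_size and shuffle_buffer_size > 0:
        data = list(data)
        random.shuffle(data)
    return data

assert maybe_shuffle([1, 2, 3], 0) == [1, 2, 3]  # no shuffle, no error
```

With a guard like this, shuffle_buffer_size = 0 in the config acts as "shuffling off" instead of crashing.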