google-research / nerf-from-image

Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion
Apache License 2.0
377 stars, 18 forks

The training time seems too long #2

Closed by shenzy08 1 year ago

shenzy08 commented 1 year ago

I ran the code and observed that it takes over 5 days to train the unconditional generator for 300k iterations, and over 1 day for the hybrid inversion procedure. I am using 4 A100 GPUs with 40 GB each. The training time seems too long. How long did it take you to run this code? Did I miss some settings?
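For scale, the generator figure above can be converted into an average time per iteration. This is only a back-of-the-envelope sketch: it treats "over 5 days" as exactly 5 days, and the helper name `seconds_per_iteration` is made up for illustration and is not part of the repository.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_iteration(days: float, iterations: int) -> float:
    """Average wall-clock seconds spent per training iteration."""
    return days * SECONDS_PER_DAY / iterations


# Figures reported in this thread, taken as exact for illustration only.
generator_rate = seconds_per_iteration(days=5, iterations=300_000)
print(f"{generator_rate:.2f} s/iteration")  # 1.44 s/iteration
```

Since the reported time is a lower bound, the actual per-iteration time on this setup would be at least this value.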

dariopavllo commented 1 year ago

Hi,

The training time for the unconditional generator is about right. For the hybrid inversion procedure, training the SegFormer encoder should take 1-2 days, but this is only done once. After that, the inversion procedure should be relatively fast: on the order of hours, I would say, depending on the size of the dataset.

If the encoder is already trained (or if you are using the pretrained models), running the inversion step a second time should be much faster. You can also try increasing the batch size (the inversion results are independent of the batch size) as well as the learning rate gain via --inv_gain_z (the default is 5, and you can increase it up to 20, as shown in the paper).

shenzy08 commented 1 year ago

Thanks for your response. Could you provide the pretrained models for the unconditional generator and the encoder?

dariopavllo commented 1 year ago

https://github.com/google-research/nerf-from-image/releases