NVlabs / nvdiffrec

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

About the input image and some config params #90

Closed sadexcavator closed 1 year ago

sadexcavator commented 1 year ago

Hi, I have some questions about the input images and the setting of several config params.

  1. As you said in #77, "Best results are obtained when the object covers most of the screen." What proportion of the image should the object cover for good results? More than 50%, or 75%?
  2. What is the idea behind the 'dmtet_grid' and 'mesh_scale' settings? Does it simply mean that for a bigger object I should set a bigger 'dmtet_grid' and 'mesh_scale'? What values should I typically use for these two params? Could you please share some tips on setting them? Thank you!
jmunkberg commented 1 year ago

Hello.

  1. Best results are obtained if the entire object covers the screen without clipping the screen borders in any of the views. So 100%. See the nerf synthetic datasets for examples; a quick per-image coverage check is sketched at the end of this reply.

  2. From https://github.com/NVlabs/nvdiffrec/blob/main/train.py#L502

    FLAGS.dmtet_grid          = 64                       # Resolution of initial tet grid. We provide 64 and 128 resolution grids. Other resolutions can be generated with https://github.com/crawforddoran/quartet
    FLAGS.mesh_scale          = 2.1                      # Scale of tet grid box. Adjust to cover the model
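
As a rough sketch of how these two flags interact (a back-of-the-envelope illustration, not code from the repo): assuming the tet grid spans a cube with side length mesh_scale, subdivided into dmtet_grid cells per axis, the finest surface detail DMTet can resolve is on the order of one cell:

    # Sketch (assumption, not from train.py): the tet grid box has side
    # `mesh_scale` and `dmtet_grid` cells per axis, so the smallest feature
    # the extracted mesh can represent is roughly one cell wide.
    def approx_tet_cell_size(mesh_scale: float, dmtet_grid: int) -> float:
        """Approximate edge length of one tet grid cell, in scene units."""
        return mesh_scale / dmtet_grid

    print(approx_tet_cell_size(2.1, 64))   # ~0.033 with the defaults above
    print(approx_tet_cell_size(2.1, 128))  # ~0.016, i.e. finer geometric detail

So mesh_scale should be just large enough that the object fits inside the tet grid box, and dmtet_grid (64 or 128 with the provided grids) controls how finely that box is subdivided.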

Higher values of dmtet_grid generate a denser triangle mesh.
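
Regarding the coverage question in point 1, here is a hypothetical helper (not part of nvdiffrec) to estimate how much of the frame the object occupies, assuming RGBA training images where the alpha channel marks the foreground, as in the nerf synthetic datasets; the example path is only an illustration, and numpy/imageio are assumed to be installed:

    import numpy as np
    import imageio

    def object_coverage(image_path: str) -> float:
        """Fraction of pixels covered by the object, based on the alpha channel."""
        img = imageio.imread(image_path)
        if img.ndim != 3 or img.shape[-1] < 4:
            raise ValueError("Expected an RGBA image with an alpha channel")
        alpha = img[..., 3].astype(np.float32) / 255.0
        return float((alpha > 0.5).mean())

    # e.g. print(object_coverage("data/nerf_synthetic/lego/train/r_0.png"))  # hypothetical path

Views where this fraction is small are the ones most worth reframing or cropping.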