NVlabs / nvdiffrec

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

noise in rendered results (noise in diffuse/specular?) #10

Closed wangjksjtu closed 2 years ago

wangjksjtu commented 2 years ago

Thank you for the amazing work!!

I did a quick trial on custom in-the-wild data and found that there is some noise in the fitting results. Did I mess up something? Have you observed this phenomenon before? Note that I fixed the mesh vertices and only optimized the lighting / material.

[Images: rendered vs. reference comparisons for test_000002 and test_000007]

The texture maps look like this: [images: texture_kd, texture_ks, texture_n]

Looking forward to your great help! Thanks a lot

jmunkberg commented 2 years ago

Thanks for your interest in our work!

That looks way noisier than I would expect. A few things that come to mind:

@frankshen07 may have additional ideas to improve the quality of this type of content.

pwais commented 2 years ago

@jmunkberg Do you have any plans to test / benchmark on CO3D (https://github.com/facebookresearch/co3d)? DTU is great, but CO3D might be slightly better as an "in the wild" benchmark. Also, the NeRF drums in the paper actually look pretty decent; you have to really zoom in on the figure to see the artifacts.

JHnvidia commented 2 years ago

@wangjksjtu Thanks for trying the code out. I'm speculating that @jmunkberg's point about fixed lighting is the likely problem, especially since the car model has such a strong specular highlight. If there are inconsistencies in lighting, the system may try to cheat using any other shading component.

We use texture mip-mapping, so the optimizer can insert a noise-like pattern that is less visible when averaged (viewed from a higher angle or a larger distance) to fake a particular reflection. That looks like what's happening on the driver door window, which has a pretty visible fake highlight in the second image but is considerably fainter in the first.
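For intuition, here is a toy NumPy sketch (not nvdiffrec code) of why this works: a zero-mean, high-frequency noise pattern baked into a texture largely cancels out once the mip chain box-filters it down, so it only shows up at close-up or grazing views.

```python
import numpy as np

rng = np.random.default_rng(0)
tex = 0.5 + 0.2 * rng.standard_normal((512, 512))  # base color + high-freq noise

level = tex
for mip in range(1, 5):
    # 2x2 box filter, the same averaging a mip chain applies per level
    level = 0.25 * (level[0::2, 0::2] + level[1::2, 0::2] +
                    level[0::2, 1::2] + level[1::2, 1::2])
    print(f"mip {mip}: noise std = {level.std():.4f}")  # shrinks ~2x per level
```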

Unfortunately, supporting varying lighting would require significant code updates, adding a unique trainable light probe for each dataset example. We currently perform the splitsum precomputations just once per batch, so it would also impact performance.
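To make that concrete, a minimal hypothetical sketch of what "a unique trainable light probe per example" could look like; this is not part of the released code, and the class and parameter names are made up:

```python
import torch

class PerImageEnvLight(torch.nn.Module):
    """Hypothetical: one trainable lat-long HDR probe per training image."""
    def __init__(self, num_images, height=128, width=256):
        super().__init__()
        # (num_images, H, W, RGB), initialized to uniform gray light
        self.probes = torch.nn.Parameter(0.5 * torch.ones(num_images, height, width, 3))

    def forward(self, image_ids):
        # Select the probe(s) for the images in the current batch; the
        # split-sum prefiltering would then have to run once per selected
        # probe rather than once per batch, which is the performance cost.
        return self.probes[image_ids]
```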

@pwais We briefly checked out CO3D while working on the paper but decided against it. Compared to the datasets we use in the paper, CO3D is considerably harder, with fewer views and more corrupted segmentation masks. I think you'd need to be able to generalize across object classes to generate acceptable meshes (i.e., adding networks / building a larger system). While it's an interesting direction, it is a project in itself.

wangjksjtu commented 2 years ago

@jmunkberg @JHnvidia Thanks a lot for the prompt and detailed reply!! It is very helpful.

Agree that the cause might be the lighting inconsistencies across different views and cameras. I forgot to mention that the lighting is trained on front cameras (around 20 views) and tested on front-left cameras. The high-frequency noise is also more significant at test views, especially when the objects are far away. As you suggested, the model is essentially cheating to minimize the loss by using other components of the shading model (e.g., the specular texture map). Without modeling time-varying lighting it might be difficult to decompose, and there may also be some ambiguity between texture and lighting that makes them hard to decouple.

Btw, would you mind sharing more thoughts on the lighting regularizer used in the paper? It seems to encourage white light, as in the following example. Does that mean it cannot capture the colored surrounding buildings (so they get baked into the diffuse/specular maps)? [image: optimized environment probe]

Thanks for the great answers again!

pwais commented 2 years ago

@JHnvidia Thanks for confirming you tried it, at least! I know the CO3D data itself is inconsistent in places. Your paper showed masks from Detectron, so it's interesting to know.

jmunkberg commented 2 years ago

@wangjksjtu The light regularizer is discussed in Section 9.1 of the paper (https://nvlabs.github.io/nvdiffrec/assets/paper.pdf, see Eq. 8). It is indeed designed to penalize color shifts, which worked well for us on, e.g., the NeRF dataset; see, e.g., Fig. 30.
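For intuition, a minimal sketch of a color-shift penalty of the kind Eq. 8 describes; this is my paraphrase of the idea (penalize each channel's deviation from the channel mean, pushing the probe toward white light), not necessarily the exact formulation in the paper:

```python
import torch

def light_regularizer(probe):
    """Penalize per-channel deviation from the per-texel channel mean,
    encouraging white (colorless) light. Sketch only, not the paper's
    exact Eq. 8."""
    # probe: (..., 3) HDR environment light texels
    gray = probe.mean(dim=-1, keepdim=True)  # per-texel gray value
    return torch.abs(probe - gray).mean()
```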

You can easily tweak its influence if you think it is too colorless. In Figure 11, we jointly train materials and lighting (with geometry fixed) on a synthetic scene with many views (IIRC the light regularizer is disabled for that particular example), and the probe looks ok.

A similar example (but with the light regularizer enabled) is included in the config https://github.com/NVlabs/nvdiffrec/blob/main/configs/spot_metal.json

JHnvidia commented 2 years ago

@wangjksjtu The light regularizer is quite ad hoc. The rationale is that natural scenes contain mostly white light, and we wanted to force most of the chroma information into the materials. In particular for diffuse materials, it's underdetermined whether color comes from the lighting, shadowing, or the kd texture.

I wouldn't consider the regularizers a gold standard, but rather something that worked over our range of test scenes. Feel free to tweak them as you see fit. However, if your dataset is just 20 views and happens to break our assumptions (static, distant environment lighting with limited shadowing / interreflections), I would expect the light probe quality to suffer, so I think it will be hard for you to see the buildings.

wangjksjtu commented 2 years ago

@jmunkberg @JHnvidia Thanks for the detailed replies, pointers, and suggestions! It makes a lot of sense to me. Really appreciate it ;)