Relighting metrics - GitHub Issues

SJoJoK commented 1 year ago

Hello, thanks for your amazing work.

I wonder how you compute the relighting metrics in the paper (Table 1 and Table 4). I use the relighting script from https://github.com/NVlabs/nvdiffrecmc/issues/14 and the comparison script from https://github.com/NVlabs/nvdiffrecmc/issues/20. However, the comparison script only considers $k_d$, so I modified it to compare the relit renderings with the GT. The PSNR is higher (better) when I use the scale factor: for the hotdog scene of the NeRFactor dataset, the PSNR reported in Table 4 is 30.0, while my reproduced result is 28.25 with the scale and 25.46 without. But I think the scale factor should only be applied to the materials, not to the renderings. So I wonder whether the scale factor was used when computing the relighting metrics.

Thanks for any help and advice.

SJoJoK commented 1 year ago

I think I found the reason: the relighting metrics are computed from renderings made with the scaled albedo, not the estimated albedo. So I should multiply the albedo by the scale factor in Blender rather than multiply the renderings by it in the comparison script...

JHnvidia commented 1 year ago

Hello @SJoJoK,

For the paper we rescaled the re-lit images to match the average intensity of the reference. As you note, it's not ideal, and there were some discussions about this during the rebuttal (I think the review & rebuttal are public on NeurIPS).

The problem is that we only have a single (static) lighting observation for a scene, which creates an indeterminable scaling coefficient between light and material. I.e. it's impossible to determine if Kd = 1 & L = 1, Kd = 0.1 & L = 10, or Kd = 0.01 & L = 100, etc. This is additionally complicated if there's view dependent lighting.
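To make the ambiguity concrete, here is a minimal sketch under a purely Lambertian assumption (a toy model, not the nvdiffrecmc renderer; `shade_diffuse` is a hypothetical helper). All three (Kd, L) pairs above produce the same outgoing radiance, so a single lighting observation cannot tell them apart.

```python
import numpy as np

def shade_diffuse(kd, light):
    """Toy Lambertian shading (assumption): radiance = (kd / pi) * irradiance."""
    return kd / np.pi * light

# The (Kd, L) pairs mentioned above: the product Kd * L is the same for all.
pairs = [(1.0, 1.0), (0.1, 10.0), (0.01, 100.0)]
radiance = [shade_diffuse(kd, light) for kd, light in pairs]

# Every pair yields the same observed value (up to floating-point rounding).
print(radiance)
print(np.allclose(radiance, radiance[0]))
```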

Quantifying the material / relighting quality is an open (and hard) problem. It is quite intuitive that an intensity bias is much preferable to e.g. baking shadows into the albedo texture, but even a small global bias can have a disproportionately large impact on PSNR.

Our evaluation in this regard is a (limited) "best effort", and the rescaling is motivated by global rescaling being reasonably easily achieved by a human in the loop. We did image rescaling, rather than just Kd, for simplicity and to account a bit better for view dependent shading. This would correspond more closely to rescaling the intensity of the light probe.
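A rough sketch of the rescaling described above, for illustration only. It assumes a single global scalar per image, computed as the ratio of mean intensities, on float images in [0, 1]. The thread does not specify whether the scale is per image or per channel, or whether it is applied before or after tonemapping, so those choices are assumptions. `psnr` and `rescale_to_reference` are hypothetical helpers and the data is synthetic.

```python
import numpy as np

def psnr(pred, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((pred - ref) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def rescale_to_reference(pred, ref, eps=1e-8):
    """Scale pred by one global factor so its mean matches ref (assumed scheme)."""
    scale = ref.mean() / max(pred.mean(), eps)
    return pred * scale, scale

# Synthetic stand-ins: the "re-lit" image is the reference with a global
# intensity bias of 0.7 plus a little noise.
rng = np.random.default_rng(0)
ref = rng.uniform(0.2, 0.8, size=(64, 64, 3))
pred = 0.7 * ref + rng.normal(0.0, 0.01, size=ref.shape)

rescaled, scale = rescale_to_reference(pred, ref)
print(f"PSNR without rescaling: {psnr(pred, ref):.2f} dB")
print(f"PSNR with rescaling (scale={scale:.3f}): {psnr(rescaled, ref):.2f} dB")
```

In this toy setup the global bias dominates the error, so removing it raises PSNR substantially even though the image content is unchanged.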

SJoJoK commented 1 year ago

Hello @JHnvidia, many thanks for your explanation and the information you provided. So may I infer that the metrics are computed from the "re-scaled" image rendered (i.e. re-lit) with the "non-scaled" albedo? It's important to me because I'm currently conducting some experiments in my work. Thanks again for your help.

JHnvidia commented 1 year ago

That is correct, the image is rendered per usual (non-scaled albedo) and the final result is re-scaled. Continuing the discussion per mail, please check the e-mail listed in your profile.
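To illustrate why the two options differ, here is a small sketch with a toy shading model (diffuse term plus a view-dependent term that does not depend on Kd). This is an assumption for illustration, not the actual nvdiffrecmc or Blender shading, and `render` is a hypothetical helper. Scaling the final image also scales the view-dependent term, while scaling only the albedo leaves it untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (32, 32, 3)
kd = rng.uniform(0.1, 0.5, size=shape)             # estimated (non-scaled) albedo
diffuse_light = rng.uniform(0.5, 1.5, size=shape)  # per-pixel diffuse irradiance
specular = rng.uniform(0.0, 0.2, size=shape)       # view-dependent term, independent of kd

def render(albedo):
    """Toy shading model (assumption): diffuse plus specular."""
    return albedo * diffuse_light + specular

s = 1.5                          # some global scale factor
image_scaled = s * render(kd)    # render as usual, then rescale the image
albedo_scaled = render(s * kd)   # rescale the albedo, then render

# The two results differ exactly by the extra scaling of the specular term.
diff = image_scaled - albedo_scaled
print(np.allclose(diff, (s - 1.0) * specular))
```

For a purely diffuse scene (`specular` set to zero) the two options coincide, which is consistent with the remark that image rescaling behaves like rescaling the light probe intensity.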

yehonathanlitman commented 11 months ago

I'm struggling to replicate the results in Table 4. For example, I get the same number as @SJoJoK when comparing only the GT albedo and the predicted albedo (28.2) when using the scale, but I get 25.1 when I multiply the final relit images by the same scale. I'm using the image rendered by the nvdiffrast renderer instead of Blender. Furthermore, I'm a bit confused by the discrepancy between the results presented in your rebuttal - https://openreview.net/forum?id=VAeAUWHNrty - and the ones in Table 4.

Does Table 4 PSNR correspond to Pred image vs. GT image or Pred albedo vs. GT albedo? Am I supposed to render the images in Blender and then rescale?

Thanks in advance

jmunkberg commented 11 months ago

Hello @yehonathanlitman

I think this is the script we used for that relighting comparison. @JHnvidia can likely provide more details if needed. I hope this helps!

https://github.com/NVlabs/nvdiffrecmc/issues/20#issuecomment-1404892278

yehonathanlitman commented 11 months ago

Thanks @jmunkberg, I did use the script in #20, but I still need a clarification:

Does Table 4 PSNR correspond to Pred image vs. GT image or Pred albedo vs. GT albedo (i.e. as in the script)? Am I supposed to render the images in Blender and then rescale (and then use the images in the script)?

JHnvidia commented 11 months ago

@yehonathanlitman, I moved the discussion to email, please check your university mail.

Lizb6626 commented 1 month ago

I encountered similar problems when reproducing Table 4. I use the script in #14 to obtain re-lit images and modify the script in #20 to compute the PSNR values for the predicted re-lit images compared to the GT re-lit images. Thanks a lot.

NVlabs / nvdiffrecmc

Relighting metrics #25