lzqsd / InverseRenderingOfIndoorScene

Setup of Coordinate System of Hemisphere used in Illumination #10

Closed longbowzhang closed 2 years ago

longbowzhang commented 2 years ago

Hi @lzqsd, this is really amazing work and it has inspired me quite a lot! But I still have several questions that I hope you can kindly clarify.

  1. The x- and y-axes of the coordinate system of the hemisphere used for illumination: why use L1 (p=1) normalization instead of L2, and why is camx obtained by reversing the cross product of camy and normalPred (a rough sketch of what I mean follows this list)? https://github.com/lzqsd/InverseRenderingOfIndoorScene/blob/eeac1e960f0ba6be98afcbfcaa39c23e2dfcd4f1/models.py#L480

  2. I also have a general question about linear space versus gamma space. The input image I is a gamma-corrected LDR image, while the output of the rendering equation is radiance in linear space. So I am wondering whether it makes sense to compute the rendering loss between them, even with a scale factor, as in Eqn. (7) of the paper?
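For context on the first question, here is a rough, self-contained sketch of the kind of construction I am referring to (the tensor names and shapes are only illustrative, not copied from models.py):

```python
import torch
import torch.nn.functional as F

# Illustrative per-pixel tensors of shape [B, 3, H, W]; not the actual code.
normalPred = F.normalize(torch.randn(2, 3, 8, 8), dim=1)  # hemisphere z-axis
camy = F.normalize(torch.randn(2, 3, 8, 8), dim=1)        # hemisphere y-axis (assumed given)

# camx as described above: the cross product of camy and normalPred,
# normalized and then negated.
camx_l1 = -F.normalize(torch.cross(camy, normalPred, dim=1), p=1, dim=1)  # L1 (p=1) normalization
camx_l2 = -F.normalize(torch.cross(camy, normalPred, dim=1), p=2, dim=1)  # L2 normalization

print(camx_l1.norm(dim=1).mean())  # generally not 1
print(camx_l2.norm(dim=1).mean())  # 1 up to numerical error
```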

Looking forward to your reply, and thanks a lot in advance!

lzqsd commented 2 years ago

Hi @longbowzhang,

Sorry for the very late reply!

  1. This is a bug in the code. You are totally right. It should be L2 normalization. Not sure why there is a p=1 in the first place. I have updated the code. Thanks a lot for pointing this out!
  2. The input image is loaded by the loadHdr function and is therefore already in linear space :) (a rough sketch is below).
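For reference, the loading step looks roughly like the sketch below. This is only a sketch using OpenCV; the actual loadHdr in the repo may differ in details such as the imread flags, channel order, or any rescaling.

```python
import cv2
import numpy as np

def load_hdr_linear(path):
    # Radiance .hdr files store linear radiance; OpenCV returns a float32 BGR array.
    im = cv2.imread(path, cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB).astype(np.float32)
    return im  # linear-space image, no gamma correction applied
```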
longbowzhang commented 2 years ago

Hi @lzqsd,

Thanks a lot for your reply.

  1. But I still fail to figure out the setup of the hemisphere's coordinate system from the following code: https://github.com/lzqsd/InverseRenderingOfIndoorScene/blob/64baec4a03498f3ab7b60d98ed79be27ebf78e27/models.py#L480 normalPred will be the z-axis and camy will be the y-axis, right? If so, the x-axis (i.e., camx) should be the cross product of camy and normalPred. So why is the result of the cross product reversed by adding a minus sign after normalization?

  2. During inference, the input should be a gamma-corrected LDR image, right? But during training, an HDR image is used as input. I am just wondering how to deal with this mismatch (a small sketch of what I mean is below)?
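To make the mismatch concrete, this is roughly what I have in mind for the test-time input, assuming the usual gamma of 2.2 (the actual preprocessing in the repo may be different):

```python
import numpy as np

# Hypothetical 8-bit LDR photo at test time.
ldr = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

# Undoing the assumed gamma of 2.2 gives an approximately linear image in [0, 1].
im_linear = (ldr.astype(np.float32) / 255.0) ** 2.2
```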

Looking forward to your reply!

lzqsd commented 2 years ago

Hi @longbowzhang,

As for the first question, that is just how we define the coordinate system; there is no specific reason for it.
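Here is a minimal toy example (single pixel, illustrative names, assuming camy is built from a fixed up vector): negating the cross product only flips the direction of the x-axis, i.e., the handedness of the frame, while the three axes stay mutually orthogonal.

```python
import torch
import torch.nn.functional as F

# Toy single-pixel frame; the names are illustrative.
z = F.normalize(torch.tensor([0.2, 0.1, 1.0]), dim=0)                          # normal (z-axis)
y = F.normalize(torch.cross(torch.tensor([0.0, 1.0, 0.0]), z, dim=0), dim=0)   # camy (y-axis)

x_right = F.normalize(torch.cross(y, z, dim=0), dim=0)  # right-handed convention
x_used = -x_right                                        # convention with the minus sign

# Both choices are orthogonal to y and z; only the direction of x differs.
print(torch.dot(x_used, y).item(), torch.dot(x_used, z).item())  # both ~0
```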

As for the second question, we actually clip the HDR image to the range [0, 1] during training, to make sure that training and testing match.
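A minimal sketch of that clipping step, assuming the HDR image has already been loaded in linear space (the actual preprocessing in the repo, e.g. any rescaling before the clip, may differ):

```python
import numpy as np

# hdr: linear-space training image with an arbitrary dynamic range.
hdr = np.random.rand(480, 640, 3).astype(np.float32) * 5.0

# Clip to [0, 1] during training so that the value range matches the test-time inputs.
im_train = np.clip(hdr, 0.0, 1.0)
```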