autonomousvision / handheld_svbrdf_geometry

On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner, CVPR2020

Explanation on MaterialWeightSmoothnessLoss #2

Closed — naoto0804 closed this issue 2 years ago

naoto0804 commented 4 years ago

In eq. 11, the weights are computed as $w_{q} = \max_{i} \cos^{-1}(n_{q} \cdot h^{i}_{q})$. However, when I look at the code for MaterialWeightSmoothnessLoss here, there seems to be no operation that computes them. Could you explain this part in a bit more detail?

simon-donne commented 4 years ago

Hello. Indeed; in follow-up research we found that this inner product (it essentially measures how close a given observation is to the perfect reflection direction, and was supposed to tell the algorithm how valuable that observation is for the specularity estimate) does not noticeably affect the final results. Sadly, this was after the camera-ready deadline, and we opted not to amend the paper afterwards to avoid multiple versions floating around.

However, you are right that there is a discrepancy here. I have now noted this explicitly in the source: https://github.com/autonomousvision/handheld_svbrdf_geometry/commit/41218b0546e7386229b87c94d528cd193127acff

naoto0804 commented 4 years ago

I see. Thank you so much for your reply!

HsiangYangChu commented 2 years ago

You describe Material Smoothness in your paper, but use Material Weight Smoothness in your code. Are these two the same thing? How is equation 11 implemented in the code? Looking forward to your reply!

simon-donne commented 2 years ago

Equation 11 comprises two terms. The first is the (spatial) material weight smoothness. The paper talks about the concept of "material smoothness", which we want to enforce; the way we actually implement this in the formula is by adding a regularization term that encourages the material weights to be spatially smooth. It is implemented here: https://github.com/autonomousvision/handheld_svbrdf_geometry/blob/master/code/losses.py#L105-L130. As you can see, we call the permutohedral filter (see the paper for references) to obtain spatially smoothed versions of the weights and then penalize the L1 distance to those.
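
To illustrate the structure of that term, here is a minimal sketch (not the repo's code). The `smoothing_fn` argument is a hypothetical stand-in for the permutohedral filter; the real implementation lives in `losses.py#L105-L130`.

```python
import torch

def material_weight_smoothness(weights, guide_features, smoothing_fn):
    """Sketch of the spatial smoothness term.

    weights:        N x M tensor of per-point material weights.
    guide_features: N x F tensor of guide features (e.g. 3D locations) that
                    drive the edge-aware smoothing.
    smoothing_fn:   hypothetical stand-in for the permutohedral filter used in
                    the repo; returns an edge-aware smoothed copy of `weights`.
    """
    smoothed = smoothing_fn(weights, guide_features)  # spatially smoothed weights
    return (weights - smoothed).abs().mean()          # L1 distance to the smoothed version
```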

The second part of equation 11 is the material sparsity term (which guides each material weight vector towards being one-hot). It is implemented here: https://github.com/autonomousvision/handheld_svbrdf_geometry/blob/master/code/losses.py#L96-L102. Note that we add 2 to this term so that its value is always positive, tending to zero as the regularizer is perfectly fulfilled. The offset does not affect the optimization, but it means we can meaningfully visualize the loss on a log scale.
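
For intuition only, here is a sketch of a sparsity term of this flavour; it uses an assumed penalty (negative squared L2 norm of each weight vector), not the repo's actual functional form, and a different constant offset than the +2 mentioned above.

```python
import torch

def material_sparsity(weights, offset=1.0):
    """Sketch of a sparsity term pushing each weight vector towards one-hot.

    weights: N x M tensor of material weights, assumed to sum to 1 per point.

    -||w||_2^2 lies in [-1, -1/M] and reaches -1 exactly when w is one-hot,
    so adding offset=1 makes the reported value non-negative and zero at the
    optimum -- the same constant-offset trick described above, with a
    different offset because the repo uses a different raw term.
    """
    raw = -(weights ** 2).sum(dim=1).mean()  # in [-1, -1/M]; -1 iff one-hot everywhere
    return raw + offset                      # >= 0, zero when every weight vector is one-hot
```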

simon-donne commented 2 years ago

I closed this issue because I forgot to do so last time, and I hope this answered your question. Feel free to reopen as necessary.

HsiangYangChu commented 2 years ago

Thanks @simon-donne, but I still have a question: what is the range of values for q in equation 11?

HsiangYangChu commented 2 years ago

Sorry, I'm still a little confused. Aren't p and q spatial locations? I didn't make myself clear: in other words, what is the algorithmic complexity of equation 11?

simon-donne commented 2 years ago

You are of course correct -- I've edited away my previous answer. p and q are indeed spatial locations. For both terms in eq. 11 the range is in theory unlimited. For the first term, with the bilateral weights, the influence falls off quickly and there is no benefit in going beyond roughly 3 times the sigma of the Gaussian kernel of that bilateral filter. In the permutohedral lattice this cut-off works slightly differently from a hard cut-off; I suggest reading the paper for full details. For the second term, on material sparsity, the sum goes over the entire image -- we take the full average of the weights over all surface points.
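
A tiny sketch of why ~3 sigma is enough (illustrative only; the sigma value here is arbitrary):

```python
import torch

def bilateral_weight(p, q, sigma=1.0):
    """Gaussian bilateral weight between 3D locations p and q (Nx3 tensors)."""
    d2 = ((p - q) ** 2).sum(dim=-1)
    return torch.exp(-d2 / (2 * sigma ** 2))

# At ||p - q|| = 3 * sigma the weight has dropped to exp(-4.5) ~= 0.011,
# so contributions beyond roughly 3 sigma are negligible.
```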

HsiangYangChu commented 2 years ago

Thanks for your patience! Do you mean that the location is on the image? If so, why is the location in https://github.com/autonomousvision/handheld_svbrdf_geometry/blob/master/code/losses.py#L113 Nx3?

simon-donne commented 2 years ago

The locations for the bilateral weights (p and q) are 3D locations: we represent the scene as a depth image, so X and Y are the pixel location, and Z is our current depth estimate for that pixel. For flexibility and ease of use, we just put all of the points in a single vector, hence Nx3.
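
A minimal sketch of how such an Nx3 tensor could be built from a depth map (hypothetical helper, not the repo's code; the actual X/Y coordinates may be in camera/world units rather than raw pixel indices):

```python
import torch

def depth_to_locations(depth):
    """Turn an H x W depth map into an N x 3 tensor of (x, y, depth) rows, N = H * W."""
    H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=depth.dtype),
                            torch.arange(W, dtype=depth.dtype),
                            indexing="ij")
    return torch.stack([xs.reshape(-1), ys.reshape(-1), depth.reshape(-1)], dim=1)
```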

HsiangYangChu commented 2 years ago

Well, it seems I understood correctly :)! Actually, I have applied this wonderful regularizer to my triangular mesh, but I found that it does not converge :(. Can you provide some suggestions? Thanks again!