tjiiv-cprg / EPro-PnP

[CVPR 2022 Oral, Best Student Paper] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
https://www.youtube.com/watch?v=TonBodQ6EUU
Apache License 2.0

Norm Factor Intuition #46

Open korizona opened 1 year ago

korizona commented 1 year ago

Hello! First, I want to thank you for the excellent work. I have three questions regarding the norm factor (global scaling) learned in EPro-PnP-6DoF's rotation head.

  1. What's the reason for learning it in the first place? The paper states that it is the global concentration of the predicted pose distribution, but I'm still not sure what that means. Is it some kind of mean or median?
  2. What's the effect of including it in the Monte Carlo pose loss? I saw that you divide the loss by it in the code.
  3. Is it possible to obtain the norm factor without learning it?

Thank you very much!

Lakonik commented 1 year ago

Hi! Thanks for the questions. This is actually a fairly minor detail, although it should have been addressed more clearly in the paper.

  1. Because the corresponding weights (w2d) are normalized via softmax (for stable training), we need another global scaling factor to recover the absolute scale of w2d. This scale controls the entropy (concentration) of the predicted pose distribution and is sometimes referred to as the temperature parameter (see the first sketch after this list).
  2. The loss is divided by the norm_factor so that the magnitude of the gradients is not overly sensitive to the varying scaling factor, which also helps training (also shown in the first sketch below).
  3. Yes. In theory, you could find an optimal temperature for each distribution by minimizing the Monte Carlo pose loss during training (which would introduce some overhead; see the second sketch below). But then you won't be able to predict the pose uncertainty during inference (although the optimal pose is not affected at all).
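
To make points 1 and 2 concrete, here is a minimal PyTorch sketch (an illustrative simplification, not the repository's exact code): the per-point 2D weights are softmax-normalized, a learned global scale recovers their absolute magnitude (the temperature), and the Monte Carlo pose loss is divided by a detached copy of that scale so gradient magnitudes stay roughly invariant to its current value. All class and function names here are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalScaleHead(nn.Module):
    """Toy weight head: per-point 2D weight logits plus a learned global scale."""

    def __init__(self, in_channels, num_points):
        super().__init__()
        self.logits = nn.Linear(in_channels, num_points * 2)
        # learned global scale (temperature), stored in log space so it stays positive
        self.log_scale = nn.Parameter(torch.zeros(2))

    def forward(self, feat):
        # feat: (batch, in_channels) -> raw logits (batch, num_points, 2)
        logits = self.logits(feat).view(feat.size(0), -1, 2)
        # softmax over the points keeps training stable but loses the absolute scale
        w2d_norm = F.softmax(logits, dim=1)
        # the global scale recovers that absolute scale, i.e. the concentration
        # (entropy) of the resulting pose distribution
        scale = self.log_scale.exp()
        w2d = w2d_norm * scale
        return w2d, scale


def normed_mc_loss(mc_pose_loss, scale):
    # dividing by a detached copy of the scale keeps gradient magnitudes
    # roughly insensitive to the current value of the learned temperature
    norm_factor = scale.mean().detach()
    return mc_pose_loss / norm_factor
```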
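
For point 3, a hypothetical sketch of what "finding the temperature by minimizing the Monte Carlo pose loss" could look like: a per-sample grid search over candidate temperatures. `mc_pose_loss` is a placeholder callable (not the repository's API) that samples poses and evaluates the loss for the given weighted correspondences. Note that it needs the ground-truth pose, which is why such a per-sample temperature cannot provide uncertainty estimates at inference time, and the repeated loss evaluations are the overhead mentioned above.

```python
import torch


def best_temperature(x3d, x2d, w2d_norm, pose_gt, mc_pose_loss,
                     temps=torch.logspace(-2, 2, steps=17)):
    """Pick the temperature that minimizes the MC pose loss for one sample."""
    best_t, best_loss = None, float('inf')
    for t in temps:
        # scale the softmax-normalized weights by the candidate temperature
        loss = mc_pose_loss(x3d, x2d, w2d_norm * t, pose_gt)
        if loss.item() < best_loss:
            best_t, best_loss = float(t), loss.item()
    return best_t, best_loss
```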