crockwell / far

[CVPR 2024 - Highlight] FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation
https://crockwell.github.io/far/

loss function input #4

Closed: wuqun-tju closed this issue 4 months ago

wuqun-tju commented 5 months ago

Hello,

In the code here, the input to rot_loss (data['R']) is 6 x 1, but here the input R is 3x3, so the two do not seem to have the same dimensions. I would appreciate your response!

crockwell commented 5 months ago

The code runs, no? I think it is transformed from a rotation matrix to 6D coordinates and back at some point. Check out https://arxiv.org/abs/1812.07035 if you're curious about the differences.

wuqun-tju commented 5 months ago

Thank you. I found that the config rot6d_trans_with_loftr.yaml sets 'rot_6d_loss', rather than the rot_frobenius_loss in default.yaml.

wuqun-tju commented 5 months ago

I read the paper you posted. In the paper, the third column of R can be dropped, but in your code the third row of R is dropped. Is the reason perhaps that R.T = R? Could you explain this difference for me? Thanks!

wuqun-tju commented 5 months ago

Sorry for another question: the rot_6D_loss function just computes the l1_loss between the 6D vectors of Rgt and R_6d. Does that make sense? Could you help me understand it? Thank you very much.

crockwell commented 4 months ago

Agreed -- the 6D loss and converting to a rotation matrix from only the first two rows (hence the "6D") is the recipe prescribed in https://arxiv.org/abs/1812.07035

To summarize, a 6D representation is continuous and can be mapped to a rotation matrix, which makes optimization easier than training directly on a nonlinear output such as quaternions. If you're curious about a more in-depth analysis of their method, I'd recommend reaching out to the authors of the original work.
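
For readers following the thread, here is a minimal sketch of the idea discussed above: flatten a rotation matrix to 6D by keeping its first two rows, map a 6D vector back to a rotation matrix with Gram-Schmidt, and compare two 6D vectors with an L1 loss. This is not the FAR implementation; the function names (`matrix_to_6d`, `sixd_to_matrix`, `l1_loss_6d`) are hypothetical, and plain Python lists are used instead of tensors so the example stays self-contained.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return [x / n for x in v]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matrix_to_6d(R):
    """Keep the first two rows of a 3x3 rotation matrix (row convention, assumed here)."""
    return list(R[0]) + list(R[1])

def sixd_to_matrix(v):
    """Map any 6D vector to a rotation matrix via Gram-Schmidt on its two halves."""
    a1, a2 = v[:3], v[3:]
    b1 = _normalize(a1)                       # first row: normalize a1
    proj = _dot(b1, a2)
    b2 = _normalize([x - proj * y for x, y in zip(a2, b1)])  # remove the b1 component
    b3 = _cross(b1, b2)                       # third row is fixed by the first two
    return [b1, b2, b3]

def l1_loss_6d(pred, target):
    """Mean absolute difference between two 6D vectors."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

# A 90-degree rotation about the z axis survives the round trip.
R_gt = [[0.0, -1.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0]]
R_back = sixd_to_matrix(matrix_to_6d(R_gt))
```

Because both the rows and the columns of a rotation matrix are orthonormal, the third row can be recovered from the first two just as the third column can be recovered from the first two columns. Keeping rows instead of columns amounts to applying the same construction to the transpose; it does not require R.T = R.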