[CVPR 2024] RoMa: Robust Dense Feature Matching; RoMa is the robust dense feature matcher capable of estimating pixel-dense warps and reliable certainties for almost any image pair.
1. (and 3.) You are correct: Eq. 16 is correct, but we forgot to flip the sign for the KL divergence. Thanks.
The value we mention in the paper is in pixels, while the codebase is in normalized coordinates. At a resolution of 560 we get $\frac{560}{2} \cdot 10^{-4} \approx 0.03$.
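For anyone who wants to check the conversion, here is a minimal sketch. It assumes the normalized coordinates span [-1, 1], so a width of 2 covers the full image resolution; the helper name `normalized_to_pixels` is made up for illustration and is not part of the codebase.

```python
def normalized_to_pixels(value_normalized: float, resolution: int) -> float:
    """Convert a length in normalized coordinates to pixels.

    Assumption: the normalized range [-1, 1] (width 2) spans
    `resolution` pixels, so one normalized unit is resolution / 2 pixels.
    """
    return value_normalized * resolution / 2


# 1e-4 in normalized coordinates at a resolution of 560 pixels
c_pixels = normalized_to_pixels(1e-4, 560)
print(f"{c_pixels:.3f}")  # prints 0.028, i.e. roughly 0.03
```

At other resolutions the same normalized value corresponds to a different pixel value, since the factor is `resolution / 2`.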
Dear authors, I noticed there may be some discrepancies between Eq. (18) and the implementation:
Compared to what is defined in Eq.(18) of the paper:
there is an extra power term `**2` on the `cs` term.
Could you kindly help to clarify? Thanks!