xuelunshen / gim

GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
https://xuelunshen.com/gim
MIT License

Understanding the normalization in Lightglue matcher #13

Open stepstefan opened 3 months ago

stepstefan commented 3 months ago

Hi, great work and amazing results!

I was going through the LightGlue matching code, trying to understand its API and performance, and noticed that the image sizes are reversed in the matcher code:

        size0 = data["resize0"][:, [1, 0]]
        size1 = data["resize1"][:, [1, 0]]
        kpts0 = normalize_keypoints(kpts0, size0).clone()
        kpts1 = normalize_keypoints(kpts1, size1).clone()

This in turn mismatches the axes used for normalization and produces coordinates greater than 1 as input to the positional encoding. Was this done on purpose, and if so, why? If it is a bug, was the same input used during training?
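
To make the concern concrete, here is a minimal sketch in plain Python. It assumes that normalize_keypoints centers keypoints by half the image size and scales by half the longer side (the LightGlue-style convention), and that keypoints are (x, y) pairs. The function below is a hypothetical re-implementation for illustration, not the code from this repository, and the 640x480 image size is made up.

```python
def normalize_keypoints(kpts, size):
    """Center by half the size, then scale by half the longer side.

    Hypothetical stand-in for the real function (assumed convention):
    kpts is a list of (x, y) pairs and size is a 2-tuple that is
    expected to be in the same axis order as the keypoints.
    """
    shift = (size[0] / 2, size[1] / 2)
    scale = max(size) / 2
    return [((x - shift[0]) / scale, (y - shift[1]) / scale) for x, y in kpts]


# Illustrative image size (assumption): 640 wide, 480 high.
width, height = 640, 480

# Top-left and bottom-right corners as (x, y) keypoints.
corners = [(0, 0), (width, height)]

# Size given in the same order as the keypoints: (width, height).
matched = normalize_keypoints(corners, (width, height))

# Size given in reversed order: (height, width).
swapped = normalize_keypoints(corners, (height, width))

print("matched:", matched)
print("swapped:", swapped)
```

Under these assumptions, the matched order keeps both corners within [-1, 1], while the reversed order maps the bottom-right corner to x = 1.25. The scale is unchanged because it depends only on the longer side, so for a fixed image size the swap shifts each axis by a constant offset.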

xuelunshen commented 3 months ago

It is done this way to stay consistent with the operations used in training. If I remember correctly, this also follows the operations in a version of gluefactory.