Understanding the normalization in Lightglue matcher

xuelunshen / gim

GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)

MIT License

Hi, great work and amazing results!

I was going through the LightGlue matching code, trying to understand its API and performance, and noticed that the image sizes are reversed in the matcher code:

        size0 = data["resize0"][:, [1, 0]]
        size1 = data["resize1"][:, [1, 0]]
        kpts0 = normalize_keypoints(kpts0, size0).clone()
        kpts1 = normalize_keypoints(kpts1, size1).clone()

This appears to swap the axes used for normalization, so coordinates greater than 1 are fed into the positional encoding. Was this done on purpose, and if so, why? If it is a bug, was the same input used during training?
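To make the concern concrete, here is a minimal, dependency-free sketch of what I mean. It is not the repository's code: the `normalize_keypoints` below is a hypothetical stand-in that I assume behaves like the upstream LightGlue helper (center on `size / 2`, divide by half the longer side), the keypoints are assumed to be in `(x, y)` pixel order, and the 640x480 image size is just an example.

```python
def normalize_keypoints(kpts, size):
    """Center keypoints on size / 2 and scale by half the longer side.

    Hypothetical stand-in, assumed to mirror the upstream LightGlue helper.
    `kpts` is a list of (x, y) pixel coordinates and `size` is expected
    in (width, height) order.
    """
    shift = (size[0] / 2, size[1] / 2)
    scale = max(size) / 2
    return [((x - shift[0]) / scale, (y - shift[1]) / scale) for x, y in kpts]


# Assumed example: a 640x480 image, with keypoints at its four corners.
width, height = 640, 480
corners = [(0, 0), (width, 0), (0, height), (width, height)]

# Size passed in the same axis order as the keypoints.
expected = normalize_keypoints(corners, (width, height))
# Size passed with the axes swapped, which is the case I am asking about.
swapped = normalize_keypoints(corners, (height, width))

max_abs_expected = max(abs(v) for pt in expected for v in pt)
max_abs_swapped = max(abs(v) for pt in swapped for v in pt)

print(max_abs_expected, max_abs_swapped)  # 1.0 1.25
```

Under these assumptions the scale is unaffected, since it depends only on the longer side, but the shift is applied to the wrong axis, so the x coordinates land in [-0.75, 1.25] instead of [-1, 1]. Whether this matches the actual behavior depends on the order in which `data["resize0"]` stores its two values, which I could not confirm.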

Understanding the normalization in Lightglue matcher #13