The reason for doing it this way here is to maintain consistency with the operations used in training. If I remember correctly, this also follows the operations of a version of gluefactory.
Hi, great work and amazing results!
I was going through the LightGlue matching code, trying to understand its API and performance, and noticed reversed image sizes in the matcher code:
This in turn mismatches the axes during normalization and produces coordinates greater than 1 as input to the positional encoding. Was this done on purpose and, if so, why? If it is a bug, was similar input used during training?
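To illustrate the concern, here is a minimal sketch of center-and-scale keypoint normalization (not the actual LightGlue source; the function and the example image size are assumptions for illustration). With a landscape image, passing the size as `(height, width)` instead of `(width, height)` shifts the keypoints by the wrong amount along each axis, so points near the right edge end up with normalized coordinates above 1:

```python
import numpy as np

def normalize_keypoints(kpts, size):
    """Shift keypoints to the image center and scale by half the
    larger side, so coordinates land roughly in [-1, 1].
    `size` is expected in (width, height) order."""
    size = np.asarray(size, dtype=np.float32)
    shift = size / 2.0
    scale = size.max() / 2.0
    return (kpts - shift) / scale

# hypothetical landscape image: width 640, height 480
kpts = np.array([[630.0, 10.0]])  # a keypoint near the right edge

ok = normalize_keypoints(kpts, (640, 480))       # correct (w, h) order
swapped = normalize_keypoints(kpts, (480, 640))  # reversed (h, w) order

print(ok)       # x stays within [-1, 1]
print(swapped)  # x exceeds 1 because the shift used the height
```

With the correct order, x normalizes to (630 - 320) / 320 ≈ 0.97; with the reversed order, the shift along x is only 240, giving (630 - 240) / 320 ≈ 1.22, i.e. a coordinate outside the range the positional encoding presumably expects.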