Open aaiguy opened 1 year ago
I wonder why this happens. Shouldn't DINOv2 be rotationally invariant?
Why?
I read that vision transformers are rotationally invariant, so I assumed DINOv2 would be the same.
I don't think invariance to rotation is necessarily a desirable feature. That would depend on the application. If you need this invariance, you could probably solve the issue by:
On a side-note:
Or did you mean that the retrieval system fails with 180° rotation but does well with vertical flips? That is a bit weird.
I meant an upside-down image, i.e. one rotated by 180°. I've edited my question. Thanks for pointing that out.
I am currently developing an image retrieval system to search for similar images within a large dataset. When I query the trained model with a regular image, I get back images that are almost identical or closely related to the query image. However, when I run the same query image after rotating it upside down, I get completely dissimilar images that do not resemble the query image.
[Images: query image in normal view; similar image for normal view]
[Images: query image for rotated view; similar image for rotated view]
Testing with vertical flips and small rotations of the image yields more relevant similar images, but the results degrade significantly when the query image is rotated by a full 180°. I wonder why this happens. Shouldn't DINOv2 be rotationally invariant?
If you want to try this yourself, you can go to this link
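For anyone who does need rotation invariance at retrieval time, one common workaround is to average the embedding over rotated copies of the image. The sketch below is only an illustration of that idea, not something proposed in this thread: `toy_embed` is a made-up stand-in for a real feature extractor (it is not DINOv2), and `rot90`, `rotation_pooled_embed`, and `cosine` are hypothetical helper names. It uses plain nested lists so it runs without any dependencies.

```python
import math


def rot90(img):
    """Rotate a 2D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def toy_embed(img):
    """Stand-in feature extractor (NOT DINOv2).

    It weights pixels by their row and column index, so its output
    changes when the image is rotated.
    """
    h, w = len(img), len(img[0])
    return [
        sum(img[r][c] * (r + 1) for r in range(h) for c in range(w)),
        sum(img[r][c] * (c + 1) for r in range(h) for c in range(w)),
    ]


def rotation_pooled_embed(img, embed=toy_embed):
    """Average the embedding over the 0/90/180/270 degree rotations."""
    views = [img]
    for _ in range(3):
        views.append(rot90(views[-1]))
    vecs = [embed(v) for v in views]
    return [sum(vals) / len(vecs) for vals in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
upside_down = rot90(rot90(image))  # 180 degree rotation

raw_a, raw_b = toy_embed(image), toy_embed(upside_down)
pooled_a = rotation_pooled_embed(image)
pooled_b = rotation_pooled_embed(upside_down)

print("raw embeddings equal:   ", raw_a == raw_b)
print("pooled embeddings equal:", pooled_a == pooled_b)
print("pooled cosine similarity:", cosine(pooled_a, pooled_b))
```

The pooled descriptor is identical for an image and its 180° rotation because both produce the same set of four rotated views. This only covers multiples of 90°, and the same pooling would have to be applied to the indexed images as well as the query. With a real model, `embed` would be replaced by whatever function returns the image embedding.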