Closed mhmd-mst closed 1 year ago
Hello, I am working on datasets for visual geolocalization and want to use RRT on them. Is it possible to use descriptors other than DELG? Does the variable src_positions refer to the keypoints, and is src_masks the attention mask? If so, what is the variable attention, and why wasn't it used?
Hi,
Yes, it is possible to use other descriptors, especially in-domain ones, i.e. descriptors pretrained for visual geolocalization in your case.
By src_positions and src_masks, I would guess you're referring to the variables presented in this line: https://github.com/uvavision/RerankingTransformer/blob/c198e7e351d49a13260392b56df6b171653bb393/RRT_GLD/models/matcher.py#L33
Here, src_positions are the x, y coordinates of each descriptor.
Also, since each image may have a different number of descriptors, training and inference with mini-batches require telling the model how many descriptors of each image should be attended to. That's why we provide variables like src_masks. You can check how these variables are created in this function: https://github.com/uvavision/RerankingTransformer/blob/c198e7e351d49a13260392b56df6b171653bb393/RRT_GLD/utils/data/dataset.py#L32
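To illustrate the padding idea described above, here is a minimal pure-Python sketch. It is not the repository's code: the function name pad_local_features, the max_len parameter, and the mask convention (True marks a padded slot that should be ignored) are all assumptions made for this example, so please check the linked dataset function for the actual behavior.

```python
from typing import List, Sequence, Tuple


def pad_local_features(
    descriptors: Sequence[Sequence[float]],
    positions: Sequence[Tuple[float, float]],
    max_len: int,
) -> Tuple[List[List[float]], List[Tuple[float, float]], List[bool]]:
    """Pad (or truncate) one image's local features to a fixed length.

    Hypothetical helper, not taken from the RRT repository.
    Assumed mask convention: True = padded slot (ignore), False = real descriptor.
    """
    if len(descriptors) != len(positions):
        raise ValueError("each descriptor needs exactly one (x, y) position")

    n = min(len(descriptors), max_len)  # number of real descriptors kept
    dim = len(descriptors[0]) if descriptors else 0

    padded_desc = [list(d) for d in descriptors[:n]]
    padded_pos = [(float(x), float(y)) for x, y in positions[:n]]
    mask = [False] * n

    # Fill the remaining slots with zeros and flag them as padding.
    pad = max_len - n
    padded_desc += [[0.0] * dim for _ in range(pad)]
    padded_pos += [(0.0, 0.0)] * pad
    mask += [True] * pad
    return padded_desc, padded_pos, mask


# Example: an image with 2 descriptors of dimension 2, padded to length 4.
descs = [[0.1, 0.2], [0.3, 0.4]]
pos = [(10.0, 20.0), (30.0, 40.0)]
padded_desc, padded_pos, mask = pad_local_features(descs, pos, max_len=4)
```

With every image padded to the same length, the descriptors can be stacked into a mini-batch, and the mask tells the model which slots hold real descriptors. The same scheme applies to any descriptor type, as long as each descriptor comes with its x, y position.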