BinuxLiu closed this issue 4 months ago
Are there any experimental results on Tokyo 24/7 or SF-XL? Is it because the high dimensionality causes memory overflow?
Hello @BinuxLiu,
Thank you for your interest!
I will run experiments on the SF-XL dataset and report the results once the tests are completed. I did not have any memory problems when I tested on San Francisco (more than 1M images). 12288 is still about 3x smaller than NetVLAD's 32,768 dimensions.
I currently do not have the Tokyo dataset on my computer. I will need to ask the authors for the download link, and I may add it later.
Best, Amar
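For readers wondering about the memory question above, here is a back-of-envelope sketch (assuming dense float32 descriptors; actual figures depend on the precision and index structure used):

```python
def descriptor_memory_gb(num_images, dim, bytes_per_value=4):
    """Approximate RAM needed to hold all database descriptors (float32)."""
    return num_images * dim * bytes_per_value / 1024**3

# 1M-image database: 12288-dim (DinoV2-BoQ) vs. 32,768-dim (NetVLAD)
print(round(descriptor_memory_gb(1_000_000, 12288), 1))   # 45.8 GB
print(round(descriptor_memory_gb(1_000_000, 32768), 1))   # 122.1 GB
```

So a million 12288-dim descriptors fit in the RAM of a large workstation, while the 32,768-dim equivalent is far less comfortable, which is consistent with the answer above.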
Thank you for your answer.
Awesome!
2024-07-12 22:53:09 Test set: < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >
2024-07-12 22:59:39 Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 98.10, R@5: 98.10, R@10: 98.73, R@100: 99.68
Hello @BinuxLiu,
Thank you for taking the time to test on Tokyo247. Are these results with DinoV2-BoQ, with images resized to 322x322? If so, may I put these numbers on the README?
2024-07-12 23:16:09 Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 96.51, R@5: 97.78, R@10: 98.41, R@100: 100.00
2024-07-12 23:16:09 Finished in 0:02:59
These are the Tokyo247 results with DinoV2-BoQ, with images resized to 322x322. Of course.
Nice, thank you. Could you please tell me which model generated these results?
> Awesome!
> 2024-07-12 22:53:09 Test set: < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >
> 2024-07-12 22:59:39 Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 98.10, R@5: 98.10, R@10: 98.73, R@100: 99.68
I used adaptive resolution to evaluate the dataset, also with the DINO-BoQ model, to get the first set of results. The resolution has a great impact on the results on the Tokyo247 dataset. You are welcome. I am also very grateful for your previous answer. Your work is very inspiring to me.
```python
import torch
from torchvision import transforms


class VPRModel(torch.nn.Module):
    def __init__(self, backbone, aggregator):
        super().__init__()
        self.backbone = backbone
        self.aggregator = aggregator

    def forward(self, x):
        if not self.training:
            # At inference, snap H and W to the nearest multiple of 14
            # (the DINOv2 patch size), keeping the original aspect ratio.
            b, c, h, w = x.shape
            h = round(h / 14) * 14
            w = round(w / 14) * 14
            x = transforms.functional.resize(x, [h, w], antialias=True)
        x = self.backbone(x)
        x, attns = self.aggregator(x)
        return x, attns
```
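For reference, the eval-time rounding in the snippet above snaps each side to the nearest multiple of the DINOv2 patch size (14), so a standard 480x640 Tokyo247 query becomes 476x644, while 322x322 passes through unchanged. A torch-free sketch of just that arithmetic:

```python
def snap_to_patch_grid(h, w, patch=14):
    """Round H and W to the nearest multiple of the ViT patch size,
    approximately preserving the aspect ratio (as in the forward above)."""
    return round(h / patch) * patch, round(w / patch) * patch

print(snap_to_patch_grid(480, 640))   # (476, 644)
print(snap_to_patch_grid(322, 322))   # (322, 322) -- already a multiple of 14
```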
@BinuxLiu,
Okay, I see, thanks. Tokyo247 has high-resolution queries with varying image sizes, so it makes sense that performance improves when maintaining the original aspect ratio (we discussed this aspect in the Supplementary).
I'm glad you enjoyed our work and find it useful :) By the way, I will be launching a new framework for VPR in the coming days. It'd be great to get some feedback. I'll keep in touch.
Best, Amar.
Hi @amaralibey, sorry, I forgot to test the performance of ResNet-50 last time. At a resolution of 322x322:
Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 90.48, R@5: 94.29, R@10: 96.51, R@100: 97.78
At the original resolution:
Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 94.29, R@5: 96.51, R@10: 96.51, R@100: 98.41
You are welcome to use my results; I don't think there should be any errors. If experimental results on the SF-XL datasets become available, please let me know. Tip: save the features of the SF-XL database during the first run to speed up experiments with multiple query sets.
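The caching tip above can be sketched roughly as follows (the cache path and extraction function are hypothetical stand-ins, not part of any repository here):

```python
import os

import numpy as np


def get_database_descriptors(extract_fn, cache_path="sfxl_db_descriptors.npy"):
    """Extract database descriptors once, then reuse the cached copy
    on subsequent runs against different query sets."""
    if os.path.exists(cache_path):
        return np.load(cache_path)
    descriptors = extract_fn()          # expensive: one forward pass per image
    np.save(cache_path, descriptors)
    return descriptors
```

With the database descriptors cached, evaluating additional SF-XL query sets only requires extracting the (much smaller) query descriptors.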
Hi, for ResNet-50 I get results of R@1: 90.8, R@5: 95.6, R@10: 96.5 at a resolution of 384x384. Am I testing something wrong? @BinuxLiu
Hi, @LKELN I think your results are consistent with mine. First, different machines have some (very small) influence. Second, the denominator used to calculate recall on Tokyo247 is smaller than on other datasets, which makes differences easier to notice. If you compare the detailed results, the difference is reasonable.
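The point about the small denominator can be made concrete: with only 315 Tokyo247 queries, a single query flipping between correct and incorrect moves any recall figure by about 0.32 points, so a gap like 90.48 vs. 90.8 at R@1 corresponds to roughly one query:

```python
queries = 315                 # Tokyo247 query-set size
step = 100 / queries          # one query is worth ~0.317 recall points
print(round(step, 3))         # 0.317

# implied number of correct top-1 retrievals behind each reported R@1
print(round(90.48 / step))    # 285 correct queries
print(round(90.8 / step))     # 286 correct queries
```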
You may not have noticed that I'm reporting results at two resolutions.
I agree that slight fluctuations are normal. Is the original resolution you mention 480x640? If your original resolution is also 384x384, then I think the fluctuation is anomalous.
Hi, @amaralibey BoQ is a wonderful work. My questions are: 1) What is the feature dimension of BoQ with DINOV2 used in the experiment reported in README? 2) What is the feature dimension in Table 3 of your paper? Looking forward to your reply!