gmberton / deep-visual-geo-localization-benchmark

Official code for CVPR 2022 (Oral) paper "Deep Visual Geo-localization Benchmark"
MIT License

About the reproduced results on the tokyo 247 dataset #3

Closed BinuxLiu closed 2 years ago

BinuxLiu commented 2 years ago

Hi, gmberton! I trained a model using the pitts30k dataset and evaluated it on the tokyo247 dataset. The output is as follows:

2022-09-15 09:26:21   Calculating recalls
2022-09-15 09:27:12   Recalls on < BaseDataset, tokyo247 - #database: 75984; #queries: 315 >: R@1: 55.2, R@5: 69.5, R@10: 75.2, R@20: 77.5
2022-09-15 09:27:12   Finished in 0:04:37

Here are the key parameters I used: aggregation='netvlad', backbone='resnet18conv4', mining='partial', train_batch_size=16. I suspect that changing the batch size is what caused the difference in the results.

So I downloaded the model you provided and ran:

python3 eval.py --backbone=resnet18conv4 --aggregation=netvlad --resume=logs/pretrained/pitt_r18l3_netvlad_full.pth --dataset_name=tokyo247

Then I got this error:

Traceback (most recent call last):
  File "eval.py", line 89, in <module>
    state_dict = torch.load(args.resume)["model_state_dict"]
KeyError: 'model_state_dict'

Did I type something wrong in the terminal?

I also found a small error in README.md. In the "Pretrained networks employing different backbones" and "Pretrained models with different mining methods" tables, some of the results seem to be mismatched (see the attached screenshots).

ga1i13o commented 2 years ago

Hello, and thank you for your interest and for reporting the error. The checkpoint-loading error is due to the fact that we provide the model_state_dict directly, rather than an object containing it as in the checkpoints generated during training. I have now pushed a commit fixing this issue, so you should be able to load both kinds of checkpoints.
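The idea behind the fix is roughly the following (a simplified sketch of a loader that accepts both formats, not necessarily the exact code of the commit):

```python
import torch

def load_checkpoint(resume_path, model):
    # Illustrative sketch; the actual code in the repository may differ.
    checkpoint = torch.load(resume_path, map_location="cpu")
    if "model_state_dict" in checkpoint:
        # Checkpoint saved during training: the weights are wrapped in a dict.
        state_dict = checkpoint["model_state_dict"]
    else:
        # Released pretrained file: the loaded object is the state_dict itself.
        state_dict = checkpoint
    model.load_state_dict(state_dict)
    return model
```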

Regarding the mismatched results in the tables: thank you for spotting them; we will check and fix any mistakes.

BinuxLiu commented 2 years ago

Thank you very much!

BinuxLiu commented 2 years ago

Hi, I evaluated two models, but I got exactly the same recalls:

python3 eval.py --resume=logs/pretrained/pitt_r18l3_netvlad_partial.pth --dataset_name=tokyo247
python3 eval.py --resume=logs/pretrained/pitt_r18l3_netvlad_full.pth --dataset_name=tokyo247

So I'm guessing they are the same model?

ga1i13o commented 2 years ago

Thanks for noticing; we will check and upload the correct one. In the meantime, you can reproduce the results by training with a train_batch_size of 4 triplets.
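For example, a training command along these lines should correspond to that setup (assuming the training entry point is train.py and reusing the flag names that appear earlier in this thread; adjust --mining to the method you want to reproduce):

```
python3 train.py --dataset_name=pitts30k --backbone=resnet18conv4 --aggregation=netvlad --mining=partial --train_batch_size=4
```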

BinuxLiu commented 2 years ago

Yeah, I will.

MAX-OTW commented 2 years ago

Hello @ga1i13o, I really appreciate your great work. At test time I would like to obtain, for each query in the test dataset, a list of the top-N retrieved images, so as to verify the retrieval ability of the algorithms mentioned in the paper. Is it possible to add a small piece of code on top of this benchmark to achieve this? What should I do? Can you give me some suggestions? I am looking forward to your kind response. Best regards.

BinuxLiu commented 2 years ago

@MAX-OTW I think you just need to modify test.py; a rough sketch is below. PS: Please open a new issue, as I get an email notification for every reply under this one. Thank you!
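For illustration, a change along these lines in test.py could dump the top-N retrieved database images for every query. The names predictions, eval_ds.queries_paths and eval_ds.database_paths are assumptions about what test.py exposes internally, so they may need adjusting:

```python
def save_top_n_predictions(eval_ds, predictions, output_file, n=5):
    # predictions[q] is assumed to hold the database indices retrieved for
    # query q, sorted from best to worst match (e.g. from the faiss search).
    with open(output_file, "w") as f:
        for query_index, preds in enumerate(predictions):
            query_path = eval_ds.queries_paths[query_index]
            retrieved_paths = [eval_ds.database_paths[i] for i in preds[:n]]
            f.write(query_path + " -> " + " ; ".join(retrieved_paths) + "\n")
```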

gmberton commented 2 years ago

Hi @MAX-OTW, I'm preparing a script to visualize the predictions. Please open a new issue and I'll post it there, so that others will be able to find it as well. Thanks!