yxgeee / OpenIBL

[ECCV-2020 (spotlight)] Self-supervising Fine-grained Region Similarities for Large-scale Image Localization. 🌏 PyTorch open-source toolbox for image-based localization (place recognition).
https://yxgeee.github.io/projects/sfrs
MIT License

About reproduction #2

Closed: zhangpj closed this issue 4 years ago

zhangpj commented 4 years ago

Hi, thank you for sharing this project. Good job! I tried to run it, but a few things confuse me.

1. When running `train_sfrs_dist.sh`, Loss_hard and Loss_soft look like the log below, i.e. Loss_hard << soft-weight (0.5) * Loss_soft. Does Loss_hard therefore make only a small, perhaps negligible, contribution? Also, Loss_soft does not seem to converge. Have you seen a similar phenomenon when training the network? (See also the sketch after the log.)

```
Epoch: [4-7][160/320]   Time 0.672 (0.674)  Data 0.069 (0.077)  Loss_hard 0.018 (0.052) Loss_soft 1.749 (2.275)
Epoch: [4-7][170/320]   Time 0.670 (0.672)  Data 0.065 (0.076)  Loss_hard 0.041 (0.050) Loss_soft 2.780 (2.272)
Epoch: [4-7][180/320]   Time 0.671 (0.671)  Data 0.063 (0.075)  Loss_hard 0.015 (0.049) Loss_soft 1.535 (2.251)
Epoch: [4-7][190/320]   Time 0.665 (0.670)  Data 0.063 (0.074)  Loss_hard 0.005 (0.049) Loss_soft 1.572 (2.239)
Epoch: [4-7][200/320]   Time 0.666 (0.669)  Data 0.060 (0.073)  Loss_hard 0.019 (0.048) Loss_soft 2.144 (2.230)
Epoch: [4-7][210/320]   Time 0.667 (0.668)  Data 0.063 (0.073)  Loss_hard 0.022 (0.049) Loss_soft 2.122 (2.247)
Epoch: [4-7][220/320]   Time 0.658 (0.668)  Data 0.055 (0.072)  Loss_hard 0.005 (0.048) Loss_soft 1.374 (2.239)
Epoch: [4-7][230/320]   Time 0.504 (0.667)  Data 0.047 (0.071)  Loss_hard 0.028 (0.047) Loss_soft 1.855 (2.239)
Epoch: [4-7][240/320]   Time 0.665 (0.667)  Data 0.061 (0.071)  Loss_hard 0.201 (0.048) Loss_soft 3.224 (2.247)
Epoch: [4-7][250/320]   Time 0.668 (0.666)  Data 0.063 (0.070)  Loss_hard 0.001 (0.047) Loss_soft 1.920 (2.239)
Epoch: [4-7][260/320]   Time 0.660 (0.666)  Data 0.068 (0.070)  Loss_hard 0.037 (0.047) Loss_soft 2.350 (2.240)
Epoch: [4-7][270/320]   Time 0.658 (0.666)  Data 0.062 (0.069)  Loss_hard 0.068 (0.047) Loss_soft 3.046 (2.240)
Epoch: [4-7][280/320]   Time 0.717 (0.668)  Data 0.060 (0.069)  Loss_hard 0.019 (0.048) Loss_soft 2.411 (2.233)
Epoch: [4-7][290/320]   Time 0.693 (0.669)  Data 0.060 (0.068)  Loss_hard 0.096 (0.048) Loss_soft 3.048 (2.247)
Epoch: [4-7][300/320]   Time 0.669 (0.670)  Data 0.059 (0.068)  Loss_hard 0.091 (0.049) Loss_soft 3.546 (2.255)
Epoch: [4-7][310/320]   Time 0.669 (0.670)  Data 0.064 (0.068)  Loss_hard 0.014 (0.049) Loss_soft 2.299 (2.247)
Epoch: [4-7][320/320]   Time 0.629 (0.669)  Data 0.026 (0.067)  Loss_hard 0.057 (0.048) Loss_soft 3.039 (2.261)
```
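For concreteness, a minimal sketch of the weighted combination being asked about, assuming the total loss is Loss_hard + soft-weight * Loss_soft; the variable names are illustrative, and the values are the running averages at the end of the log:

```python
# Minimal sketch (not the repo's actual code): how a soft weight of 0.5
# combines the two loss terms. Values are the final running averages
# from the log above.
soft_weight = 0.5
loss_hard = 0.048  # running average of Loss_hard
loss_soft = 2.261  # running average of Loss_soft

total_loss = loss_hard + soft_weight * loss_soft
print(total_loss)              # 1.1785
print(loss_hard / total_loss)  # ~0.041, i.e. the hard term is ~4% of the total
```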

2. The results on Pitts250k for the best model in my reproduction are slightly lower than those in your paper: 89.8% / 95.9% / 97.3% vs. 90.7% / 96.4% / 97.6% (Recall@1/5/10). The best model in my reproduction is the output of the 5th epoch of the third generation, rather than converging at the fourth generation as mentioned in the paper. Is your best model the output of the last iteration of training?

3. I used only one GPU (a 2080 Ti); the other parameters are the defaults. I don't know whether the inferior results are due to too few GPUs, or whether there is something else I need to pay attention to.

yxgeee commented 4 years ago
  1. The losses look normal; convergence can be slow in the later epochs.
  2. The best model selected by validation results may not achieve optimal performance on the test set. The model reported in the paper was taken from the last epoch of the 4th generation. Since training involves some randomness, I recommend testing the five checkpoints of the last generation and choosing the best-performing one.
  3. If you use the default settings on one GPU, only one triplet is used in each mini-batch. Try modifying `--tuple-size` in the training scripts to use more triplets per GPU. In my experiments I used 4 GPUs with one triplet each, so each batch contains 4 triplets. If GPU memory is not enough for 4 triplets on a single 2080 Ti, you may need to decrease the learning rate to match your smaller batch size (see the sketch below).
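As a rough illustration of the last point, a hypothetical sketch (not part of OpenIBL; the base learning rate below is an assumption, so check `train_sfrs_dist.sh` for the real value) of scaling the learning rate linearly with the effective batch size:

```python
# Hypothetical helper (not part of OpenIBL): linear learning-rate scaling
# when the effective batch size (triplets per GPU x number of GPUs) changes.
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate in proportion to the effective batch size."""
    return base_lr * new_batch / base_batch

paper_batch = 4 * 1       # paper setup: 4 GPUs x 1 triplet each
single_gpu_batch = 1 * 2  # single 2080 Ti with --tuple-size 2

base_lr = 0.001  # assumed default; check the training script for the real value
print(scaled_lr(base_lr, paper_batch, single_gpu_batch))  # 0.0005
```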
zhangpj commented 4 years ago

@yxgeee All right, thanks for your suggestions. I will give it a try.

zhangpj commented 4 years ago

@yxgeee Hi, in your paper you also evaluated SFRS on the Oxford 5k, Paris 6k, and Holidays datasets. Could you share the source code for evaluating SFRS on those datasets, or tell me how you evaluate SFRS on them?

yxgeee commented 4 years ago

A colleague helped me test SFRS on the retrieval datasets, and I may merge that code into this repo after reorganizing it. We strictly follow the same settings (e.g., image size, augmentation) as SARE and NetVLAD, so you can also refer to their code for the evaluation details.
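Until that code is merged, a minimal sketch of the ranking step that protocol shares, assuming L2-normalized global descriptors compared by inner product; `query_feats` and `db_feats` are hypothetical (num_images x dim) arrays from the trained model, and feature extraction plus the dataset-specific mAP scoring (e.g., the official Oxford/Paris evaluation) are not shown:

```python
import numpy as np

def rank_database(query_feats: np.ndarray, db_feats: np.ndarray) -> np.ndarray:
    """Rank database images for each query by cosine similarity."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = q @ db.T                   # (num_queries, num_db) similarities
    return np.argsort(-sims, axis=1)  # database indices, best match first
```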

zhangpj commented 4 years ago

Ok, thank you.