Cannot reproduce the evaluation results on YFCC dataset

Uchan1996 commented 1 year ago

@feixue94 Hi, thank you for your great work! I want to reproduce the evaluation results on the YFCC dataset to understand your model. I followed the README instructions to run the evaluation on YFCC and got the results below. Why do they differ from the results in your paper? Could the difference come from some setting?

Evaluation Results (mean over 4000 pairs):	AUC@5	AUC@10	AUC@20	AUC@50	Prec	MScore	Mkpts	Ikpts
29.95	48.95	66.77	82.80	86.06	13.82

It 1 with 0.00 It 2 with 0.00 It 3 with 0.00 It 4 with 0.00 It 5 with 0.00 It 6 with 0.00 It 7 with 0.00 It 8 with 0.00 It 9 with 0.00 It 10 with 0.00 It 11 with 0.00 It 12 with 0.00 It 13 with 0.00 It 14 with 0.00 It 15 with 1.00
100%|██████████
Results of model IMP on yfcc dataset (iterative: False, sinkhorn: True, uncertainty: False)
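
For context on the AUC@5/10/20/50 columns: in SuperGlue-style evaluations these are usually the area under the cumulative pose-error curve up to a threshold in degrees, normalized by that threshold. The sketch below is a pure-Python illustration of that common definition. Whether `eval.eval_imp` computes its AUC exactly this way is an assumption, and the function name `pose_auc` and the sample errors are illustrative only, not taken from the repository or from any real run.

```python
from bisect import bisect_left


def pose_auc(errors, thresholds):
    """Area under the recall-vs-error curve up to each threshold, divided by the threshold.

    errors: per-pair pose errors (assumed to be in degrees).
    thresholds: positive cut-offs, e.g. [5, 10, 20, 50].
    """
    errs = sorted(errors)
    n = len(errs)
    # Prepend the origin so the curve starts at (error=0, recall=0).
    e = [0.0] + errs
    r = [0.0] + [(i + 1) / n for i in range(n)]

    aucs = []
    for t in thresholds:
        # Index of the first curve point whose error is >= t.
        last = bisect_left(e, t)
        # Clip the curve at t, holding the recall reached so far.
        xs = e[:last] + [t]
        ys = r[:last] + [r[last - 1]]
        # Trapezoidal integration over the clipped curve.
        area = sum(
            (xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0
            for i in range(len(xs) - 1)
        )
        aucs.append(area / t)
    return aucs


# Made-up pose errors for four image pairs (illustration only).
example_errors = [1.0, 3.0, 7.0, 30.0]
example_aucs = pose_auc(example_errors, [5, 10])
```

Because the score integrates over all errors below the threshold, a shift in keypoint quality affects AUC@5 much more than AUC@50, which matches the pattern of the gap reported in this thread.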

feixue94 commented 1 year ago

Hi, I ran the evaluation script (python3 -m eval.eval_imp --matching_method IMP --dataset yfcc) again and got the following results:

Evaluation Results (mean over 4000 pairs):	AUC@5	AUC@10	AUC@20	AUC@50	Prec	MScore	Mkpts	Ikpts
38.87	58.78	74.79	87.54	87.20	23.56

It 1 with 0.00 It 2 with 0.00 It 3 with 0.00 It 4 with 0.00 It 5 with 0.00 It 6 with 0.00 It 7 with 0.00 It 8 with 0.00 It 9 with 0.00 It 10 with 0.00 It 11 with 0.00 It 12 with 0.00 It 13 with 0.00 It 14 with 0.00 It 15 with 1.00
100%|██████████| 4000/4000 [14:31<00:00, 4.59it/s]
Results of model IMP on yfcc dataset (iterative: False, sinkhorn: True, uncertainty: False)

feixue94 commented 1 year ago

I figured it out: I commented out the NMS in SuperPoint, and it is fixed now. You can extract the spp features for the YFCC dataset and run the evaluation again.
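
For readers unfamiliar with the step involved: keypoint non-maximum suppression (NMS) keeps only the strongest detection within a neighborhood, so switching it on or off changes which keypoints, and how many, reach the matcher. The sketch below is a generic greedy variant on a list of `(x, y, score)` tuples. It is not the SuperPoint or imp-release implementation (which, as far as I know, works on the dense score map), and the name `radius_nms`, the radius, and the sample points are hypothetical.

```python
def radius_nms(keypoints, radius):
    """Greedy NMS: keep a keypoint only if no stronger kept keypoint lies
    within `radius` pixels of it (Chebyshev distance, i.e. a square window).

    keypoints: iterable of (x, y, score) tuples.
    Returns the surviving keypoints, strongest first.
    """
    kept = []
    # Visit candidates from highest to lowest score.
    for x, y, score in sorted(keypoints, key=lambda k: k[2], reverse=True):
        far_from_all_kept = all(
            max(abs(x - kx), abs(y - ky)) > radius for kx, ky, _ in kept
        )
        if far_from_all_kept:
            kept.append((x, y, score))
    return kept


# Made-up detections: three clustered points and one isolated point.
detections = [(10, 10, 0.90), (11, 10, 0.80), (30, 30, 0.70), (12, 12, 0.95)]
survivors = radius_nms(detections, radius=3)
```

Here the cluster around (10, 10) collapses to its strongest member, so extracting features with and without such a step produces different keypoint sets for the same image.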

Uchan1996 commented 1 year ago

I was able to reproduce similar results; see below. Thank you for fixing the code!

Evaluation Results (mean over 4000 pairs):	AUC@5	AUC@10	AUC@20	AUC@50	Prec	MScore	Mkpts	Ikpts
38.55	58.39	74.51	87.56	87.20	23.07

It 1 with 0.00 It 2 with 0.00 It 3 with 0.00 It 4 with 0.00 It 5 with 0.00 It 6 with 0.00 It 7 with 0.00 It 8 with 0.00 It 9 with 0.00 It 10 with 0.00 It 11 with 0.00 It 12 with 0.00 It 13 with 0.00 It 14 with 0.00 It 15 with 1.00
100%|██████████| 4000/4000 [12:23<00:00, 5.38it/s]
Results of model IMP on yfcc dataset (iterative: False, sinkhorn: True, uncertainty: False)

feixue94 / imp-release

Cannot reproduce the evaluation results on YFCC dataset #2