visualize_hpatches_adaptation questions

rpautrat / SuperPoint

Efficient neural feature detector and descriptor

MIT License


visualize_hpatches_adaptation questions #275

Open johnz334 opened 1 year ago

johnz334 commented 1 year ago

Hi, I want to thank you for your work; it is amazing. I was trying to run the visualize_hpatches_adaptation.ipynb notebook.

I noticed that there are three names in exp: exp = [ 'mp_synth-v6_photo-hom-aug_hp-v-repeat', 'mp_synth-v6_photo-hom-aug_ha2-100-0_hp-v-repeat', 'harris_hp-v-repeat', ]

What is the difference between 'mp_synth-v6_photo-hom-aug_hp-v-repeat' and 'mp_synth-v6_photo-hom-aug_ha2-100-0_hp-v-repeat' (MagicPoint and SuperPoint)? I think I already have the SuperPoint output, but how can I generate the MagicPoint experiment ('mp_synth-v6_photo-hom-aug_hp-v-repeat')?

rpautrat commented 1 year ago

Hi, this is an old development notebook and these are names of old experiments. You don't have to reproduce that to use SuperPoint.

But to answer your question, 'mp_synth-v6_photo-hom-aug_hp-v-repeat' is with a single round of steps 2-3 of the Readme, while 'mp_synth-v6_photo-hom-aug_ha2-100-0_hp-v-repeat' has two rounds of steps 2-3.

johnz334 commented 1 year ago

Got it! The difference is one round versus two rounds. To produce the second round, I need to export detections on MS-COCO again, but this time using the output from step 3 (for me, magic-point_coco) instead of magic-point_synth (the output from step 1). Is that right?

rpautrat commented 1 year ago

Yes, that's correct.
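
For readers following along, the round structure confirmed above can be sketched in a few lines of Python. This is only an illustration of the data flow: `export_pseudo_labels`, `train_on_labels`, and the `_roundN` naming are hypothetical stand-ins invented here, not functions or experiment names from the repository. The point is that each round exports detections with the model produced by the previous round.

```python
def export_pseudo_labels(model_name):
    """Hypothetical stand-in for step 2: export detections on MS-COCO
    using the given model. Returns a made-up label-set name."""
    return f"labels_from_{model_name}"


def train_on_labels(labels, round_index):
    """Hypothetical stand-in for step 3: train MagicPoint on the exported
    labels. Returns a made-up name for the resulting model."""
    return f"magic-point_coco_round{round_index}"


def run_rounds(initial_model, num_rounds):
    """Repeat steps 2-3, feeding each round's model into the next export."""
    model = initial_model
    history = []
    for round_index in range(1, num_rounds + 1):
        labels = export_pseudo_labels(model)          # step 2
        model = train_on_labels(labels, round_index)  # step 3
        history.append((labels, model))
    return history


# Start from the model trained on synthetic shapes (output of step 1).
history = run_rounds("magic-point_synth", num_rounds=2)
for labels, model in history:
    print(labels, "->", model)
```

In this sketch the first round labels MS-COCO with magic-point_synth, and the second round labels it again with the model trained in the first round, which matches the exchange above.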

johnz334 commented 1 year ago

Thank you. I have another question: when training the model, I get output like this: Iter 6000: loss 0.1089, precision 0.2415, recall 0.5023. What do those three numbers mean, and what equations are used to compute loss, precision, and recall?

rpautrat commented 1 year ago

Please refer to the original paper for a description of the loss (https://arxiv.org/pdf/1712.07629.pdf). Precision and recall are coarse metrics to monitor the training. Precision is the ratio of pixels predicted as keypoints that are actually ground truth keypoints, while recall is the ratio of ground truth keypoints that were retrieved by the network prediction.
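
As a rough illustration of the two metrics described above, here is a minimal NumPy sketch that assumes the prediction and the ground truth are both binary keypoint maps of the same shape. The function name `precision_recall` and the toy arrays are made up for this example, and the repository's own implementation may differ in details such as thresholding or handling of empty maps.

```python
import numpy as np


def precision_recall(pred, labels):
    """Pixel-wise precision and recall for binary keypoint maps.

    pred and labels are arrays of the same shape, where a nonzero entry
    marks a keypoint pixel. Returns (precision, recall) as floats.
    """
    pred = np.asarray(pred, dtype=bool)
    labels = np.asarray(labels, dtype=bool)

    true_pos = np.logical_and(pred, labels).sum()  # predicted AND ground truth
    n_pred = pred.sum()                            # all predicted keypoints
    n_gt = labels.sum()                            # all ground truth keypoints

    # Guard against division by zero when a map contains no keypoints.
    precision = true_pos / n_pred if n_pred > 0 else 0.0
    recall = true_pos / n_gt if n_gt > 0 else 0.0
    return float(precision), float(recall)


# Toy example: 4 predicted keypoints, 3 ground truth keypoints, 2 in common.
pred = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1],
                 [1, 0, 0]])
labels = np.array([[1, 0, 0],
                   [0, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

precision, recall = precision_recall(pred, labels)
print(f"precision {precision:.4f}, recall {recall:.4f}")
# prints "precision 0.5000, recall 0.6667"
```

Because the comparison is strictly per pixel, a prediction that is off by one pixel counts as both a false positive and a missed keypoint, which is one reason these numbers are only coarse training monitors.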