Open zhaotingy opened 2 years ago
Thanks for your interest again!
Since we only take the average of all non-zero triplet loss terms, the loss may seem to jitter around the margin value even though subsequent training still improves the results. Meanwhile, the hard example mining distance threshold needs to be large at the beginning and decrease slowly during training.
The code was written several years ago, and the training part still has a lot of room for improvement, for example: 1) training with a fixed detector instead of simply using a Harris corner detector; 2) training on image pairs with known poses and depths, such as MegaDepth or ScanNet, instead of using only homographies for training.
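To illustrate the point about the loss value, here is a minimal sketch in plain Python. It is not the code from the repository: the function names, the margin of 0.5, the toy distances, and the linear decay schedule are all assumptions made for illustration. It shows that when only the non-zero triplet terms are averaged, the reported loss can stay near the margin while the fraction of violating triplets drops.

```python
def mean_nonzero_triplet_loss(pos_dist, neg_dist, margin=0.5):
    """Average the hinge triplet terms, counting only the non-zero ones.

    Returns (loss, active_fraction). Hypothetical helper, not the repo's API.
    """
    terms = [max(0.0, margin + p - n) for p, n in zip(pos_dist, neg_dist)]
    active = [t for t in terms if t > 0.0]
    if not active:
        return 0.0, 0.0
    return sum(active) / len(active), len(active) / len(terms)


def mining_threshold(step, start=16.0, end=4.0, decay_steps=10000):
    """Linearly decay a hard-example-mining distance threshold.

    The linear form and the default values are assumptions; the only
    requirement stated above is "large at first, then decreasing slowly".
    """
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return start + (end - start) * frac


# Early in training: every triplet violates the margin.
loss_early, frac_early = mean_nonzero_triplet_loss(
    [1.0, 1.2, 0.9, 1.1], [1.0, 1.1, 1.0, 1.0])

# Later: only one hard triplet is still active, yet the mean of the
# non-zero terms is still close to the margin.
loss_late, frac_late = mean_nonzero_triplet_loss(
    [0.2, 0.3, 1.0, 0.1], [1.5, 1.4, 0.95, 1.6])

print(loss_early, frac_early)
print(loss_late, frac_late)
```

In this toy case the loss barely moves between the two calls, while the active fraction falls from all triplets to one in four. Logging the active fraction (or a matching accuracy) alongside the loss is therefore a more informative progress signal than the loss alone.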
Again, very nice paper. I am training GIFT from scratch (joint training of both the extractor and the embedder), but I noticed that it converges very fast, in fewer than 10k iterations, and the accuracy does not improve any further after that. Have you observed similar results, or do you have any idea what might cause this?