Hi, thanks for this nice work first! I'm just confused by one thing: why do you use L2Norm before GeM? I also studied the architecture proposed in the original GeM paper, in which the author normalizes the final vector rather than the one before the pooling layer. So have you ever benchmarked the performance of L2Norm before versus after the pooling layer? Looking forward to your reply!
Hi, thanks for the good question!
In our preliminary experiments, we saw that using L2Norm before GeM gives slightly better results.
Note that you can easily change the L2Norm position with the parameter --l2. We did this so you can also use trained models from other sources: for example, you can use the model from the original GeM paper's repository (passing --l2=after_pool) or the model from the repo that introduced the AP loss (passing --l2=none).
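In case it helps, here is a minimal sketch of what the three options correspond to; the function names and the exact placement of the normalization are an illustration, not the repo's actual code:

```python
import torch.nn.functional as F

def gem(x, p=3.0, eps=1e-6):
    # Generalized Mean (GeM) pooling: (B, C, H, W) -> (B, C)
    return F.avg_pool2d(x.clamp(min=eps).pow(p), x.shape[-2:]).pow(1.0 / p).flatten(1)

def extract_descriptor(feature_map, l2="before_pool"):
    # Sketch of the three --l2 options; names mirror the CLI values
    if l2 == "before_pool":
        # L2Norm across channels at each spatial location, before GeM
        feature_map = F.normalize(feature_map, p=2, dim=1)
    descriptor = gem(feature_map)
    if l2 == "after_pool":
        # L2-normalize the final vector, as in the original GeM paper
        descriptor = F.normalize(descriptor, p=2, dim=1)
    # With "before_pool" or "none", the output norm is not 1
    return descriptor
```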
Thanks for your reply. One more question: if L2Norm is used before GeM, the final feature descriptor is NOT normalized (its norm does NOT equal 1). When comparing the similarity between database and query features, should we use Euclidean distance or cosine similarity?
We always used the Euclidean distance in all our experiments.
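If you want a quick sanity check of how much the metric choice matters on unnormalized descriptors, you can compare the two rankings directly. Here is a toy sketch, with random tensors standing in for real descriptors:

```python
import torch
import torch.nn.functional as F

# Random stand-ins for query and database descriptors (unnormalized)
queries = torch.randn(4, 256)
database = torch.randn(100, 256)

euclidean = torch.cdist(queries, database)  # (4, 100) pairwise distances
cosine = 1.0 - F.cosine_similarity(queries.unsqueeze(1), database.unsqueeze(0), dim=2)

print(euclidean.argsort(dim=1)[:, :5])  # top-5 neighbors by Euclidean distance
print(cosine.argsort(dim=1)[:, :5])     # top-5 by cosine distance (may differ)
```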
Have you ever benchmarked Euclidean distance against cosine similarity? And does it make sense to use the Euclidean distance on a set of vectors that are NOT scaled to the same norm? What do you think about this point?
I haven't personally tried any distances other than Euclidean. The similarity measure should reflect the one used in the loss: the standard triplet loss works with the Euclidean distance, but you can try torch.nn.TripletMarginWithDistanceLoss if you want to experiment with other similarity measures (e.g. cosine). To check whether these assumptions are correct, you could test our pretrained models (trained with Euclidean, tested with cosine), which would take just a few minutes (for a quick test you could download the st_lucia dataset), or you could train your own model (with our code, training a ResNet-18 on pitts30k takes just a few hours).
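As a side note, for L2-normalized descriptors the two metrics give the same ranking, since ||x − y||² = 2 − 2 x·y for unit vectors; the choice only matters when the final descriptor is not normalized. If you want to try a cosine-based triplet loss, here is a minimal sketch (cosine_distance is a helper defined here, not a PyTorch builtin, and the margin is arbitrary):

```python
import torch
import torch.nn.functional as F

def cosine_distance(x, y):
    # Cosine distance in [0, 2]
    return 1.0 - F.cosine_similarity(x, y, dim=1)

criterion = torch.nn.TripletMarginWithDistanceLoss(
    distance_function=cosine_distance, margin=0.1  # margin chosen for illustration
)

# Dummy descriptors standing in for anchor/positive/negative features
anchor = torch.randn(8, 256, requires_grad=True)
positive = torch.randn(8, 256)
negative = torch.randn(8, 256)
loss = criterion(anchor, positive, negative)
loss.backward()
```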
OK, got it. I have no further questions. Thanks for your reply!