gmberton / deep-visual-geo-localization-benchmark

Official code for CVPR 2022 (Oral) paper "Deep Visual Geo-localization Benchmark"
MIT License

Network structure of CCT-NetVLAD #22

Closed (Steven-jiaqi closed 10 months ago)

Steven-jiaqi commented 10 months ago

Thanks for your great work! I have some questions about the network structure of CCT-NetVLAD. I'm not sure whether there is a SeqPool layer in its structure. Which layer of the CCT is connected to the NetVLAD layer? I'm sorry to bother you. Looking forward to your reply!

ga1i13o commented 10 months ago

Hi, you can check in the code how the CCT output is taken. For the experiments with NetVLAD we do not apply any SeqPool; we simply feed the tokens to the NetVLAD layer. As for which layer of the CCT encoder is used, we experimented with it and found that the best-performing option was to truncate the encoder at the 8th layer (CLI arg --trunc_te 8).
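
For readers who want a picture of the data flow described above, here is a minimal NumPy sketch. It is not the repository's code. The encoder blocks are random stand-ins, the 14-block depth and all sizes are assumptions, and `netvlad_aggregate` is a hypothetical, simplified VLAD-style aggregation (a real NetVLAD layer learns its soft-assignment weights and centroids). It also assumes that truncating "at the 8th layer" means keeping the first 8 encoder blocks. The point is only that every output token is passed to the aggregation, with no sequence pooling in between.

```python
import numpy as np


def netvlad_aggregate(tokens, centroids, alpha=1.0):
    """Simplified VLAD-style aggregation (hypothetical helper, not the repo's layer).

    tokens:    (N, D) array, one D-dimensional descriptor per token
    centroids: (K, D) array of cluster centres
    returns:   (K * D,) L2-normalised global descriptor
    """
    # L2-normalise each token descriptor
    tokens = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + 1e-12)
    # Residual of every token to every centroid: shape (N, K, D)
    diff = tokens[:, None, :] - centroids[None, :, :]
    # Soft assignment: softmax over clusters of -alpha * squared distance
    logits = -alpha * (diff ** 2).sum(axis=-1)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)
    # Assignment-weighted residuals, summed over all tokens: shape (K, D)
    vlad = (assign[:, :, None] * diff).sum(axis=0)
    # Intra-normalisation per cluster, then global L2 normalisation
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    vlad = vlad.reshape(-1)
    return vlad / (np.linalg.norm(vlad) + 1e-12)


rng = np.random.default_rng(0)
N, D, K = 16, 8, 4             # tokens, token dim, clusters (toy sizes, assumed)
num_blocks, trunc_te = 14, 8   # assumed encoder depth; trunc_te mirrors --trunc_te 8

# Stand-ins for transformer encoder blocks: a residual non-linear map each
weights = [rng.normal(scale=0.1, size=(D, D)) for _ in range(num_blocks)]
blocks = [lambda x, W=W: x + np.tanh(x @ W) for W in weights]

# Truncate the encoder: keep only the first trunc_te blocks
blocks = blocks[:trunc_te]

tokens = rng.normal(size=(N, D))
for block in blocks:
    tokens = block(tokens)

# No sequence pooling: all N tokens go straight into the aggregation
centroids = rng.normal(size=(K, D))
descriptor = netvlad_aggregate(tokens, centroids)
print(descriptor.shape)  # (32,)
```

With SeqPool, the N tokens would first be collapsed into a single D-dimensional vector. Here the aggregation sees the whole token set and returns a K * D descriptor.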