bytedance / R2Former

Official repository for R2Former: Unified Retrieval and Reranking Transformer for Place Recognition
Apache License 2.0

Question about memory usage for local features in Fig.2 #3

Closed xjh19971 closed 1 year ago

xjh19971 commented 1 year ago

Thank you for your excellent research! I have a question about Fig. 2, specifically the memory usage for local descriptors. During Global Retrieval, the model outputs 1200 local feature tokens but selects 500 tokens with the highest attention values. However, in Table 3, the memory usage calculation seems to consider only 500 tokens, not the peak of 1200. Could you clarify if I'm misunderstanding something? Thank you!

Jeff-Zilence commented 1 year ago

During global retrieval, only the class token is saved and used. During reranking, only the 500 selected tokens are kept in memory; the remaining tokens are neither saved nor used.
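To make the difference between the transient 1200 tokens and the 500 stored ones concrete, here is a minimal sketch in plain Python. It is not taken from the R2Former code: `select_top_tokens`, `stored_bytes`, the descriptor dimension of 128, and float32 storage are all assumptions chosen for illustration.

```python
import random


def select_top_tokens(tokens, attention, k):
    """Return the k tokens with the highest attention, highest first."""
    order = sorted(range(len(tokens)), key=lambda i: attention[i], reverse=True)
    return [tokens[i] for i in order[:k]]


def stored_bytes(num_tokens, dim, bytes_per_value=4):
    """Bytes needed to store num_tokens descriptors of size dim (float32 assumed)."""
    return num_tokens * dim * bytes_per_value


random.seed(0)

# 1200 and 500 come from the question above; DIM is a hypothetical value.
NUM_TOKENS, KEEP, DIM = 1200, 500, 128

# Stand-ins for the local feature tokens and their attention values.
tokens = [[random.random() for _ in range(DIM)] for _ in range(NUM_TOKENS)]
attention = [random.random() for _ in range(NUM_TOKENS)]

# Only the selected tokens are written to the database.
kept = select_top_tokens(tokens, attention, KEEP)

print(len(kept))                      # 500
print(stored_bytes(KEEP, DIM))        # 256000 bytes per image under these assumptions
print(stored_bytes(NUM_TOKENS, DIM))  # 614400 bytes if all tokens were stored
```

Under these assumptions, the per-image storage cost depends only on the 500 kept tokens. The full set of 1200 exists only briefly during the forward pass and is discarded after selection.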

Jeff-Zilence commented 1 year ago

You can check the code for details.

xjh19971 commented 1 year ago

Thank you!