ubc-vision / COTR

Code release for "COTR: Correspondence Transformer for Matching Across Images" (ICCV 2021)
Apache License 2.0

question about positional encoding #8

Closed · XiaoyuShi97 closed 3 years ago

XiaoyuShi97 commented 3 years ago

Hi. According to formula (4) in your paper, you add the positional encoding P to get a context feature map c. But in your code, you just follow the standard transformer practice of adding the positional encoding to the key and query while keeping the value clean. Did I miss anything?

jiangwei221 commented 3 years ago

Hi. Yes, the positionally encoded feature map is fed to the transformer encoder as the query and key, while the original feature map serves as the values. We mainly followed the design of DETR for the backbone and the transformer.
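
For reference, a minimal sketch of that DETR-style attention pattern, where the positional encoding goes into the query and key only and the value stays "clean". Names like `feat` and `pos_embed` are illustrative here, not taken from the COTR codebase:

```python
import torch.nn as nn

class EncoderSelfAttention(nn.Module):
    """Sketch of a DETR-style encoder self-attention layer."""

    def __init__(self, d_model=256, nhead=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead)

    def forward(self, feat, pos_embed):
        # feat, pos_embed: (sequence_length, batch, d_model)
        q = k = feat + pos_embed                    # positional encoding added to query and key
        out, _ = self.self_attn(q, k, value=feat)   # value is the raw feature map, no encoding
        return out
```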