Plotting related - Githubissues

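Take a template of 64 tokens and a search region of 256 tokens as an example. The attention map has shape [320, 320, head], where head is the number of attention heads (12 in ViT-base). Take the part of the attention map that belongs to template-to-search attention, i.e. map[:64, -256:]. This gives the cross-attention from each template token to each search-region token (it might also be the other way around). The sliced map has shape [64, 256, head]. If you average over the head dimension and then over the first dimension, you get "the attention to each search-region token, averaged over all template tokens and all heads", which is a 1-D array of 256 elements. Reshape this array to 16x16 and it lines up with the 2-D feature map of the search region, so at this point it can already be visualized directly. To make it look nicer, interpolate the 16x16 heatmap up to 256x256 so it can be overlaid on the input image.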
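The steps above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the repository's actual plotting code: the function name `template_to_search_heatmap`, the random stand-in attention tensor, the assumption that template tokens come first in the sequence, the min-max normalization, and the nearest-neighbor upsampling are all choices made here for the example. If the rows and columns of your attention map are laid out the other way around, transpose the slice accordingly.

```python
import numpy as np

def template_to_search_heatmap(attn, n_template=64, n_search=256, out_size=256):
    """attn: array of shape [N, N, heads] with N = n_template + n_search.

    Assumes (hypothetically) that template tokens come first and that
    rows are queries and columns are keys.
    """
    # Template rows, search-region columns -> [64, 256, heads]
    cross = attn[:n_template, -n_search:]
    # Average over heads, then over template tokens -> [256]
    per_search = cross.mean(axis=2).mean(axis=0)
    # Reshape the flat token axis back to the 2-D search feature grid -> [16, 16]
    side = int(round(np.sqrt(n_search)))
    heat = per_search.reshape(side, side)
    # Optional min-max normalization for display (an addition, not from the thread)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-12)
    # Nearest-neighbor upsampling to the input resolution; a bilinear
    # resize would give a smoother map
    scale = out_size // side
    heat_up = np.repeat(np.repeat(heat, scale, axis=0), scale, axis=1)
    return heat, heat_up

# Random stand-in for one layer's attention: softmax over the key axis
rng = np.random.default_rng(0)
logits = rng.standard_normal((320, 320, 12))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

heat, heat_up = template_to_search_heatmap(attn)
print(heat.shape, heat_up.shape)  # (16, 16) (256, 256)
```

To overlay the result, one option is to draw the search image with matplotlib's `imshow` and then draw `heat_up` on top with a colormap and an `alpha` below 1.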

--- Original message --- From: @.> Sent: Sunday, May 19, 2024, 6:30 PM To: @.>; Cc: @.***>; Subject: [TsingWei/LiteTrack] Plotting related (Issue #8)

How can I draw the heatmap shown in Figure 3 of the paper? Thanks again for your open-source work.


TsingWei / LiteTrack

Plotting related #8