Hi, @MenghaoGuo,
For Figure 1 in your paper, which self-attention layer is used to visualize the attention map? From your implementation, there are 4 self-attention layers (SA1, SA2, SA3, SA4) in the model.
Thanks~
Hi, @amiltonwong. Like Vision Transformer, we visualize the attention map by using the mean value of all the attention layers.
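A minimal sketch of that averaging step (illustrative only, not the repo's actual code; it assumes each of the four SA layers exposes an attention tensor of shape `(batch, num_points, num_points)`, e.g. collected via forward hooks):

```python
import torch

def mean_attention_map(attn_maps):
    """Average per-layer attention maps (SA1-SA4) into a single map,
    in the spirit of Vision Transformer attention visualizations.

    attn_maps: list of tensors, each (batch, num_points, num_points).
    Returns a tensor of shape (batch, num_points, num_points).
    """
    # Stack into (num_layers, batch, num_points, num_points),
    # then reduce over the layer dimension.
    return torch.stack(attn_maps, dim=0).mean(dim=0)

# Hypothetical usage, with attention tensors captured during a forward pass:
#   avg = mean_attention_map([attn_sa1, attn_sa2, attn_sa3, attn_sa4])
#   row = avg[0, query_idx]  # attention from one query point to all points,
#                            # usable to color the point cloud for Figure 1-style plots
```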
@MenghaoGuo, OK, thanks~