Closed by cth127 2 years ago
Hi @cth127 you need to do some averaging and processing to get the attention map. Check our Attention Rollout technique.
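The Attention Rollout technique mentioned above (Abnar & Zuidema, 2020) propagates attention through a stack of self-attention layers by recursively multiplying the per-layer attention matrices, optionally mixing in the identity to account for residual connections. A minimal sketch, assuming head-averaged `[seq_len, seq_len]` attention matrices have already been extracted from the model:

```python
import numpy as np

def attention_rollout(attentions, residual=True):
    """Attention Rollout: propagate attention through the layer stack
    by multiplying per-layer attention matrices.

    attentions: list of [seq_len, seq_len] arrays (head-averaged),
    one per self-attention layer, ordered input -> output.
    """
    rollout = np.eye(attentions[0].shape[0])
    for attn in attentions:
        if residual:
            # Mix in the identity to model the residual connection,
            # then re-normalize rows so they stay a distribution.
            attn = 0.5 * attn + 0.5 * np.eye(attn.shape[0])
            attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    return rollout
```

Because each row of every factor is a probability distribution, each row of the rolled-out matrix also sums to 1, so it can be read as "how much each input position contributes to each output position."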
@smousavi05 Thanks for your reply.
I tried to understand the attention rollout technique, but found that it applies only across self-attention layers, not the CNN or LSTM layers that EQT also contains.
If you don't mind, could you share the code you used to reproduce Figure 3? If not, could you give a more detailed explanation of the "averaging and processing"?
@cth127 That is correct. We use only the attention layers for interpretation, since attention weights are understandable to humans, unlike the learned kernels in the CNN or LSTM layers. The figures in the EqT paper show only the attention weights at 3 particular attention layers. I tried to find my code but surprisingly couldn't (that was a long time ago, back in 2018). However, as I recall it was not very complicated, just some averaging. You will need to experiment with it a bit.
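The "averaging" described above is not spelled out in the thread, but one common way to collapse a self-attention tensor into a plottable 1-D signal is to average over the query axis, so each time step's value reflects how much attention it receives on average. A sketch of that idea (not the authors' exact processing):

```python
import numpy as np

def attention_profile(attn):
    """Collapse a [batch, T, T] self-attention tensor into one 1-D
    profile per example by averaging over the query axis (axis 1),
    then min-max normalizing each profile to [0, 1] for plotting.
    """
    prof = attn.mean(axis=1)  # [batch, T]: mean attention received
    prof = prof - prof.min(axis=-1, keepdims=True)
    return prof / (prof.max(axis=-1, keepdims=True) + 1e-12)
```

Averaging over the key axis instead would give "attention paid" rather than "attention received"; which reads better depends on what the figure is meant to show.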
Thanks for your reply. I'll try so!
Hi! I'm trying to apply EQTransformer to the South Korea dataset.
I succeeded in training and evaluating the F1 score on our dataset, but I cannot find code or a method to visualize the attention weights as you did in Figure 3.
For the transformer layer, I verified that the attention tensor shape is [BATCH_SIZE, 47, 47], but I'm still not sure how it can be mapped to a [BATCH_SIZE, 1, 6000] image.
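One plausible way to bridge the shape gap, assuming the CNN encoder downsamples the 6000-sample waveform to 47 positions: average the [BATCH_SIZE, 47, 47] tensor over the query axis to get a [BATCH_SIZE, 47] profile, then linearly interpolate those 47 positions back to 6000 samples. This is an assumed reconstruction, not the authors' published code:

```python
import numpy as np

def upsample_attention(attn, target_len=6000):
    """Map a [batch, 47, 47] attention tensor onto the input waveform.

    1. Average over the query axis -> [batch, 47] attention profile.
    2. Linearly interpolate the 47 downsampled positions back to
       target_len samples, matching the original input length.
    """
    prof = attn.mean(axis=1)                      # [batch, 47]
    x_old = np.linspace(0.0, 1.0, prof.shape[-1])
    x_new = np.linspace(0.0, 1.0, target_len)
    return np.stack([np.interp(x_new, x_old, p) for p in prof])
```

The resulting [batch, 6000] array can be reshaped to [batch, 1, 6000] and overlaid on the waveform as a heat strip.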