Closed by cth127 2 years ago
Hi @cth127 you need to do some averaging and processing to get the attention map. Check our Attention Rollout technique.
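The Attention Rollout technique mentioned above (Abnar & Zuidema, 2020) propagates attention through a stack of self-attention layers by recursively multiplying the per-layer attention matrices, optionally mixing in the identity to account for residual connections. A minimal sketch, assuming head-averaged `[seq_len, seq_len]` attention matrices have already been extracted from the model:

```python
import numpy as np

def attention_rollout(attentions, residual=True):
    """Attention Rollout: propagate attention through the layer stack
    by multiplying per-layer attention matrices.

    attentions: list of [seq_len, seq_len] arrays (head-averaged),
    one per self-attention layer, ordered input -> output.
    """
    rollout = np.eye(attentions[0].shape[0])
    for attn in attentions:
        if residual:
            # Mix in the identity to model the residual connection,
            # then re-normalize rows so they stay a distribution.
            attn = 0.5 * attn + 0.5 * np.eye(attn.shape[0])
            attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    return rollout
```

Because each row of every factor is a probability distribution, each row of the rolled-out matrix also sums to 1, so it can be read as "how much each input position contributes to each output position."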
@smousavi05 Thanks for your reply.
I tried to understand the attention rollout technique, but found that it applies only across self-attention layers, not the CNN or LSTM layers that EQT also contains.
If you don't mind, could you share the code you used to reproduce Figure 3? If not, could you give a more detailed explanation of the "averaging and processing"?
@cth127 That is correct. We use only the attention layers for interpretation, since attention weights are understandable to humans, unlike the learned kernels in the CNN or LSTM layers. The figures in the EqT paper show only the attention weights at 3 particular attention layers. I tried to find my code but surprisingly couldn't (that was a long time ago, back in 2018). However, as I recall it was not very complicated, just some averaging. You will need to experiment with it a bit.
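The "averaging" described above is not spelled out in the thread, but one common way to collapse a self-attention tensor into a plottable 1-D signal is to average over the query axis, so each time step's value reflects how much attention it receives on average. A sketch of that idea (not the authors' exact processing):

```python
import numpy as np

def attention_profile(attn):
    """Collapse a [batch, T, T] self-attention tensor into one 1-D
    profile per example by averaging over the query axis (axis 1),
    then min-max normalizing each profile to [0, 1] for plotting.
    """
    prof = attn.mean(axis=1)  # [batch, T]: mean attention received
    prof = prof - prof.min(axis=-1, keepdims=True)
    return prof / (prof.max(axis=-1, keepdims=True) + 1e-12)
```

Averaging over the key axis instead would give "attention paid" rather than "attention received"; which reads better depends on what the figure is meant to show.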
Thanks for your reply. I'll try so!
Hi! I'm trying to apply EQTransformer to the South Korea dataset.
I succeeded in training and evaluating the F1 score on our dataset, but I cannot find code or a method to visualize the attention weights as you did in Figure 3.
For the transformer layer, I verified that the attention tensor shape is [BATCH_SIZE, 47, 47], but I'm still not sure how it can be mapped to a [BATCH_SIZE, 1, 6000] image.
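One plausible way to bridge the shape gap, assuming the CNN encoder downsamples the 6000-sample waveform to 47 positions: average the [BATCH_SIZE, 47, 47] tensor over the query axis to get a [BATCH_SIZE, 47] profile, then linearly interpolate those 47 positions back to 6000 samples. This is an assumed reconstruction, not the authors' published code:

```python
import numpy as np

def upsample_attention(attn, target_len=6000):
    """Map a [batch, 47, 47] attention tensor onto the input waveform.

    1. Average over the query axis -> [batch, 47] attention profile.
    2. Linearly interpolate the 47 downsampled positions back to
       target_len samples, matching the original input length.
    """
    prof = attn.mean(axis=1)                      # [batch, 47]
    x_old = np.linspace(0.0, 1.0, prof.shape[-1])
    x_new = np.linspace(0.0, 1.0, target_len)
    return np.stack([np.interp(x_new, x_old, p) for p in prof])
```

The resulting [batch, 6000] array can be reshaped to [batch, 1, 6000] and overlaid on the waveform as a heat strip.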