An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0
How does the visualization of Attention Weights organize the code? #97
Thanks to the authors for their excellent time series forecasting work.
I am also wondering how to visualize the attention weights shown in Figure 6 of the paper.
The `_MultiheadAttention` class in `PatchTST_backbone.py` returns the attention weights, but I'm not sure how to use them to produce such a visualization.
How should I organize the code to reproduce a plot like Figure 6?
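A minimal sketch of one way to do this, once you have extracted the attention weights (e.g. by returning `attn_weights` up through the encoder layers, or by registering a forward hook on the attention module). The assumed shape `(n_heads, num_patches, num_patches)` for a single sample/layer, and the function and file names below, are my assumptions, not part of the repo:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")          # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

def plot_attention_map(attn, head=None, title="Attention map"):
    """Plot one attention map as a heatmap.

    attn : ndarray of shape (n_heads, num_patches, num_patches),
           assumed per-head attention weights for one sample and one layer.
    head : int to show a single head, or None to average over heads.
    Returns the 2-D map that was plotted.
    """
    amap = attn[head] if head is not None else attn.mean(axis=0)
    fig, ax = plt.subplots(figsize=(4, 4))
    im = ax.imshow(amap, cmap="viridis")   # patches x patches heatmap
    ax.set_xlabel("key patch index")
    ax.set_ylabel("query patch index")
    ax.set_title(title)
    fig.colorbar(im, ax=ax)
    fig.savefig("attn_map.png")            # hypothetical output path
    plt.close(fig)
    return amap

# Demo with synthetic weights: 16 heads, 42 patches,
# rows softmax-normalized as real attention weights would be.
rng = np.random.default_rng(0)
scores = rng.normal(size=(16, 42, 42))
attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
avg = plot_attention_map(attn, head=None, title="head-averaged attention")
```

With real weights from the model, you would detach the tensor first (`attn_weights[0].detach().cpu().numpy()`) and call the function per layer or per head to compare maps as in Figure 6.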