An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0
How does the visualization of Attention Weights organize the code? #97
Thanks to the authors for their excellent time series forecasting work.
I am also wondering how to visualize the attention weights shown in Figure 6 of the paper.
The `_MultiheadAttention` class in `PatchTST_backbone.py` returns the attention weights, but I'm not sure how to use them to produce such a visualization.
How should I organize the code to reproduce a plot like Figure 6?
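A minimal sketch of one way to do this, once you have extracted the attention weights (e.g. by returning `attn_weights` up through the encoder layers, or by registering a forward hook on the attention module). The assumed shape `(n_heads, num_patches, num_patches)` for a single sample/layer, and the function and file names below, are my assumptions, not part of the repo:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")          # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

def plot_attention_map(attn, head=None, title="Attention map"):
    """Plot one attention map as a heatmap.

    attn : ndarray of shape (n_heads, num_patches, num_patches),
           assumed per-head attention weights for one sample and one layer.
    head : int to show a single head, or None to average over heads.
    Returns the 2-D map that was plotted.
    """
    amap = attn[head] if head is not None else attn.mean(axis=0)
    fig, ax = plt.subplots(figsize=(4, 4))
    im = ax.imshow(amap, cmap="viridis")   # patches x patches heatmap
    ax.set_xlabel("key patch index")
    ax.set_ylabel("query patch index")
    ax.set_title(title)
    fig.colorbar(im, ax=ax)
    fig.savefig("attn_map.png")            # hypothetical output path
    plt.close(fig)
    return amap

# Demo with synthetic weights: 16 heads, 42 patches,
# rows softmax-normalized as real attention weights would be.
rng = np.random.default_rng(0)
scores = rng.normal(size=(16, 42, 42))
attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
avg = plot_attention_map(attn, head=None, title="head-averaged attention")
```

With real weights from the model, you would detach the tensor first (`attn_weights[0].detach().cpu().numpy()`) and call the function per layer or per head to compare maps as in Figure 6.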