Thinklab-SJTU / Crossformer

Official implementation of our ICLR 2023 paper "Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting"

Changing routers #26

Closed · abase123 closed this issue 2 months ago

abase123 commented 7 months ago

Hi, sorry for a bit of a beginner question. Would it be possible to modify the TSA so that it captures segment-to-segment attention across different time series?

YunhaoZhang-Mars commented 7 months ago

TSA was designed precisely to capture the segment-to-segment attention across different time series that you mention, but it does so in two stages: 1) capturing dependencies among different time steps within the same dimension; 2) capturing dependencies among different dimensions at the same time step. Experimental results show that this two-stage method outperforms a single attention layer applied jointly over time and dimension.
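For concreteness, here is a minimal PyTorch sketch of the two-stage idea. The class name, shapes, and the use of `nn.MultiheadAttention` are illustrative only; the actual TSA layer in this repo also includes residual connections, layer norms, and MLP sublayers that are omitted here:

```python
import torch
import torch.nn as nn

class TwoStageAttentionSketch(nn.Module):
    """Illustrative sketch: stage 1 attends across time segments within
    each dimension; stage 2 attends across dimensions at each time step."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.dim_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, D dimensions, L segments, d_model)
        b, d, l, e = x.shape
        # Stage 1: cross-time attention, one sequence per dimension
        t = x.reshape(b * d, l, e)
        t, _ = self.time_attn(t, t, t)
        # Stage 2: cross-dimension attention, one sequence per time step
        s = t.reshape(b, d, l, e).permute(0, 2, 1, 3).reshape(b * l, d, e)
        s, _ = self.dim_attn(s, s, s)
        return s.reshape(b, l, d, e).permute(0, 2, 1, 3)
```

So both time and dimension dependencies are captured, just factored into two cheaper attention passes rather than one joint pass over all D * L segments.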

As for the router mechanism, it is designed to reduce the complexity of capturing cross-dimension dependency, i.e. stage 2 above. You can of course replace it with a single full Transformer attention layer, but the computational overhead becomes too large for high-dimensional datasets.
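A rough sketch of the router idea, assuming a small learnable set of `n_routers` vectors per time step (the names and the default value are illustrative, not the repo's exact code): the routers first aggregate information from all D dimensions, then each dimension reads it back from the routers, reducing the cost of stage 2 from O(D^2) to O(c * D) for c routers:

```python
import torch
import torch.nn as nn

class RouterAttentionSketch(nn.Module):
    """Illustrative sketch: a fixed small set of learnable routers mediates
    cross-dimension attention instead of full all-pairs attention."""
    def __init__(self, d_model, n_heads, n_routers=10):
        super().__init__()
        self.routers = nn.Parameter(torch.randn(n_routers, d_model))
        self.gather = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.scatter = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch * L segments, D dimensions, d_model)
        r = self.routers.unsqueeze(0).expand(x.size(0), -1, -1)
        # routers query all D dimensions to aggregate global information
        r, _ = self.gather(r, x, x)
        # each dimension queries the routers to receive that information
        out, _ = self.scatter(x, r, r)
        return out
```

Since c is a small constant independent of D, the overhead stays manageable even on datasets with hundreds of dimensions, which is exactly where replacing the router with full cross-dimension attention becomes expensive.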