yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0

RevInCB and PatchMaskCB #35

Closed by ikvision 1 year ago

ikvision commented 1 year ago

In the current implementation, the forward pass applies normalization first and masking second. https://github.com/yuqinie98/PatchTST/blob/de8d7f0da12f4af1bfe13de3d7fe0b888bd84ea9/PatchTST_self_supervised/patchtst_supervised.py#L96-L97 The RevInCB mean and std are therefore computed on the full input, including the patches that are masked afterwards. I think this normalization can leak information about the masked patches and help the model recover hidden patterns when they differ significantly from the unmasked regions. Is this the intended behavior?
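
The concern can be illustrated with a small NumPy sketch. It is not the repository's code: `instance_stats`, the shapes, and the shift of 10 are assumptions chosen only for illustration. Two series share identical visible values and differ only inside the patch that will be masked, yet their normalized visible values differ because the statistics were computed before masking.

```python
import numpy as np

def instance_stats(x, eps=1e-5):
    """Per-variable mean and std over time; x has shape [seq_len, n_vars].
    Hypothetical helper mimicking RevIN-style statistics."""
    mean = x.mean(axis=0, keepdims=True)
    std = np.sqrt(x.var(axis=0, keepdims=True) + eps)
    return mean, std

rng = np.random.default_rng(0)
seq_len, n_vars, patch_len = 64, 1, 16  # illustrative sizes

x = rng.normal(size=(seq_len, n_vars))
x_shifted = x.copy()
x_shifted[-patch_len:] += 10.0  # change only the patch that will be hidden

# Time-step mask: True marks values hidden from the model.
hidden = np.zeros(seq_len, dtype=bool)
hidden[-patch_len:] = True

# Before normalization, the visible parts of both series are identical.
assert np.array_equal(x[~hidden], x_shifted[~hidden])

# Normalize first (statistics see the hidden patch), then mask.
mean_a, std_a = instance_stats(x)
mean_b, std_b = instance_stats(x_shifted)
visible_a = ((x - mean_a) / std_a)[~hidden]
visible_b = ((x_shifted - mean_b) / std_b)[~hidden]

# A nonzero gap means the visible inputs carry information about the hidden patch.
leak = np.abs(visible_a - visible_b).max()
print(f"max difference in visible normalized values: {leak:.3f}")
```

If the statistics were computed on the visible values only, `visible_a` and `visible_b` would be identical and the gap would be zero.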

yuqinie98 commented 1 year ago

Hi @ikvision, thanks very much for raising this question! This is a very reasonable point, and we haven't done a serious ablation on it. Applying RevIN after patching would be a little difficult, since patching introduces an extra dimension.
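
One possible way to handle the extra dimension is to compute the statistics over visible patches only. The sketch below is an assumption-laden illustration, not the repository's method: `masked_patch_stats` is a hypothetical helper, the layout `[batch, num_patch, n_vars, patch_len]` with a boolean mask of shape `[batch, num_patch, n_vars]` is assumed, and patches are assumed not to overlap (stride equal to patch length), since overlapping patches would share values between masked and visible patches.

```python
import numpy as np

def masked_patch_stats(patches, mask, eps=1e-5):
    """Mean and std per (batch, variable) over visible patches only.

    patches: [batch, num_patch, n_vars, patch_len] (assumed layout)
    mask:    [batch, num_patch, n_vars], True = masked (hidden)
    Returns mean and std, each of shape [batch, 1, n_vars, 1].
    """
    visible = (~mask)[..., None].astype(patches.dtype)  # [b, p, v, 1]
    # Number of visible values per (batch, variable).
    count = visible.sum(axis=(1, 3), keepdims=True) * patches.shape[-1]
    count = np.maximum(count, 1.0)  # guard against fully masked series
    mean = (patches * visible).sum(axis=(1, 3), keepdims=True) / count
    var = (((patches - mean) ** 2) * visible).sum(axis=(1, 3), keepdims=True) / count
    return mean, np.sqrt(var + eps)

rng = np.random.default_rng(0)
patches = rng.normal(size=(2, 4, 3, 16))
mask = np.zeros((2, 4, 3), dtype=bool)
mask[:, -1, :] = True  # hide the last patch of every variable

# Changing only the hidden patches must not change the statistics.
altered = patches.copy()
altered[:, -1] += 10.0

mean_1, std_1 = masked_patch_stats(patches, mask)
mean_2, std_2 = masked_patch_stats(altered, mask)
stats_match = np.allclose(mean_1, mean_2) and np.allclose(std_1, std_2)
print("statistics independent of masked patches:", stats_match)
```

The returned statistics broadcast against the patch tensor, so `(patches - mean_1) / std_1` normalizes every patch without reading the hidden values. Whether this helps or hurts pretraining is exactly the ablation the reply says has not been run.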