An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Hello,

I noticed that the default value in

parser.add_argument('--d_ff', type=int, default=512, help='Transformer MLP dimension')

is different between patchtst_finetune.py and patchtst_pretrain.py. This leads to unmatched_layers when transferring the pretrained weights. What is the reason for this setting? Thanks.
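For anyone hitting this, the mismatch is easy to reproduce outside the repo. Below is a minimal standalone sketch in plain PyTorch, not the repo's actual transfer code; the layer structure, d_model=128, and the specific d_ff values are illustrative assumptions. It shows why two different --d_ff defaults cause the feed-forward weights to be reported as unmatched when transferring a pretrained state dict:

```python
import torch.nn as nn

def make_ffn(d_model: int, d_ff: int) -> nn.Module:
    # Stand-in for one encoder block's MLP; --d_ff sets the hidden width.
    return nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

# Hypothetical values: suppose pretraining used d_ff=512 and fine-tuning d_ff=256.
pretrained = make_ffn(d_model=128, d_ff=512)
finetune = make_ffn(d_model=128, d_ff=256)

src = pretrained.state_dict()
dst = finetune.state_dict()

# Copy only tensors whose name and shape both match; everything else is
# "unmatched" -- exactly what happens when the two scripts disagree on d_ff.
matched = {k: v for k, v in src.items() if k in dst and dst[k].shape == v.shape}
unmatched = [k for k in src if k not in matched]

finetune.load_state_dict(matched, strict=False)
print("unmatched_layers:", unmatched)
# -> ['0.weight', '0.bias', '2.weight']  (the d_ff-dependent tensors)
```

Assuming you want to reuse the full pretrained encoder, passing the same --d_ff to both scripts makes the shapes line up and avoids the unmatched layers.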