yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0

value of 'd_ff' #67

Closed dawn0713 closed 10 months ago

dawn0713 commented 11 months ago

Hello, I noticed that the default in `parser.add_argument('--d_ff', type=int, default=512, help='Transformer MLP dimension')` differs between patchtst_finetune.py and patchtst_pretrain.py. This leads to unmatched_layers when transferring weights. What is the reason for this setting? Thanks.
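The mismatch can be sketched as follows. This is not the repository's actual transfer code; `transfer_weights` and the parameter names are hypothetical stand-ins that illustrate why a checkpoint pretrained with one `d_ff` cannot be fully loaded into a model built with another: the feed-forward weight tensors have different shapes, so those layers end up in an unmatched list.

```python
# Hypothetical sketch (not the repo's code): parameters are transferred only
# when both the name and the tensor shape match; everything else is reported.

def transfer_weights(pretrained_state, model_state):
    """Return the names of parameters that cannot be copied.

    Both arguments map parameter names to shapes (stand-ins for real
    state dicts, used here only for illustration).
    """
    unmatched_layers = []
    for name, shape in model_state.items():
        if name in pretrained_state and pretrained_state[name] == shape:
            continue  # name and shape agree -> weight would be copied
        unmatched_layers.append(name)  # shape mismatch or missing key
    return unmatched_layers

# Pretrained with d_ff=512 but fine-tune model built with d_ff=256
# (d_model=128 assumed): the two MLP projections no longer line up.
pretrain = {"encoder.ff.0.weight": (512, 128), "encoder.ff.2.weight": (128, 512)}
finetune = {"encoder.ff.0.weight": (256, 128), "encoder.ff.2.weight": (128, 256)}
print(transfer_weights(pretrain, finetune))
```

Keeping `--d_ff` identical in both scripts makes the shapes agree, so the list comes back empty and all weights transfer.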

yuqinie98 commented 11 months ago

Hi! Sorry for the confusion. Is the problem solved if you modify it?

dawn0713 commented 10 months ago

Hi, I modified the setting and successfully transferred the model. Thank you.