yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0
1.37k stars, 248 forks

Question about batch_size, patch_len and stride #90

Open 1348598339 opened 7 months ago

1348598339 commented 7 months ago

In the PatchTST source code, I found batch_size = 128, which is exactly stride (8) * patch_len (16). Should I set batch_size according to this pattern when I choose the hyperparameters myself? And will the model's forecasting ability decrease if my batch_size is much smaller than stride * patch_len?

ramonasl commented 6 months ago

I tested different seq_len, patch_len, stride and pred_len values on a different dataset and the code returned an error. It only worked when the pattern you mention was followed; I don't know why yet.
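
For anyone checking the arithmetic in this thread, here is a minimal sketch of how the patch count could be computed from seq_len, patch_len and stride. It is not taken from the repository: the helper names `num_patches` and `make_patches` are hypothetical, and the end padding (repeating the last value `stride` times before cutting patches) is an assumption based on the patching step described in the paper. Note that batch_size does not appear anywhere in this calculation, so the sketch by itself does not explain the error reported above.

```python
def num_patches(seq_len, patch_len, stride, pad_end=True):
    """Count the patches cut from one series of length seq_len.

    Hypothetical helper, not the repository's code. pad_end=True assumes
    the series is extended by `stride` values at the end before patching.
    """
    if patch_len > seq_len:
        raise ValueError("patch_len must not exceed seq_len")
    n = (seq_len - patch_len) // stride + 1
    if pad_end:
        n += 1  # the padded tail contributes one extra patch
    return n


def make_patches(series, patch_len, stride, pad_end=True):
    """Cut a 1-D sequence into overlapping patches (lists of patch_len values)."""
    values = list(series)
    if pad_end:
        # Assumption: pad by repeating the last observed value `stride` times.
        values = values + [values[-1]] * stride
    last_start = len(values) - patch_len
    return [values[i:i + patch_len] for i in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    # The coincidence noted in the question: stride * patch_len.
    print(8 * 16)
    # Patch counts depend only on seq_len, patch_len and stride.
    for seq_len in (336, 512):
        print(seq_len, num_patches(seq_len, patch_len=16, stride=8))
```

With patch_len = 16 and stride = 8, a look-back window of 512 gives 64 patches under these assumptions, which matches the "64 words" in the paper title. If an error shows up for other settings, comparing the tensor shapes at the failing line against these counts may be a reasonable first check.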