thuml / Time-Series-Library

A Library for Advanced Deep Time Series Models.
MIT License

Unreasonable results on ETTm1 when TimeMixer's seq_len is set to 336 or 512 #506

Closed Gczmy closed 1 month ago

Gczmy commented 1 month ago

Experimental results:

long_term_forecast_ETTm1_512_96_TimeMixer_ETTm1_ftM_sl512_ll0_pl96_dm16_nh8_el2_dl1_df32_expand2_dc4_fc1_ebtimeF_dtTrue_Exp_0  
mse:1585206016.0, mae:23201.28515625, dtw:-999

long_term_forecast_ETTm1_512_192_TimeMixer_ETTm1_ftM_sl512_ll0_pl192_dm16_nh8_el2_dl1_df32_expand2_dc4_fc1_ebtimeF_dtTrue_Exp_0  
mse:0.3479084074497223, mae:0.38368070125579834, dtw:-999

long_term_forecast_ETTm1_512_336_TimeMixer_ETTm1_ftM_sl512_ll0_pl336_dm16_nh8_el2_dl1_df32_expand2_dc4_fc1_ebtimeF_dtTrue_Exp_0  
mse:0.3879052400588989, mae:0.4074249863624573, dtw:-999

long_term_forecast_ETTm1_512_720_TimeMixer_ETTm1_ftM_sl512_ll0_pl720_dm16_nh8_el2_dl1_df32_expand2_dc4_fc1_ebtimeF_dtTrue_Exp_0  
mse:3220697.75, mae:1166.6685791015625, dtw:-999

As shown above, the results for pred_len=96 and pred_len=720 are far too large. So far I have only observed this on ETTm1; with seq_len=96 or 192 the results are normal. In fact, the training loss already becomes very large after roughly 1000+ iterations during training. What could be causing this?

kwuking commented 1 month ago

Hi, thanks for your interest in our work. Judging from the logs, the likely cause is that the learning_rate is set too large. We suggest lowering learning_rate to 0.001 and trying again. You could also use TimeMixer's official code repository.
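For intuition on why an oversized learning rate produces the exploding training loss described above, here is a minimal sketch (not TimeMixer or repo code) of gradient descent on the quadratic loss w², where the update multiplies w by (1 − 2·lr) each step, so any lr with |1 − 2·lr| > 1 diverges while a small lr such as 0.001 shrinks the loss:

```python
# Toy illustration only: plain gradient descent on loss(w) = w^2,
# whose gradient is 2*w. The update w <- w - lr * 2*w rescales w by
# (1 - 2*lr) every step, so |1 - 2*lr| > 1 blows up geometrically.

def run_gd(lr: float, steps: int = 10, w: float = 1.0) -> float:
    for _ in range(steps):
        grad = 2.0 * w       # d/dw of w^2
        w = w - lr * grad    # vanilla SGD step
    return w

w_big = run_gd(lr=1.5)      # factor (1 - 3) = -2 per step -> |w| doubles
w_small = run_gd(lr=0.001)  # factor 0.998 per step -> slow, stable decay

print(abs(w_big), abs(w_small))  # diverged vs. still below 1 and shrinking
```

The same geometric blow-up happens (per-direction, along the sharpest curvature of the loss surface) in deep models, which is why the fix is simply to reduce learning_rate rather than change the architecture.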

wuhaixu2016 commented 1 month ago

Thanks for the reply @kwuking