thuml / Autoformer

Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008
MIT License

MAE, RMSE, and MAPE will not go down #122

Closed yinshuisiyuan123 closed 1 year ago

yinshuisiyuan123 commented 1 year ago

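When I train on the pems08 dataset, the model does not converge no matter what I try. The evaluation metrics MAE and MAPE keep bouncing up and down and never come down. Why is that? I would appreciate any guidance from the authors.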
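For reference, the three metrics named in this question are usually defined as below. This is a minimal pure-Python sketch, not the repository's metric code, and the sample values are hypothetical rather than taken from pems08. It also illustrates one possible reason MAPE looks unstable: if the metrics are computed on standardized values (an assumption the log does not confirm), targets close to zero inflate the relative error even when the absolute error is small.

```python
import math


def mae(pred, true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred, true):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def mape(pred, true):
    """Mean absolute percentage error, as a fraction (0.05 means 5%).

    Divides by the target, so it is undefined for targets equal to 0
    and becomes very large for targets close to 0.
    """
    return sum(abs((p - t) / t) for p, t in zip(pred, true)) / len(true)


# Hypothetical raw-scale values (e.g. vehicle counts).
raw_true = [200.0, 400.0]
raw_pred = [190.0, 410.0]

# Hypothetical standardized values, one of which is close to zero.
scaled_true = [0.01, 1.0]
scaled_pred = [0.11, 1.1]

print("raw    MAE/RMSE/MAPE:", mae(raw_pred, raw_true),
      rmse(raw_pred, raw_true), mape(raw_pred, raw_true))
print("scaled MAE/RMSE/MAPE:", mae(scaled_pred, scaled_true),
      rmse(scaled_pred, scaled_true), mape(scaled_pred, scaled_true))
```

In the scaled case the absolute errors are small, yet MAPE is dominated by the single target near zero, so MAE and RMSE are usually the safer numbers to compare on standardized data.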

start training : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_89>>>>>>>>>>>>>>>>>>>>>>>>>>
train 12308
val 1691
test 3476
iters: 100, epoch: 1 | loss: 0.5972405 speed: 0.0680s/iter; left time: 123.7714s
iters: 200, epoch: 1 | loss: 0.3244367 speed: 0.0665s/iter; left time: 114.3762s
iters: 300, epoch: 1 | loss: 0.2828257 speed: 0.0677s/iter; left time: 109.7648s
Epoch: 1 cost time: 25.971481323242188
Epoch: 1, Steps: 384 | Train Loss: 0.5607977 Vali Loss: 0.6211812 Test Loss: 0.5959429
Validation loss decreased (inf --> 0.621181). Saving model ...
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.2603135 speed: 0.2191s/iter; left time: 314.7798s
iters: 200, epoch: 2 | loss: 0.2419676 speed: 0.0690s/iter; left time: 92.3196s
iters: 300, epoch: 2 | loss: 0.2227196 speed: 0.0697s/iter; left time: 86.1973s
Epoch: 2 cost time: 26.521819829940796
Epoch: 2, Steps: 384 | Train Loss: 0.2397186 Vali Loss: 0.5556042 Test Loss: 0.5396060
Validation loss decreased (0.621181 --> 0.555604). Saving model ...
Updating learning rate to 5e-05
iters: 100, epoch: 3 | loss: 0.2063853 speed: 0.2190s/iter; left time: 230.6327s
iters: 200, epoch: 3 | loss: 0.1783728 speed: 0.0687s/iter; left time: 65.4935s
iters: 300, epoch: 3 | loss: 0.1871052 speed: 0.0691s/iter; left time: 58.9800s
Epoch: 3 cost time: 26.485258102416992
Epoch: 3, Steps: 384 | Train Loss: 0.1957733 Vali Loss: 0.5620660 Test Loss: 0.5204029
EarlyStopping counter: 1 out of 3
Updating learning rate to 2.5e-05
iters: 100, epoch: 4 | loss: 0.1694388 speed: 0.2165s/iter; left time: 144.8561s
iters: 200, epoch: 4 | loss: 0.1679628 speed: 0.0687s/iter; left time: 39.1142s
iters: 300, epoch: 4 | loss: 0.1728195 speed: 0.0687s/iter; left time: 32.2322s
Epoch: 4 cost time: 26.37934637069702
Epoch: 4, Steps: 384 | Train Loss: 0.1790579 Vali Loss: 0.5579699 Test Loss: 0.5247468
EarlyStopping counter: 2 out of 3
Updating learning rate to 1.25e-05
iters: 100, epoch: 5 | loss: 0.1737866 speed: 0.2157s/iter; left time: 61.4690s
iters: 200, epoch: 5 | loss: 0.1680229 speed: 0.0697s/iter; left time: 12.8908s
iters: 300, epoch: 5 | loss: 0.1618070 speed: 0.0692s/iter; left time: 5.8795s
Epoch: 5 cost time: 26.531216859817505
Epoch: 5, Steps: 384 | Train Loss: 0.1716047 Vali Loss: 0.5490247 Test Loss: 0.5201393
Validation loss decreased (0.555604 --> 0.549025). Saving model ...
Updating learning rate to 6.25e-06
testing : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_89<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 3476
**mae:0.5553721189498901, rmse;0.7215896248817444, mape:0.5648117065429688**

Use GPU: cuda:0
start training : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_90>>>>>>>>>>>>>>>>>>>>>>>>>>
train 12308
val 1691
test 3476
iters: 100, epoch: 1 | loss: 0.9161742 speed: 0.0752s/iter; left time: 136.9370s
iters: 200, epoch: 1 | loss: 0.6715516 speed: 0.0687s/iter; left time: 118.2646s
iters: 300, epoch: 1 | loss: 0.4367612 speed: 0.0670s/iter; left time: 108.5884s
Epoch: 1 cost time: 26.593833923339844
Epoch: 1, Steps: 384 | Train Loss: 0.8693575 Vali Loss: 0.8793483 Test Loss: 0.8582214
Validation loss decreased (inf --> 0.879348). Saving model ...
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.3586412 speed: 0.2091s/iter; left time: 300.4331s
iters: 200, epoch: 2 | loss: 0.2872029 speed: 0.0633s/iter; left time: 84.6365s
iters: 300, epoch: 2 | loss: 0.2458006 speed: 0.0631s/iter; left time: 78.0914s
Epoch: 2 cost time: 24.501741886138916
Epoch: 2, Steps: 384 | Train Loss: 0.2927673 Vali Loss: 0.6732510 Test Loss: 0.7210842
Validation loss decreased (0.879348 --> 0.673251). Saving model ...
Updating learning rate to 5e-05
iters: 100, epoch: 3 | loss: 0.2268723 speed: 0.2131s/iter; left time: 224.4305s
iters: 200, epoch: 3 | loss: 0.2363658 speed: 0.0726s/iter; left time: 69.2266s
iters: 300, epoch: 3 | loss: 0.2205394 speed: 0.0731s/iter; left time: 62.3169s
Epoch: 3 cost time: 27.667537450790405
Epoch: 3, Steps: 384 | Train Loss: 0.2215831 Vali Loss: 0.6421157 Test Loss: 0.6791905
Validation loss decreased (0.673251 --> 0.642116). Saving model ...
Updating learning rate to 2.5e-05
iters: 100, epoch: 4 | loss: 0.1951914 speed: 0.2226s/iter; left time: 148.9509s
iters: 200, epoch: 4 | loss: 0.1836035 speed: 0.0637s/iter; left time: 36.2716s
iters: 300, epoch: 4 | loss: 0.1968194 speed: 0.0639s/iter; left time: 29.9734s
Epoch: 4 cost time: 24.604705572128296
Epoch: 4, Steps: 384 | Train Loss: 0.1986220 Vali Loss: 0.6427053 Test Loss: 0.6906134
EarlyStopping counter: 1 out of 3
Updating learning rate to 1.25e-05
iters: 100, epoch: 5 | loss: 0.1871775 speed: 0.2140s/iter; left time: 60.9974s
iters: 200, epoch: 5 | loss: 0.1852812 speed: 0.0792s/iter; left time: 14.6518s
iters: 300, epoch: 5 | loss: 0.1841914 speed: 0.0735s/iter; left time: 6.2512s
Epoch: 5 cost time: 28.218132257461548
Epoch: 5, Steps: 384 | Train Loss: 0.1896634 Vali Loss: 0.6409395 Test Loss: 0.6854959
Validation loss decreased (0.642116 --> 0.640939). Saving model ...
Updating learning rate to 6.25e-06
testing : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_90<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 3476
**mae:0.6442457437515259, rmse;0.8279563784599304, mape:0.6413577795028687**

Use GPU: cuda:0
start training : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_91>>>>>>>>>>>>>>>>>>>>>>>>>>
train 12308
val 1691
test 3476
iters: 100, epoch: 1 | loss: 0.6700876 speed: 0.0706s/iter; left time: 128.4932s
iters: 200, epoch: 1 | loss: 0.3434948 speed: 0.0656s/iter; left time: 112.9150s
iters: 300, epoch: 1 | loss: 0.4323801 speed: 0.0640s/iter; left time: 103.8033s
Epoch: 1 cost time: 25.79011583328247
Epoch: 1, Steps: 384 | Train Loss: 0.6329591 Vali Loss: 0.7022343 Test Loss: 0.6559243
Validation loss decreased (inf --> 0.702234). Saving model ...
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.4504840 speed: 0.2260s/iter; left time: 324.8045s
iters: 200, epoch: 2 | loss: 0.3010348 speed: 0.0675s/iter; left time: 90.2433s
iters: 300, epoch: 2 | loss: 0.2352337 speed: 0.0655s/iter; left time: 81.0743s
Epoch: 2 cost time: 26.33907198905945
Epoch: 2, Steps: 384 | Train Loss: 0.2835468 Vali Loss: 0.6222830 Test Loss: 0.6060570
Validation loss decreased (0.702234 --> 0.622283). Saving model ...
Updating learning rate to 5e-05
iters: 100, epoch: 3 | loss: 0.1997769 speed: 0.2327s/iter; left time: 245.0083s
iters: 200, epoch: 3 | loss: 0.2075001 speed: 0.0653s/iter; left time: 62.2372s
iters: 300, epoch: 3 | loss: 0.2068488 speed: 0.0661s/iter; left time: 56.4175s
Epoch: 3 cost time: 25.94714069366455
Epoch: 3, Steps: 384 | Train Loss: 0.2163367 Vali Loss: 0.5951890 Test Loss: 0.5625675
Validation loss decreased (0.622283 --> 0.595189). Saving model ...
Updating learning rate to 2.5e-05
iters: 100, epoch: 4 | loss: 0.1988065 speed: 0.2053s/iter; left time: 137.3309s
iters: 200, epoch: 4 | loss: 0.2034477 speed: 0.0676s/iter; left time: 38.4469s
iters: 300, epoch: 4 | loss: 0.1941954 speed: 0.0685s/iter; left time: 32.1371s
Epoch: 4 cost time: 25.743055820465088
Epoch: 4, Steps: 384 | Train Loss: 0.1933183 Vali Loss: 0.5954826 Test Loss: 0.5615995
EarlyStopping counter: 1 out of 3
Updating learning rate to 1.25e-05
iters: 100, epoch: 5 | loss: 0.1752377 speed: 0.2232s/iter; left time: 63.6047s
iters: 200, epoch: 5 | loss: 0.1832344 speed: 0.0657s/iter; left time: 12.1494s
iters: 300, epoch: 5 | loss: 0.1852956 speed: 0.0712s/iter; left time: 6.0557s
Epoch: 5 cost time: 26.00613260269165
Epoch: 5, Steps: 384 | Train Loss: 0.1845648 Vali Loss: 0.5974851 Test Loss: 0.5625376
EarlyStopping counter: 2 out of 3
Updating learning rate to 6.25e-06
testing : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_91<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 3476
**mae:0.5793089270591736, rmse;0.7503997087478638, mape:0.5978106260299683**

Use GPU: cuda:0
start training : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_92>>>>>>>>>>>>>>>>>>>>>>>>>>
train 12308
val 1691
test 3476
iters: 100, epoch: 1 | loss: 0.6688349 speed: 0.0694s/iter; left time: 126.3362s
iters: 200, epoch: 1 | loss: 0.5156122 speed: 0.0650s/iter; left time: 111.7951s
iters: 300, epoch: 1 | loss: 0.3769550 speed: 0.0651s/iter; left time: 105.5919s
Epoch: 1 cost time: 25.33676838874817
Epoch: 1, Steps: 384 | Train Loss: 0.6410224 Vali Loss: 0.9100465 Test Loss: 0.8793117
Validation loss decreased (inf --> 0.910047). Saving model ...
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.2804989 speed: 0.2007s/iter; left time: 288.4022s
iters: 200, epoch: 2 | loss: 0.2415302 speed: 0.0632s/iter; left time: 84.5459s
iters: 300, epoch: 2 | loss: 0.2233577 speed: 0.0678s/iter; left time: 83.9226s
Epoch: 2 cost time: 24.852391958236694
Epoch: 2, Steps: 384 | Train Loss: 0.2618418 Vali Loss: 0.7589887 Test Loss: 0.7649336
Validation loss decreased (0.910047 --> 0.758989). Saving model ...
Updating learning rate to 5e-05
iters: 100, epoch: 3 | loss: 0.2005164 speed: 0.2083s/iter; left time: 219.3379s
iters: 200, epoch: 3 | loss: 0.1999879 speed: 0.0683s/iter; left time: 65.0460s
iters: 300, epoch: 3 | loss: 0.1875636 speed: 0.0636s/iter; left time: 54.2671s
Epoch: 3 cost time: 24.95496678352356
Epoch: 3, Steps: 384 | Train Loss: 0.2048777 Vali Loss: 0.6461186 Test Loss: 0.6475055
Validation loss decreased (0.758989 --> 0.646119). Saving model ...
Updating learning rate to 2.5e-05
iters: 100, epoch: 4 | loss: 0.1747726 speed: 0.2119s/iter; left time: 141.7312s
iters: 200, epoch: 4 | loss: 0.1885661 speed: 0.0686s/iter; left time: 39.0094s
iters: 300, epoch: 4 | loss: 0.1728102 speed: 0.0697s/iter; left time: 32.6743s
Epoch: 4 cost time: 25.750203132629395
Epoch: 4, Steps: 384 | Train Loss: 0.1857958 Vali Loss: 0.6452948 Test Loss: 0.6535000
Validation loss decreased (0.646119 --> 0.645295). Saving model ...
Updating learning rate to 1.25e-05
iters: 100, epoch: 5 | loss: 0.1608830 speed: 0.2100s/iter; left time: 59.8436s
iters: 200, epoch: 5 | loss: 0.1749683 speed: 0.0679s/iter; left time: 12.5563s
iters: 300, epoch: 5 | loss: 0.1786580 speed: 0.0653s/iter; left time: 5.5511s
Epoch: 5 cost time: 25.386012077331543
Epoch: 5, Steps: 384 | Train Loss: 0.1771732 Vali Loss: 0.6412984 Test Loss: 0.6445554
Validation loss decreased (0.645295 --> 0.641298). Saving model ...
Updating learning rate to 6.25e-06
testing : test_Autoformer_pems08_ftM_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc1_ebtimeF_dtTrue_test_92<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 3476
**mae:0.6300687789916992, rmse;0.8026465177536011, mape:0.649153470993042**

Use GPU: cuda:0

wuhaixu2016 commented 1 year ago

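Hello, and thanks for your interest! pems08 is a traffic flow dataset, and the data distributions observed by its different sensors differ markedly. I think you could add the Series Stationarization module proposed in Non-stationary Transformer on top of Autoformer to make the model more robust to such diverse distributions.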

Non-stationary Transformer: https://github.com/thuml/Nonstationary_Transformers
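The stationarization idea suggested above can be sketched as follows: normalize each input window per variable with its own mean and standard deviation over time, run the forecaster on the normalized window, then restore the statistics on the output. This is a minimal pure-Python illustration under stated assumptions, not code from either repository. The function names, the `eps` value, and the stand-in `repeat_last_step` forecaster are all hypothetical, and a real model would apply the same steps to batched tensors along the time dimension.

```python
import math


def stationarize(window, eps=1e-5):
    """Normalize a window shaped [time][variable] per variable.

    Returns the normalized window plus the means and standard deviations
    needed to undo the transform. `eps` avoids division by zero for
    constant series.
    """
    n_steps = len(window)
    n_vars = len(window[0])
    means = [sum(row[j] for row in window) / n_steps for j in range(n_vars)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in window) / n_steps + eps)
        for j in range(n_vars)
    ]
    normed = [[(row[j] - means[j]) / stds[j] for j in range(n_vars)] for row in window]
    return normed, means, stds


def destationarize(pred, means, stds):
    """Restore the original scale of a prediction shaped [time][variable]."""
    return [[row[j] * stds[j] + means[j] for j in range(len(means))] for row in pred]


def repeat_last_step(normed, horizon):
    """Stand-in forecaster (hypothetical): repeats the last observed step."""
    return [list(normed[-1]) for _ in range(horizon)]


# Two hypothetical sensors with very different scales.
window = [[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]]

normed, means, stds = stationarize(window)
forecast = destationarize(repeat_last_step(normed, horizon=2), means, stds)

print(normed)    # both sensors now have roughly the same scale
print(forecast)  # forecast expressed in each sensor's original units
```

After normalization both sensors present nearly identical inputs to the model despite a 100x difference in raw scale, which is the robustness to diverse distributions the comment refers to. Non-stationary Transformer also pairs this step with a De-stationary Attention mechanism, which this sketch does not cover.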