GestaltCogTeam / BasicTS

A Fair and Scalable Time Series Forecasting Benchmark and Toolkit.
https://ieeexplore.ieee.org/document/10726722/
Apache License 2.0

Add Long-Term Time Series Forecasting Datasets & Add Long-Term Time Series Forecasting Baseline. #4

Closed. zezhishao closed this issue 2 years ago.

Lip-Z commented 2 years ago

Can you add a way to handle the datasets used in the Informer and Autoformer papers, such as ETT, Weather, and Electricity? Thanks!

zezhishao commented 2 years ago

Thanks for your interest! Actually, we are adding these datasets and models. We will finish this work in a few weeks.

zezhishao commented 2 years ago

The ETT series datasets have now been added to BasicTS. Descriptions of these datasets can be found here. Other datasets (e.g., electricity, weather, solar, exchange rates, Beijing air quality) will be added very soon. In addition, related models, such as Informer, Autoformer, FEDformer, Pyraformer, and the Linear Family, have also been added to BasicTS. However, I haven't had much time recently to write detailed documentation for BasicTS, so if you have any questions, feel free to create a new issue and I'll reply as soon as possible, @Lip-Z. This also helps me write better documentation. Detailed documentation for BasicTS will be added in the coming months.

Lip-Z commented 2 years ago

I can't wait to follow your work. I'm not particularly familiar with using graph neural networks for time series or spatio-temporal forecasting, and I see that your configuration file has

CFG.DATASET_OUTPUT_LEN = 336 
CFG.TEST.EVALUATION_HORIZONS = [12, 24, 48, 96, 192, 288, 336]

Based on the experiments you have done, I'd like to know whether predicting all 336 steps at once and then reporting results for each horizon in the list differs from producing the results separately for each horizon. This evaluation practice seems a little different from the usual one for time series forecasting. I tried a dlinear_input96_output96_ETTh1 experiment, and the MAE is very different from the paper (0.40 in the paper vs. 1.75 in my experiment). I hope you can point out what is wrong in my configuration. Thank you for your excellent work!

Here are my log and config:

2022-11-11 11:22:21,212 - easytorch-training - INFO - Initializing training.
2022-11-11 11:22:21,217 - easytorch-training - INFO - Building training data loader.
2022-11-11 11:22:21,250 - easytorch-training - INFO - Set optim: Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.002
    weight_decay: 0.0001
)
2022-11-11 11:22:21,250 - easytorch-training - INFO - Set lr_scheduler: <torch.optim.lr_scheduler.MultiStepLR object at 0x0000022033902E10>
2022-11-11 11:22:21,255 - easytorch-training - INFO - Initializing validation.
2022-11-11 11:22:21,256 - easytorch-training - INFO - Building val data loader.
2022-11-11 11:22:21,273 - easytorch-training - INFO - Epoch 1 / 10
2022-11-11 11:22:26,853 - easytorch-training - INFO - Result <train>: [train_time: 5.58 (s), lr: 2.00e-03, train_MAE: 1.5263, train_RMSE: 2.5516, train_MAPE: 0.6469]
2022-11-11 11:22:26,856 - easytorch-training - INFO - Start validation.
2022-11-11 11:22:30,297 - easytorch-training - INFO - Result <val>: [val_time: 3.44 (s), val_MAE: 1.5534, val_RMSE: 2.8569, val_MAPE: 0.6939]
2022-11-11 11:22:30,301 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_best_val_MAE.pt saved
2022-11-11 11:22:33,712 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9284, Test RMSE: 3.5408, Test MAPE: 0.8597
2022-11-11 11:22:33,737 - easytorch-training - INFO - Result <test>: [test_time: 3.44 (s), test_MAE: 1.7996, test_RMSE: 3.3671, test_MAPE: 0.7934]
2022-11-11 11:22:33,742 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_01.pt saved
2022-11-11 11:22:33,742 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:25
2022-11-11 11:22:33,742 - easytorch-training - INFO - Epoch 2 / 10
2022-11-11 11:22:38,849 - easytorch-training - INFO - Result <train>: [train_time: 5.11 (s), lr: 1.00e-03, train_MAE: 1.4492, train_RMSE: 2.4562, train_MAPE: 0.6185]
2022-11-11 11:22:38,852 - easytorch-training - INFO - Start validation.
2022-11-11 11:22:42,339 - easytorch-training - INFO - Result <val>: [val_time: 3.49 (s), val_MAE: 1.4890, val_RMSE: 2.7779, val_MAPE: 0.6839]
2022-11-11 11:22:42,343 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_best_val_MAE.pt saved
2022-11-11 11:22:45,679 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9317, Test RMSE: 3.5568, Test MAPE: 0.8388
2022-11-11 11:22:45,700 - easytorch-training - INFO - Result <test>: [test_time: 3.36 (s), test_MAE: 1.7742, test_RMSE: 3.3588, test_MAPE: 0.7843]
2022-11-11 11:22:45,705 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_02.pt saved
2022-11-11 11:22:45,705 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:23
2022-11-11 11:22:45,705 - easytorch-training - INFO - Epoch 3 / 10
2022-11-11 11:22:50,796 - easytorch-training - INFO - Result <train>: [train_time: 5.09 (s), lr: 1.00e-03, train_MAE: 1.4450, train_RMSE: 2.4559, train_MAPE: 0.6186]
2022-11-11 11:22:50,798 - easytorch-training - INFO - Start validation.
2022-11-11 11:22:54,342 - easytorch-training - INFO - Result <val>: [val_time: 3.54 (s), val_MAE: 1.4898, val_RMSE: 2.7876, val_MAPE: 0.6749]
2022-11-11 11:22:57,700 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9099, Test RMSE: 3.5304, Test MAPE: 0.8491
2022-11-11 11:22:57,721 - easytorch-training - INFO - Result <test>: [test_time: 3.38 (s), test_MAE: 1.7589, test_RMSE: 3.3392, test_MAPE: 0.7790]
2022-11-11 11:22:57,727 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_03.pt saved
2022-11-11 11:22:57,727 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:22
2022-11-11 11:22:57,727 - easytorch-training - INFO - Epoch 4 / 10
2022-11-11 11:23:02,875 - easytorch-training - INFO - Result <train>: [train_time: 5.15 (s), lr: 1.00e-03, train_MAE: 1.4487, train_RMSE: 2.4617, train_MAPE: 0.6176]
2022-11-11 11:23:02,878 - easytorch-training - INFO - Start validation.
2022-11-11 11:23:06,322 - easytorch-training - INFO - Result <val>: [val_time: 3.44 (s), val_MAE: 1.4768, val_RMSE: 2.7724, val_MAPE: 0.6702]
2022-11-11 11:23:06,326 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_best_val_MAE.pt saved
2022-11-11 11:23:10,083 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9418, Test RMSE: 3.5641, Test MAPE: 0.8596
2022-11-11 11:23:10,105 - easytorch-training - INFO - Result <test>: [test_time: 3.78 (s), test_MAE: 1.7505, test_RMSE: 3.3311, test_MAPE: 0.7736]
2022-11-11 11:23:10,128 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_04.pt saved
2022-11-11 11:23:10,128 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:23
2022-11-11 11:23:10,128 - easytorch-training - INFO - Epoch 5 / 10
2022-11-11 11:23:15,816 - easytorch-training - INFO - Result <train>: [train_time: 5.69 (s), lr: 1.00e-03, train_MAE: 1.4414, train_RMSE: 2.4523, train_MAPE: 0.6155]
2022-11-11 11:23:15,818 - easytorch-training - INFO - Start validation.
2022-11-11 11:23:19,943 - easytorch-training - INFO - Result <val>: [val_time: 4.13 (s), val_MAE: 1.4783, val_RMSE: 2.7707, val_MAPE: 0.6729]
2022-11-11 11:23:23,944 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9218, Test RMSE: 3.5416, Test MAPE: 0.8399
2022-11-11 11:23:23,965 - easytorch-training - INFO - Result <test>: [test_time: 4.02 (s), test_MAE: 1.7564, test_RMSE: 3.3358, test_MAPE: 0.7735]
2022-11-11 11:23:23,970 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_05.pt saved
2022-11-11 11:23:23,971 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:26
2022-11-11 11:23:23,971 - easytorch-training - INFO - Epoch 6 / 10
2022-11-11 11:23:29,986 - easytorch-training - INFO - Result <train>: [train_time: 6.01 (s), lr: 1.00e-03, train_MAE: 1.4421, train_RMSE: 2.4564, train_MAPE: 0.6160]
2022-11-11 11:23:29,988 - easytorch-training - INFO - Start validation.
2022-11-11 11:23:34,116 - easytorch-training - INFO - Result <val>: [val_time: 4.13 (s), val_MAE: 1.5155, val_RMSE: 2.8202, val_MAPE: 0.6766]
2022-11-11 11:23:38,011 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9129, Test RMSE: 3.5231, Test MAPE: 0.8345
2022-11-11 11:23:38,034 - easytorch-training - INFO - Result <test>: [test_time: 3.92 (s), test_MAE: 1.7604, test_RMSE: 3.3354, test_MAPE: 0.7784]
2022-11-11 11:23:38,075 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_06.pt saved
2022-11-11 11:23:38,075 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:29
2022-11-11 11:23:38,075 - easytorch-training - INFO - Epoch 7 / 10
2022-11-11 11:23:43,983 - easytorch-training - INFO - Result <train>: [train_time: 5.91 (s), lr: 1.00e-03, train_MAE: 1.4384, train_RMSE: 2.4471, train_MAPE: 0.6127]
2022-11-11 11:23:43,986 - easytorch-training - INFO - Start validation.
2022-11-11 11:23:48,280 - easytorch-training - INFO - Result <val>: [val_time: 4.29 (s), val_MAE: 1.5141, val_RMSE: 2.8147, val_MAPE: 0.6796]
2022-11-11 11:23:51,767 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9015, Test RMSE: 3.5365, Test MAPE: 0.8478
2022-11-11 11:23:51,789 - easytorch-training - INFO - Result <test>: [test_time: 3.51 (s), test_MAE: 1.7676, test_RMSE: 3.3508, test_MAPE: 0.7836]
2022-11-11 11:23:51,809 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_07.pt saved
2022-11-11 11:23:51,809 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:30
2022-11-11 11:23:51,809 - easytorch-training - INFO - Epoch 8 / 10
2022-11-11 11:23:57,391 - easytorch-training - INFO - Result <train>: [train_time: 5.58 (s), lr: 1.00e-03, train_MAE: 1.4416, train_RMSE: 2.4542, train_MAPE: 0.6146]
2022-11-11 11:23:57,393 - easytorch-training - INFO - Start validation.
2022-11-11 11:24:01,104 - easytorch-training - INFO - Result <val>: [val_time: 3.71 (s), val_MAE: 1.4985, val_RMSE: 2.8021, val_MAPE: 0.6770]
2022-11-11 11:24:04,663 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.9120, Test RMSE: 3.5547, Test MAPE: 0.8586
2022-11-11 11:24:04,684 - easytorch-training - INFO - Result <test>: [test_time: 3.58 (s), test_MAE: 1.7571, test_RMSE: 3.3401, test_MAPE: 0.7824]
2022-11-11 11:24:04,741 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_08.pt saved
2022-11-11 11:24:04,741 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:30
2022-11-11 11:24:04,742 - easytorch-training - INFO - Epoch 9 / 10
2022-11-11 11:24:10,076 - easytorch-training - INFO - Result <train>: [train_time: 5.33 (s), lr: 1.00e-03, train_MAE: 1.4386, train_RMSE: 2.4504, train_MAPE: 0.6133]
2022-11-11 11:24:10,078 - easytorch-training - INFO - Start validation.
2022-11-11 11:24:13,581 - easytorch-training - INFO - Result <val>: [val_time: 3.50 (s), val_MAE: 1.5077, val_RMSE: 2.8126, val_MAPE: 0.6753]
2022-11-11 11:24:17,062 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 2.0243, Test RMSE: 3.6198, Test MAPE: 0.8651
2022-11-11 11:24:17,083 - easytorch-training - INFO - Result <test>: [test_time: 3.50 (s), test_MAE: 1.7610, test_RMSE: 3.3395, test_MAPE: 0.7806]
2022-11-11 11:24:17,088 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_09.pt saved
2022-11-11 11:24:17,089 - easytorch-training - INFO - The estimated training finish time is 2022-11-11 11:24:29
2022-11-11 11:24:17,089 - easytorch-training - INFO - Epoch 10 / 10
2022-11-11 11:24:22,489 - easytorch-training - INFO - Result <train>: [train_time: 5.40 (s), lr: 1.00e-03, train_MAE: 1.4390, train_RMSE: 2.4491, train_MAPE: 0.6141]
2022-11-11 11:24:22,492 - easytorch-training - INFO - Start validation.
2022-11-11 11:24:25,978 - easytorch-training - INFO - Result <val>: [val_time: 3.49 (s), val_MAE: 1.4886, val_RMSE: 2.7880, val_MAPE: 0.6721]
2022-11-11 11:24:29,614 - easytorch-training - INFO - Evaluate best model on test data for horizon 96, Test MAE: 1.8979, Test RMSE: 3.5235, Test MAPE: 0.8357
2022-11-11 11:24:29,635 - easytorch-training - INFO - Result <test>: [test_time: 3.65 (s), test_MAE: 1.7505, test_RMSE: 3.3326, test_MAPE: 0.7759]
2022-11-11 11:24:29,640 - easytorch-training - INFO - Checkpoint checkpoints\DLinear_10\a01b2ed7f7d463e197a393f02418c7e4\DLinear_10.pt saved
2022-11-11 11:24:29,645 - easytorch-training - INFO - The training finished at 2022-11-11 11:24:29
DESCRIPTION: Linear model configuration
RUNNER: <class 'basicts.runners.runner_zoo.simple_tsf_runner.SimpleTimeSeriesForecastingRunner'>
DATASET_CLS: basicts.data.dataset.TimeSeriesForecastingDataset
DATASET_NAME: ETTh1
DATASET_TYPE: Electricity Transformer Temperature
DATASET_INPUT_LEN: 96
DATASET_OUTPUT_LEN: 96
GPU_NUM: 1
ENV:
  SEED: 1
  CUDNN:
    ENABLED: True
MODEL:
  NAME: DLinear
  ARCH: <class 'basicts.archs.arch_zoo.linear_arch.dlinear.DLinear'>
  PARAM:
    seq_len: 96
    pred_len: 96
    individual: False
    enc_in: 7
  FORWARD_FEATURES: [0]
  TARGET_FEATURES: [0]
TRAIN:
  LOSS: masked_mae
  OPTIM:
    TYPE: Adam
    PARAM:
      lr: 0.002
      weight_decay: 0.0001
  LR_SCHEDULER:
    TYPE: MultiStepLR
    PARAM:
      milestones: [1, 50, 80]
      gamma: 0.5
  NUM_EPOCHS: 10
  CKPT_SAVE_DIR: checkpoints\DLinear_10
  DATA:
    DIR: datasets/ETTh1
    BATCH_SIZE: 32
    PREFETCH: False
    SHUFFLE: True
    NUM_WORKERS: 2
    PIN_MEMORY: False
VAL:
  INTERVAL: 1
  DATA:
    DIR: datasets/ETTh1
    BATCH_SIZE: 64
    PREFETCH: False
    SHUFFLE: False
    NUM_WORKERS: 2
    PIN_MEMORY: False
TEST:
  EVALUATION_HORIZONS: [96]
  INTERVAL: 1
  DATA:
    DIR: datasets/ETTh1
    BATCH_SIZE: 64
    PREFETCH: False
    SHUFFLE: False
    NUM_WORKERS: 2
    PIN_MEMORY: False
MD5: a01b2ed7f7d463e197a393f02418c7e4

zezhishao commented 2 years ago

Thanks for your comment, let me answer bit by bit.

First, let me introduce the default experimental settings. Given the most recent CFG.DATASET_INPUT_LEN = 336 time steps of history, the model (any model) predicts the next CFG.DATASET_OUTPUT_LEN = 336 time steps. BasicTS then evaluates the model at each horizon X, where X is taken from CFG.TEST.EVALUATION_HORIZONS = [12, 24, 48, 96, 192, 288, 336]. After evaluating each horizon, BasicTS also reports the overall performance across all horizons. This is indeed a little different from the experimental settings of Informer and Autoformer.
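To make the per-horizon versus overall distinction concrete, here is a minimal sketch in plain Python. It assumes that "performance at horizon X" means the error at the X-th predicted step, and that the overall score averages over every predicted step. The function names `mean_abs_error` and `evaluate_horizons` are hypothetical and are not part of the BasicTS API.

```python
from typing import Dict, List, Sequence, Tuple


def mean_abs_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Plain MAE over two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


def evaluate_horizons(
    preds: List[List[float]],
    targets: List[List[float]],
    horizons: Sequence[int],
) -> Tuple[Dict[int, float], float]:
    """Return (MAE at each requested horizon, MAE over all steps).

    preds/targets hold one list of output_len values per test sample.
    Horizons are 1-based, so horizon h reads index h - 1 (an assumption).
    """
    per_horizon = {}
    for h in horizons:
        step_pred = [sample[h - 1] for sample in preds]
        step_true = [sample[h - 1] for sample in targets]
        per_horizon[h] = mean_abs_error(step_pred, step_true)

    # Overall score: flatten every sample and every step into one average.
    flat_pred = [v for sample in preds for v in sample]
    flat_true = [v for sample in targets for v in sample]
    return per_horizon, mean_abs_error(flat_pred, flat_true)


# Toy data: two samples, four predicted steps, error growing with the step.
toy_targets = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
toy_preds = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
per_horizon, overall = evaluate_horizons(toy_preds, toy_targets, [2, 4])
print(per_horizon, overall)
```

In this toy case the error grows with the step index, so the score at the last horizon is higher than the overall score. The log above shows the same pattern: the horizon-96 test MAE is higher than the overall test MAE.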

Second, the difference in results comes from two sources. On the one hand, Informer and Autoformer use only the first 20 months of the ETT dataset, while BasicTS uses the full ETT data. You can find this in the dataset's description readme file. On the other hand, models such as Informer and Autoformer calculate the error on the normalized data, while BasicTS de-normalizes the predictions back to the original data space and then calculates the error. The results obtained in this way are more intuitive and meaningful. Additionally, to ensure that the BasicTS implementation is correct, I have also tested the experimental settings used in Informer and Autoformer (i.e., using only the first 20 months of data and calculating the error on the normalized data), and the experimental results are consistent with the related papers.
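The effect of both factors can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the BasicTS or Informer code: it assumes z-score normalization fitted on the training data of a single channel, and it assumes a "month" is counted as 30 days of hourly records, which is the convention I recall from the Informer data loader. All function names here are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month of hourly samples


def first_months(series: Sequence[float], months: int = 20) -> Sequence[float]:
    """Keep only the first `months` months of an hourly series."""
    return series[: months * HOURS_PER_MONTH]


def fit_zscore(train: Sequence[float]) -> Tuple[float, float]:
    """Mean and population standard deviation of the training data."""
    return mean(train), pstdev(train)


def normalize(values: Sequence[float], mu: float, sigma: float) -> List[float]:
    return [(v - mu) / sigma for v in values]


def mae(pred: Sequence[float], true: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


# Toy single-channel example with made-up numbers.
train = [10.0, 20.0, 30.0, 40.0]
mu, sigma = fit_zscore(train)

targets_raw = [20.0, 40.0]
preds_raw = [25.0, 30.0]

mae_raw = mae(preds_raw, targets_raw)  # error in the original data space
mae_norm = mae(normalize(preds_raw, mu, sigma),
               normalize(targets_raw, mu, sigma))  # error in normalized space

# For one channel the two errors differ exactly by the factor sigma.
print(mae_raw, mae_norm, mae_norm * sigma)
```

For a single channel, the de-normalized MAE is the normalized MAE multiplied by the training standard deviation, so the two reporting conventions give numbers on different scales even for identical predictions. The data subset is a separate effect: training and testing on different time ranges changes the split itself.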

Lip-Z commented 2 years ago

Thanks, it's very helpful!

zezhishao commented 2 years ago

Hi @Lip-Z, the datasets you asked about (ETT, Electricity, and Weather) have been added to BasicTS. Descriptions of these datasets can be found here. Kindly note that different papers, such as Autoformer and Informer, use different versions of the Weather dataset. I chose the Autoformer version.

zezhishao commented 2 years ago

The long-term time series forecasting datasets ETT, Electricity, Weather, and ExchangeRate have now been added to BasicTS. The long-term time series forecasting models Informer, Autoformer, FEDformer, Pyraformer, and the Linear Family have also been added to BasicTS.