Update the evaluation criterion and experimental results

chaoshangcs / GTS

Discrete Graph Structure Learning for Forecasting Multiple Time Series, ICLR 2021.

Apache License 2.0

Update the evaluation criterion and experimental results #5

Closed LMissher closed 3 years ago

LMissher commented 3 years ago

Lines 174 to 182 of supervisor.py show that the evaluation criterion is the average over the predicted time steps, i.e. y_pred[:3], y_true[:3].

Previous works used the value of the last time slice of the prediction sequence, i.e. y_pred[2:3], y_true[2:3], as the evaluation criterion.

I hope that the authors can make a fair comparison and change the results in the paper to the value of the last time slice instead of the average value.
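To make the difference between the two criteria concrete, here is a minimal sketch in plain Python. The `mae` helper and the toy numbers are made up for illustration and are not taken from supervisor.py; only the slices `[:3]` and `[2:3]` come from the discussion above. Rows are forecast steps and columns are sensors.

```python
def mae(pred, true):
    """Mean absolute error over all entries of two equally shaped nested lists."""
    errors = [
        abs(p - t)
        for p_row, t_row in zip(pred, true)
        for p, t in zip(p_row, t_row)
    ]
    return sum(errors) / len(errors)


# Toy data (hypothetical): 3 forecast steps x 2 sensors.
# The absolute error grows with the step: 1, 2, then 4.
y_true = [[60.0, 50.0], [60.0, 50.0], [60.0, 50.0]]
y_pred = [[59.0, 49.0], [58.0, 48.0], [56.0, 46.0]]

# Averaged criterion: mixes the easy early steps with the hard last step.
avg_mae = mae(y_pred[:3], y_true[:3])

# Last-slice criterion: only the third step is scored.
last_mae = mae(y_pred[2:3], y_true[2:3])

print(f"average over steps 1-3: {avg_mae:.3f}")
print(f"step 3 only:            {last_mae:.3f}")
```

Because the forecast error usually grows with the horizon, the averaged number tends to be lower than the last-slice number, which is why the two cannot be compared directly.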

chaoshangcs commented 3 years ago

Thanks for your great suggestion. We will check this problem and update the experiments and code.

chaoshangcs commented 3 years ago

We double-checked the https://github.com/chnsh/DCRNN_PyTorch code and slightly modified this function. Now the code has the same implementation of the evaluation function as DCRNN_PyTorch. Thanks for your great question. If you have any suggestions, please let me know. : )

LMissher commented 3 years ago

For details, please refer to the following link: https://github.com/chnsh/DCRNN_PyTorch/issues/3#issuecomment-655412195

LMissher commented 3 years ago

See, for example, line 115 in https://github.com/zhengchuanpan/GMAN/blob/master/METR/test.py and lines 77 and 78 in https://github.com/nnzhan/Graph-WaveNet/blob/master/test.py

chaoshangcs commented 3 years ago

Thanks for sharing. It is good to know about this issue in DCRNN_PyTorch and the related works. It seems that the PyTorch code calculates the average values, while the TensorFlow one takes the value of the last time step.

LMissher commented 3 years ago

The loss calculation in the training phase uses the average values, but the final test results and the paper results should be reported at separate time steps. Looking forward to the revision of your results.
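A rough sketch of that split, assuming 5-minute sampling so that the 15, 30 and 60 minute horizons correspond to forecast steps 3, 6 and 12. The metric helpers and the toy series are hypothetical, and the masking of missing readings that traffic benchmarks typically apply is left out for brevity.

```python
import math


def mae(pred, true):
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def mape(pred, true):
    # Assumes no zero targets; real pipelines usually mask those out.
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)


# Toy series (hypothetical): 12 forecast steps for a single sensor,
# with an error that grows by 0.5 per step.
y_true = [60.0] * 12
y_pred = [60.0 - 0.5 * (k + 1) for k in range(12)]

# Training: one loss value averaged over every forecast step.
train_loss = mae(y_pred, y_true)

# Testing: each horizon is scored on its own time slice.
report = {}
for minutes, step in [(15, 3), (30, 6), (60, 12)]:
    p, t = y_pred[step - 1:step], y_true[step - 1:step]
    report[minutes] = (mae(p, t), rmse(p, t), mape(p, t))

print(f"training loss (all steps): {train_loss:.2f}")
for minutes, (m, r, pct) in report.items():
    print(f"{minutes} min: MAE {m:.2f}  RMSE {r:.2f}  MAPE {pct:.1%}")
```

With a single sensor and a single slice, MAE and RMSE coincide; on real data with many sensors and samples they differ.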

chaoshangcs commented 3 years ago

Sure. We will update the results. Thanks for your help and time.

chaoshangcs commented 3 years ago

Hi. I have quickly rerun our experiments using the new metrics, without hyperparameter tuning. Here are the results for your reference. GTS still outperforms DCRNN. We will update them in our paper soon.

METR-LA:
15 min: MAE 2.64, RMSE 4.95, MAPE 6.8%
30 min: MAE 3.01, RMSE 5.85, MAPE 8.2%
60 min: MAE 3.41, RMSE 6.74, MAPE 9.9%

PEMS-BAY:
15 min: MAE 1.32, RMSE 2.62, MAPE 2.8%
30 min: MAE 1.64, RMSE 3.41, MAPE 3.6%
60 min: MAE 1.91, RMSE 3.97, MAPE 4.4%

LMissher commented 3 years ago

Thank you for providing the new results so quickly. I’m glad to discuss this with you.

chaoshangcs commented 3 years ago

Thanks! The new results have been added to our paper.