Yelrose closed this issue 2 years ago
Hi, we evaluate on the scaled values for two reasons: (1) Previous papers, such as the baselines in Informer, use this protocol, and we adopt it for a fair comparison. (2) Evaluating on scaled values balances the different dimensions, so the metric is not dominated by a single dimension.
Of course, you can project the time series back to the original space. If you do, you could use sMAPE or MASE as the metric.
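To illustrate point (2), here is a small sketch on synthetic data (not from the repository). It assumes two dimensions whose magnitudes differ by a factor of 1000 and a forecast with the same relative error on both. For brevity the mean and standard deviation are computed on the evaluation data itself, which is a simplification; all names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: dimension 1 is ~1000x larger in magnitude than dimension 0.
scale = np.array([1.0, 1000.0])
y_true = rng.normal(size=(500, 2)) * scale
# Forecast with the same *relative* error (10%) on both dimensions.
y_pred = y_true + 0.1 * rng.normal(size=(500, 2)) * scale


def per_dim_mse(a, b):
    """Mean squared error computed separately for each dimension."""
    return np.mean((a - b) ** 2, axis=0)


# MSE in the original space.
raw = per_dim_mse(y_true, y_pred)

# MSE after standardizing each dimension (zero mean, unit variance).
mean, std = y_true.mean(axis=0), y_true.std(axis=0)
scaled = per_dim_mse((y_true - mean) / std, (y_pred - mean) / std)

# Fraction of the total error contributed by each dimension.
raw_share = raw / raw.sum()
scaled_share = scaled / scaled.sum()

print("share of raw MSE per dimension:   ", raw_share)
print("share of scaled MSE per dimension:", scaled_share)
```

In the original space almost all of the error comes from the large-magnitude dimension, while after scaling the two dimensions contribute roughly equally.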
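A minimal sketch of what that could look like, using plain NumPy instead of the repository's scaler. The helper names (`inverse_scale`, `smape`, `mase`), the toy arrays, and the `season=1` naive baseline are assumptions for illustration, not code from the repository.

```python
import numpy as np


def inverse_scale(x_scaled, mean, std):
    """Undo standardization: map scaled values back to the original space."""
    return x_scaled * std + mean


def smape(y_true, y_pred, eps=1e-8):
    """Symmetric MAPE, averaged over all time steps and dimensions."""
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0 + eps
    return float(np.mean(np.abs(y_true - y_pred) / denom))


def mase(y_true, y_pred, y_train, season=1):
    """MAE of the forecast divided by the in-sample MAE of a naive
    (seasonal) forecast, computed per dimension and then averaged."""
    naive_mae = np.mean(np.abs(y_train[season:] - y_train[:-season]), axis=0)
    model_mae = np.mean(np.abs(y_true - y_pred), axis=0)
    return float(np.mean(model_mae / naive_mae))


# Toy training series with two dimensions (hypothetical values).
train = np.array([[10.0, 100.0], [12.0, 110.0], [14.0, 120.0], [16.0, 130.0]])
mean, std = train.mean(axis=0), train.std(axis=0)  # statistics from the training split

# Ground truth and a forecast, both expressed in the original space.
true = np.array([[18.0, 140.0], [20.0, 150.0]])
pred_orig = true + np.array([2.0, 10.0])

# Pretend the model produced scaled outputs, then project them back.
pred_scaled = (pred_orig - mean) / std
recovered = inverse_scale(pred_scaled, mean, std)

print("sMAPE:", smape(true, recovered))
print("MASE: ", mase(true, recovered, train))
```

Both metrics are scale-free, so dimensions with very different magnitudes can still be averaged after projecting back to the original space.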
I have found that in all the experiments, the original dataset is scaled by the StandardScaler, and all the implemented models are optimized to predict the scaled ground truth. You have also written the inverse_transformation function, but it is used nowhere.
https://github.com/thuml/Autoformer/blob/1c3ffb2d82115066674a2e8f4eb16c904917cf2d/data_provider/data_loader.py#L98
So I am slightly confused as to why all the results are evaluated on scaled values. Is this the convention in time series forecasting?