kwuking / TimeMixer

[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
https://openreview.net/pdf?id=7oLshfEIC2
Apache License 2.0
1.14k stars, 152 forks

More about the question about the forecastability #35

Closed by mb-Ma 1 month ago

mb-Ma commented 2 months ago

I appreciate your quick reply. :-)

Q: The forecastability results seem counterintuitive. Table 1 shows that the Weather and Electricity datasets have high forecastability; in other words, they should be easier to predict than datasets with low forecastability, e.g., ETT (4 datasets). However, the ETT and Traffic datasets show obvious periodic patterns in the visualizations and the ACF. In particular, it is harder to find a pattern to fit, as shown in Fig. 12 of the paper. Could you provide some insight into this?

A: Thank you very much for your attention to our work. Forecastability is a key issue in time series analysis. Currently, our forecastability calculations are performed through global sampling on the time series. The ETT dataset is actually recognized as one with a higher degree of forecasting difficulty. The visual showcases display only a portion of the data and do not represent the complete time series. By comparison, in larger datasets such as Electricity and Traffic, the patterns of cycles and trends are more evident, resulting in stronger predictability.

Moreover, what about the Weather dataset? Whether we look at the complete series or only part of it, there are no obvious cycles or trends. Why is it easy to predict?
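
For readers following along, the kind of ACF check mentioned in the question can be sketched as below. This is only an illustration on synthetic data: the helper `acf`, the hourly sampling, and the 24-step period are assumptions made for the example and are not taken from the paper or the repository.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean before correlating
    denom = np.dot(x, x)                  # lag-0 autocovariance (unnormalized)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
t = np.arange(24 * 60)                    # assumed: 60 days of hourly samples

# Synthetic stand-ins, NOT the real ETT/Traffic/Weather series.
periodic = np.sin(2 * np.pi * t / 24) + 0.5 * rng.standard_normal(t.size)
noise = rng.standard_normal(t.size)

acf_periodic = acf(periodic, 48)
acf_noise = acf(noise, 48)

# A daily cycle shows up as a strong peak at lag 24 and a trough at lag 12;
# white noise stays close to zero at every non-zero lag.
print("periodic, lag 24:", round(acf_periodic[24], 3))
print("noise,    lag 24:", round(acf_noise[24], 3))
```

A clear ACF peak at the seasonal lag is what "obvious periodic patterns" usually refers to, but it describes only the window being inspected, which is relevant when comparing it with a score computed over the whole series.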

kwuking commented 2 months ago

This is indeed a very pertinent question. Judging from the visualizations alone, the weather data does look challenging to handle. Although the forecastability score of the weather dataset is high, this may be attributed to the specific method of forecastability analysis employed. As detailed in our paper, the current forecastability analysis is conducted through fitting analysis using Fourier series. It is noteworthy that "Weather is recorded every 10 minutes throughout the year 2020, encompassing 21 meteorological indicators, such as air temperature, humidity, etc." This implies that, as a meteorological dataset, weather data exhibits pronounced seasonality, which could explain why the forecastability measurement method based on Fourier analysis yields favorable results.

Furthermore, in comparison to datasets such as ETT, the weather dataset contains fewer missing and outlier values. The ETT dataset, which comprises data on electric transformer temperatures, includes a substantial amount of missing and anomalous data due to collection and sensor issues. Consequently, the weather dataset is cleaner in this regard, which makes it score more favorably in the forecastability analysis than the ETT dataset.
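
To make the point about Fourier-based scores concrete, here is a minimal sketch of one common spectral definition of forecastability: one minus the normalized entropy of the periodogram. Whether this matches the exact computation used for Table 1 is an assumption; the function name `spectral_forecastability` and the synthetic series are hypothetical and serve only to illustrate why a strongly seasonal signal scores high under such a measure.

```python
import numpy as np

def spectral_forecastability(x):
    """1 - normalized entropy of the periodogram (higher = more forecastable).

    Assumed definition for illustration; not confirmed as the paper's code.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean (DC component)
    power = np.abs(np.fft.rfft(x)) ** 2   # unnormalized periodogram
    power = power[1:]                     # drop the zero-frequency bin
    p = power / power.sum()               # treat the spectrum as a distribution
    nz = p[p > 0]                         # avoid log(0)
    entropy = -np.sum(nz * np.log(nz))
    return 1.0 - entropy / np.log(len(p))

rng = np.random.default_rng(0)
t = np.arange(144 * 28)                   # 28 days at 10-minute sampling

# Synthetic stand-ins, NOT the real Weather or ETT data.
seasonal = np.sin(2 * np.pi * t / 144) + 0.3 * rng.standard_normal(t.size)
noise = rng.standard_normal(t.size)

score_seasonal = spectral_forecastability(seasonal)
score_noise = spectral_forecastability(noise)

print("seasonal series:", round(score_seasonal, 3))
print("white noise:    ", round(score_noise, 3))
```

Under this definition, a series whose energy is concentrated in a few frequencies scores close to 1 even if short excerpts look irregular, while broadband noise, including the spectral spread that outliers and gaps tend to add, pushes the score toward 0.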

Moreover, we should consider more rational methods of forecastability analysis for time series datasets. In other words, what constitutes an effective forecastability analysis of time series data? Although classical theories have provided numerous explanations, they all encounter limitations when faced with continuously growing time series data. We are actively exploring this issue and welcome any innovative ideas or further communication on this topic.

Once again, I would like to express my gratitude for the constructive question you raised. Deep contemplation surpasses mere practice; a well-posed question can significantly advance research development. Since the publication of Informer and Autoformer in 2021 and 2022, the time series community has experienced unprecedented growth. Data is a reflection of real-world problems, and excellent research necessitates starting from data to uncover the underlying essence. Current research on time series data lags behind that on image and text data, and it requires our collective effort to foster the prosperity and advancement of the community.

mb-Ma commented 1 month ago

Many thanks for your thorough analysis. The closing words, in particular, mean a lot to researchers in the time series community, myself included.