facebookresearch / Kats

Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.
MIT License
4.88k stars, 534 forks

GMModel Tutorial Bug #256

Closed: ericho-bbai closed this issue 1 year ago

ericho-bbai commented 2 years ago

In section 2.2 of the Global Model tutorial (https://github.com/facebookresearch/Kats/blob/main/tutorials/kats_205_globalmodel.ipynb), test_TSs is generated the same way as train_TSs, with the same start time. Is this a typo? Wouldn't there be data leakage, since test_TSs is essentially a subset of train_TSs?

Also, why does GMModel take a list of TimeSeriesData? If we have only one time series, are we supposed to create a list of TimeSeriesData via the expanding window method?

yangbk560 commented 1 year ago

Thanks for the questions!

1) When training a GM, say we feed in a TS with end date '2022-02-01'. Based on the model settings, it will only feed the data until '2022-01-01' into the NN and use the data between '2022-01-02' and '2022-02-01' to compute the loss. In the prediction stage, it will take the data until '2022-02-01' to forecast the dates after '2022-02-01'. In other words, the GM never sees information after '2022-02-01', so there is no concern about data leakage. In real use cases (and in our evaluation), test_TSs can be unseen time series (i.e., series that do not appear in train_TSs).

2) It should be able to take a TimeSeriesData object directly. :)
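To make the windowing in 1) concrete, here is a minimal sketch in plain Python. It does not use the Kats API: `split_for_training` and the 31-day `horizon_days` are hypothetical, chosen only so the dates match the example above. The actual split inside GMModel depends on the model settings.

```python
from datetime import date, timedelta


def split_for_training(dates, values, horizon_days):
    """Illustrative only: split one series into an input window and a loss window.

    Points up to (last date - horizon_days) play the role of the NN input;
    the remaining points are the targets used to compute the training loss.
    """
    cutoff = dates[-1] - timedelta(days=horizon_days)
    inputs = [(d, v) for d, v in zip(dates, values) if d <= cutoff]
    targets = [(d, v) for d, v in zip(dates, values) if d > cutoff]
    return inputs, targets


# A made-up daily series ending on 2022-02-01.
start, end = date(2021, 11, 1), date(2022, 2, 1)
dates = [start + timedelta(days=i) for i in range((end - start).days + 1)]
values = [float(i) for i in range(len(dates))]

# Assumed 31-day horizon, so the input window ends on 2022-01-01.
inputs, targets = split_for_training(dates, values, horizon_days=31)

print("last input date:", inputs[-1][0])
print("loss window:", targets[0][0], "to", targets[-1][0])
```

At prediction time the whole series through 2022-02-01 would be the input, and the forecast covers only later dates, so nothing after the series end date is ever visible to the model.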