tidyverts / feasts

Feature Extraction And Statistics for Time Series
https://feasts.tidyverts.org/

[Question] Retrieve STL used window values #154

Open david-vicente opened 1 year ago

david-vicente commented 1 year ago

For STL, is there a way to retrieve the window parameter values that were actually used after fitting the model?

If we use the str function,

my_tsibble %>% model(STL(column ~ trend())) %>% str()

we can find all the "learned" periods

.. .. .. ..$ season_week:List of 2
.. .. .. .. ..$ period: num 168
.. .. .. .. ..$ base  : num 0
.. .. .. ..$ season_day :List of 2
.. .. .. .. ..$ period: num 24
.. .. .. .. ..$ base  : num 0

but I can't find the other parameters such as the window size.

mitchelloharawild commented 1 year ago

It's stored in the model object, but this isn't intended for users. If needed, I can expose it in the user-accessible methods for the model. Are you trying to obtain the defaults for these windows?

david-vicente commented 1 year ago

"Are you trying to obtain the defaults for these windows?"

Yeah. In the example above I haven't set a window for trend(), and according to the documentation its default depends on the period and window size of the seasons, so I'm not sure which window was actually chosen.

The documentation says

nextodd(ceiling((1.5*period) / (1-(1.5/s.window))))

but because this is an implementation of MSTL we have more than one season, each with its respective window and period. I would like to be able to retrieve the defaults picked in this case.
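To make the documented formula concrete, here is a minimal Python sketch that evaluates it for a single (period, s.window) pair. The function names are hypothetical, and `next_odd` is assumed to mean "the smallest odd integer that is not below the input", which the quoted documentation does not spell out. The sketch does not answer which period and seasonal window feasts plugs in when there are several seasons; that is exactly the open question here.

```python
import math


def next_odd(n: int) -> int:
    """Assumed meaning of nextodd(): return n if it is odd, otherwise n + 1."""
    return n if n % 2 == 1 else n + 1


def default_trend_window(period: float, s_window: float) -> int:
    """Evaluate nextodd(ceiling((1.5 * period) / (1 - (1.5 / s.window)))).

    s_window must be greater than 1.5, otherwise the denominator is
    zero or negative and the formula is not meaningful.
    """
    return next_odd(math.ceil((1.5 * period) / (1 - (1.5 / s_window))))


# Illustrative inputs only; these are not values read back from a fitted model.
print(default_trend_window(24, 11))  # 43
print(default_trend_window(12, 7))   # 23
```

With several seasonal components, the result depends on which season's period and window are passed in, so the same function can be used to compare candidate combinations against whatever the model reports once that is exposed.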

Another situation is when we set the window for trend (and the model extracts 2 seasonal components):

my_tsibble %>% model(STL(column ~ trend(window=11)))

here I'm assuming that the windows used are

trend_window = 11
season1_window = 11
season2_window = 15

since the documentation states

The default (NULL) will choose an appropriate default, for a dataset with one seasonal pattern this would be 11, the second larger seasonal window would be 15, then 19, 23, ... onwards.

Am I correct?
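Read literally, the quoted sequence (11, 15, 19, 23, ...) is an arithmetic progression starting at 11 with step 4. The sketch below encodes that reading in Python; the function name is hypothetical, and it is an assumption that the windows are assigned in order of increasing seasonal period and that the step stays at 4. It is not taken from the feasts source.

```python
def default_seasonal_windows(n_seasons: int) -> list[int]:
    """Assumed defaults 11, 15, 19, 23, ... for seasons ordered by increasing period."""
    return [11 + 4 * i for i in range(n_seasons)]


print(default_seasonal_windows(2))  # [11, 15]
print(default_seasonal_windows(4))  # [11, 15, 19, 23]
```

Under this reading, a model with two seasonal components would use 11 for the shorter season and 15 for the longer one, which matches the guess above but still needs confirmation from the fitted object.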

Nonetheless, I think exposing this to the user would be beneficial. For example, one could fit a decomposition model in R, retrieve the window values, and use them in another implementation for comparison, let's say in Python.