tidyverts / feasts

Feature Extraction And Statistics for Time Series
https://feasts.tidyverts.org/

generate.STL #122

Closed robjhyndman closed 3 years ago

robjhyndman commented 3 years ago

generate.STL() could be a nice way to include the functionality of forecast::bld.mbb.bootstrap(). It would differ from the other generate.xxx methods in fable in that the time index would be the same as that of the training data, rather than extending beyond it.

As a side issue, it seems we have lost the ability to generate from other models with future = FALSE.

robjhyndman commented 3 years ago

I can have a go at writing generate.STL() if you agree in principle to this idea.

mitchelloharawild commented 3 years ago

I agree with this in principle as a user-customisable generate() method. Possibly generate(<STL>, new_data, method = decomposition_bootstrap). Would it also be possible (and make sense) to allow generate(<dable>, method = decomposition_bootstrap)? As far as I can tell, it should be very easy to add, and I can do it tomorrow if you like.

future = FALSE should be possible by specifying the time values you'd like to generate, via generate(<mable>, new_data = <tsibble>). In theory, this should allow you to request sampled values/paths for specific time points. However, individual model support for this is lacking: it requires storing more history of the fit, which would also help speed up hfitted() by avoiding refitting.

robjhyndman commented 3 years ago

Sounds good. Yes, I think the default new_data should be the same period as the training data for the STL model.

mitchelloharawild commented 3 years ago

I've added a simple generate() method which generates new data by block bootstrapping the STL decomposition's remainder. I think the default new_data for generate() should be consistent (and handled by fabletools), which currently defaults to the default forecast horizon. As the STL() decomposition lacks an out-of-sample prediction method, this should raise an error or a warning with generated NA_real_ values.
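As a rough illustration of the idea described above (not the feasts implementation), the following Python sketch rebuilds a series from a decomposition by keeping the trend and seasonal components fixed and block bootstrapping the remainder. The function names, the `block_size` and `times` parameters, and the additive `trend + seasonal + remainder` form are all assumptions made for this example.

```python
import random


def resample_blocks(values, block_size, rng):
    """Concatenate randomly chosen contiguous blocks, trimmed to len(values)."""
    n = len(values)
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(values)")
    out = []
    while len(out) < n:
        # Pick a random start so the whole block fits inside the series.
        start = rng.randrange(n - block_size + 1)
        out.extend(values[start:start + block_size])
    return out[:n]


def generate_from_decomposition(trend, seasonal, remainder,
                                block_size, times=1, seed=None):
    """Hypothetical helper: simulate `times` series on the training index.

    Assumes an additive decomposition, so each simulated value is
    trend + seasonal + a block-bootstrapped remainder.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(times):
        boot = resample_blocks(remainder, block_size, rng)
        paths.append([t + s + r for t, s, r in zip(trend, seasonal, boot)])
    return paths
```

Each simulated path has the same length as the training data. With `block_size = 1` the resampler reduces to ordinary resampling of individual remainder values.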

I also think that block bootstrapping the residuals could be used by all generate() methods, what do you think of that?

Further, do bagged models (https://github.com/tidyverts/fabletools/issues/217) only need a method for simulating data like the original, or do they specifically need to be (block) bootstrapped from a decomposition?

mitchelloharawild commented 3 years ago

> I also think that block bootstrapping the residuals could be used by all generate() methods, what do you think of that?

Additionally, is the method for bootstrapping residuals() in generate() always the same? If so, perhaps this part of the code (and blocked bootstraps) could be moved up into the fabletools method, which would then pass the bootstrapped results to the model's method via .innov.

Then, generate() methods for models would only need to generate innovations if they aren't provided, and then use the innovations to generate paths.
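The division of labour proposed here can be sketched in Python with a toy AR(1) model standing in for a real model's generate() method. Everything in this block is hypothetical: the function name, its parameters, and the Gaussian fallback are illustrative only, and `innov` merely mirrors the role of the `.innov` argument mentioned above.

```python
import random


def generate_ar1(phi, last_value, horizon, innov=None, sigma=1.0, rng=None):
    """Simulate one AR(1) path, drawing innovations only if none are given."""
    if innov is None:
        # Fallback: the model produces its own (here Gaussian) innovations.
        rng = rng or random.Random()
        innov = [rng.gauss(0.0, sigma) for _ in range(horizon)]
    if len(innov) != horizon:
        raise ValueError("innov must contain one value per generated step")
    path, y = [], last_value
    for e in innov:
        # AR(1) recursion: y_t = phi * y_{t-1} + e_t
        y = phi * y + e
        path.append(y)
    return path
```

A caller that has already bootstrapped residuals (in blocks or otherwise) would pass them as `innov`, for example `generate_ar1(0.5, 2.0, 3, innov=[1.0, 0.0, -1.0])`, so the model-specific code only has to turn innovations into a path.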

robjhyndman commented 3 years ago

Thanks. The reason we block bootstrap for STL is that the decomposition generally does not lead to white noise residuals whereas all forecasting models should have white noise residuals. So I don't think we want block bootstraps by default, although it would certainly be useful to have them in general. Perhaps we could have a block bootstrap with blocks of size 1 by default for all methods including STL, and then have the user define the block size for STL as required. That requires more work by the user for STL but leaves the other methods working as they are by default.

Bagged models need a way to simulate data similar to the real data. Doing it via a decomposition has been shown to work well. Doing it in any other way has never been tested afaik, and it is not clear to me that it would work.

Bootstrapping of residuals could be done via regular bootstrapping, block bootstrapping, or via a stationary model applied to the residuals. But in practice, regular and block bootstrapping are much more common. It could be useful to have a block bootstrap function in fabletools for general application.

mitchelloharawild commented 3 years ago

Each generate() method also needs a way of producing its own innovations. Would it be reasonable for generate(<STL>) to use a block bootstrap to produce these innovations, or is there a more 'correct' (model appropriate) method for this?

robjhyndman commented 3 years ago

Block bootstrap is the best approach for STL.

mitchelloharawild commented 3 years ago

Added in c3f9957c82f32545e76200bd6fd58b7ac64eaf40