Closed jonekeat closed 3 years ago
I certainly would like to support more models, and structural models are planned. They may be added via a new extension package rather than included in the fable package.
Thanks for the prompt reply. I am looking forward to the new extension package.
Related to the question above:
Is there any functionality (planned) that allows users to add models themselves? I am thinking of the caret or mlr packages. Both have a unified way of training models, and users can easily add models that adhere to the package design.
Yes - fable is extensible by design. The fable package itself is an extension (only providing some common forecasting models). The framework and tooling to make it all work is provided in the fabletools package.
Currently the interface is a bit too experimental (especially in the back end) for me to encourage users to add models themselves. However, once things are more refined I will be writing extension vignette(s) (https://github.com/tidyverts/fabletools/issues/15).
There are currently a couple of examples which show how extension packages can be written:
A model is made with a few components:
Methods (forecast.xyz, residuals.xyz, glance.xyz, components.xyz, generate.xyz, etc.)
If you'd like to write an extension model, let me know and I'd be happy to work with you and refine this model development interface further.
I've spoken a little bit to Steven Scott (the author of bsts) about a formula-based implementation of bsts as a fable extension. I have the framework laid out for that (though including priors is a bit tricky at times). Because the creation of the state specification is similar to the method of specification for prophet, I have just been following the methodology for your implementation of fable.prophet, but that makes it a bit hard to test things as I go along with an implementation.
I would love some guidance or assistance, as I'm just working on it in my spare time and would love to be a bit more efficient.
Wow, looks great! I'd be very happy to help with this as an extension package. I think collaboration here would lead to a great extension package, and also improvements to how a model is developed for fable. Is there something specific you're having trouble with for testing? If there's anything you get stuck with let me know.
I'm very happy to let my struggles help others! I would love to eventually also get forecTheta or TSA::arimax() wrapped up, if no one is planning on opening them up and redoing them, under the hood. My biggest issue is actually my own methodology, which included just trying to swap things out from fable.prophet, without having really understood the parsing of the formulas and the specification of specials. The state specification for bsts is just a list, and can have multiple trends added to it, as well as the wrong forms of priors added to it, which need to be constrained. If I'm honest, it's because I'm a statistician more than a computer scientist, and I'm wanting to do this in order to become better at thinking like a computer scientist.
Theta models are planned for inclusion in fable (#41). Currently it is planned to implement forecast::thetaf(), but presumably I will support forecTheta in its generality. The hold-up for this is the forecast distribution of its equivalent stochastic model (https://doi.org/10.1016/S0169-2070(01)00143-1), which will require a more general implementation of distributions (like https://github.com/alexpghayes/distributions3, but vectorised and more flexible). (Correction: this is a hold-up for Croston's method, not theta.)
ARIMAX is very similar to what we already have in ARIMA(), which supports exogenous regressors (giving 'regression with ARMA errors').
As for TSA::arimax(), this model is more closely described as a transfer function model (https://robjhyndman.com/hyndsight/arimax/). It is planned to support this model class later, but the current priority is on model parity with the forecast package.
Regarding specials and formula parsing, here is a rough idea of what happens:
Take a model formula such as transform(y) ~ trend() + trend() + season() + x.
The parsed specials are passed to the training function via the specials argument. In the above formula, there are two trend specials, one season special, and x (which is passed to the xreg() special). If you need to constrain the usage of specials (such as allowing only one season()), this can be done in the training function by checking the lengths of the elements of the specials argument.
Also, I think being a statistician more than a compsci is great here. Part of the fable design goal (which clearly needs work) is to make it easy for statisticians with new models to create a package that integrates well with other models (via combinations, reconciliation, etc.).
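The length check on specials described above can be sketched in base R. This is only an illustration under stated assumptions: train_xyz, the "xyz" class, and the contents of the mock specials list are hypothetical, and the list is built by hand rather than produced by fabletools' formula parser.

```r
# Hypothetical training function for an illustrative model class "xyz".
# Assumption: `specials` is a named list with one entry per special, and
# each entry holds one element per use of that special in the formula.
train_xyz <- function(.data, specials, ...) {
  # Constrain usage: allow at most one season() special.
  if (length(specials$season) > 1) {
    stop("Only one season() special is supported.", call. = FALSE)
  }
  # Placeholder "fit": record how many times each special was used.
  structure(
    list(n_specials = lengths(specials)),
    class = "xyz"
  )
}

# Hand-built mock of the specials for
# transform(y) ~ trend() + trend() + season() + x
# (the inner contents are made up for illustration).
specials <- list(
  trend  = list(list(), list()),
  season = list(list(period = 12)),
  xreg   = list(list(vars = "x"))
)

fit <- train_xyz(.data = NULL, specials = specials)
fit$n_specials
```

Passing a mock with two season elements makes train_xyz() stop with an error, which is the kind of constraint mentioned above.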
The top makes perfect sense to me. My desire for TSA::arimax() is entirely because it allows the specification of a transfer function. I assumed all of forecast would find itself in fable, and am excited about it. Does it seem like someone putting time against the task of vectorizing distributions3 would be helpful? I ask because the author seems pretty open (from a glance at the closed PRs) and it's not an onerous task.
Thanks, that helps, especially with knowing what I don't have to think about on the left. I just went and looked at the list structure once the model is specified, and I've got a solid idea on how to constrain things in terms of how many of a type of special to allow. It'll include multiple season() arguments sometimes, but not multiple trend() arguments. I still need to think a little about whether or not to constrain priors that are fed into a special to only those generating functions that "seem" to make sense.
I just see the word reconciliation and I get all excited. reconcilethief is one of my current minor obsessions, though not used with thief. I'm very glad that's coming.
I've been working on creating vectorised distributions, which might get merged into distributions3 or become a standalone package (https://github.com/tidyverts/fabletools/issues/123).
Specifically for the theta model, the distributions would also need zero-inflated variants. Additionally for fable's transformations, this would then require transformed zero inflated distributions. Extending the distributions3 package to support this isn't trivial (and I'm currently approaching it as a rewrite).
Concerning additional distributions, such as zero-inflated ones, the gamlss package in general and the gamlss.dist package in particular might be of interest:
https://cran.r-project.org/web/packages/gamlss.dist/index.html
The gamlss.dist package provides a set of distributions which can be used for modelling the response variables in Generalized Additive Models for Location Scale and Shape, Rigby and Stasinopoulos (2005). The distributions can be continuous, discrete or mixed distributions. Extra distributions can be created, by transforming, any continuous distribution defined on the real line, to a distribution defined on ranges 0 to infinity or 0 to 1,
How can I interrogate a specials list as it would arrive in the training function? I don't know how multiple specials of the same name (e.g. value ~ trend() + trend()) would show up. I'm guessing I would find their specifications in specials$trend[[1]] and specials$trend[[2]], but I'm not entirely sure.
Yes, that's correct.
It would also be in the same list structure for value ~ trend(), accessed via specials$trend[[1]].
This may already be available and I'm just missing it. Is there a way to take a model specification and create the output that is provided to the train_*() function? That way, it would be easy to create a dummy workflow. A parameter to the fabletools::model() function that stopped before actually fitting a model and just output as.list(environment()) would work, I would think. It gives the creator of the extension a simple way to investigate exactly what they're working with in the train_*() step.
It might also be worth having a diagram that shows exactly what gets passed to a special like xreg based on a model specification that just includes the variable name. Or, in general, a diagram that shows what data is available (and the naming convention for it) at any given point in the process. I'm happy to help work on that, but I haven't taken the time to try and take apart fabletools enough to take a swag yet.
Sorry for the late reply on this one. The function that calls the train_*() function is estimate.tbl_ts(), so you should be able to debug() this to see what is going on (including parsing specials).
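Alongside debug(), a throwaway training function that simply returns its inputs can serve a similar purpose while prototyping. The sketch below is base R only and rests on assumptions: train_inspect is a hypothetical name, the (.data, specials, ...) argument layout is assumed rather than taken from the fabletools source, and the inputs are hand-built mocks. It is called directly here, not through model().

```r
# Hypothetical stand-in for a train_*() function: it fits nothing and
# hands back whatever it was given, so the inputs can be inspected.
# Assumption: the training function receives the data, the parsed
# specials, and any extra arguments.
train_inspect <- function(.data, specials, ...) {
  list(.data = .data, specials = specials, extra = list(...))
}

# Mock inputs, shaped as assumed for value ~ trend() + trend()
mock_data <- data.frame(value = c(1.2, 1.9, 3.1))
mock_specials <- list(trend = list(list(), list()))

captured <- train_inspect(mock_data, mock_specials, iterations = 100)

# Inspect the captured inputs interactively.
str(captured, max.level = 2)
length(captured$specials$trend)
```

The same body could be dropped temporarily into a real training function to see what actually arrives once the formula has been parsed.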
I've also started writing the vignette for adding models with fabletools: https://fabletools.tidyverts.org/dev/articles/extension_models.html
The process of creating a model is outlined, and tomorrow I'll be writing more about adding methods. Your thoughts on the vignette would be great, especially for points of confusion that you've experienced while writing fable.bsts.
Thank you both for the discussion. I'm the opposite of David (more compsci than statistician), but things were still a bit opaque. After reading this and the extensions article, I've now managed to implement my first custom model :tada:
Fable seriously rocks! (and to think this is only a "beta" version, what more goodies are to come)
Great to hear @Fuco1! Is the extension model open source by any chance? It would be nice to see what you've done with fable.
@mitchelloharawild Not open source at the moment, it is rather simple though, basically encoding a ton of business rules that we've come up with over the years. It's not very scientific but works well in practice.
I was wondering whether there is any plan to add support in fable for the Bayesian Structural Time Series model, as provided by Google's bsts package. There is a blog post introducing this package: http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html
Thanks for all your hard work. It is really exciting to see a unified interface for time series modelling in R.
Here is a tutorial that uses python and prophet. You might try to use reticulate to call the Bayesian model.
https://m.youtube.com/watch?v=jo12CWZ00Lo
Closing as this model will not be added into fable, but can be made available via an extension package. This tracker has been consolidated into https://github.com/tidyverts/fable/issues/344