RJT1990 / pyflux

Open source time series library for Python
BSD 3-Clause "New" or "Revised" License

Example with multiple exogenous variables #29

Closed: fccoelho closed this issue 7 years ago

fccoelho commented 7 years ago

Suppose I want to forecast a series using its own history plus multiple other series as exogenous variables. Is this possible? Can you provide a simple example?

RJT1990 commented 7 years ago

Sure. There are two ways to do this:

  1. Explicit multivariate models - currently only VAR models in this library. You pass in a dataframe containing the multiple series, specify the lag order, and can then do multivariate prediction straight from there.
  2. Implicit multivariate models - if there is one series in particular you are interested in, then first choose a model that allows you to include exogenous variables, such as:
    • ARIMAX model
    • GASX model
    • GASReg model
    • Dynamic Linear Regression model (Gaussian state space)
    • Dynamic Non-Linear Regression model (Non-Gaussian state space)

Then include the other series as your predictors via patsy notation (see examples). The tricky bit is prediction with these exogenous variables: you'll need to forecast the exogenous variables first (e.g. estimate a univariate model on each one and get predictions via its predict() method). Once you have those predictions, put them into a dataframe and pass it to the predict() method of your main model as the oos_data argument.

I appreciate the answer above is a bit verbose. The TL;DR is: forecast the exogenous variables, put those forecasts in a dataframe, and pass that dataframe as the oos_data argument in predict().
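To make the order of operations concrete, here is a minimal plain-Python sketch of that workflow. It deliberately does not call the library: fit_ar1, forecast_ar1 and predict_with_exog are hypothetical helpers, the AR(1) models stand in for whatever univariate models you would estimate on the exogenous series, the coefficients of the main model are assumed to have been estimated already, and a dict of lists stands in for the dataframe you would pass as oos_data.

```python
# Plain-Python sketch of the workflow; no PyFlux calls are used here.

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi).

    Assumes the series is not constant (otherwise the variance is zero).
    """
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(series, h):
    """Forecast h steps ahead by iterating the fitted AR(1) recursion."""
    c, phi = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(h):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


def predict_with_exog(coefs, last_y, oos_data, h):
    """Iterate y[t] = const + ar1 * y[t-1] + sum_k beta_k * x_k[t] for h steps."""
    forecasts, y_prev = [], last_y
    for t in range(h):
        y_next = coefs["const"] + coefs["ar1"] * y_prev
        y_next += sum(coefs[name] * values[t] for name, values in oos_data.items())
        forecasts.append(y_next)
        y_prev = y_next
    return forecasts


h = 3

# Step 1: observed history of each exogenous series (made-up numbers).
history = {
    "x1": [0.0, 1.0, 1.5, 1.75, 1.875],
    "x2": [0.0, 2.0, 4.0, 6.0, 8.0],
}

# Step 2: forecast every exogenous series h steps ahead and collect the
# forecasts in one table (the stand-in for the oos_data dataframe).
oos_data = {name: forecast_ar1(values, h) for name, values in history.items()}

# Step 3: feed the exogenous forecasts to the main model. These coefficients
# are illustrative assumptions, not estimates from real data.
coefs = {"const": 0.0, "ar1": 0.5, "x1": 2.0, "x2": 0.1}
y_forecast = predict_with_exog(coefs, last_y=4.0, oos_data=oos_data, h=h)

print(oos_data)
print(y_forecast)
```

The point of the sketch is the ordering: the main model cannot produce an h-step forecast until every exogenous series has h future values, so the exogenous forecasts have to be built first and must cover the same horizon.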

fccoelho commented 7 years ago

Thanks! It really helped.

RJT1990 commented 7 years ago

No worries at all.