facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.02k stars 4.48k forks

Does Prophet require feature engineering for regressors? #2485

Open carlosg-m opened 10 months ago

carlosg-m commented 10 months ago

When applying linear regression or Autoreg to a time series, it is usually beneficial to derive new features from exogenous variables to extract temporal information, for example lags and moving averages.

1) Does Prophet require such a step, or is there a mechanism in place to extract or account for temporal dependencies provided by regressors?
2) Do we need to standardize regressors with z-scores?
3) Does Prophet have built-in feature selection for regressors?

carlosg-m commented 10 months ago

@AlexandroLuis please share some of your knowledge :)

AlexandroLuis commented 10 months ago

Hello @carlosg-m

  1. Prophet doesn't require you to build features for temporal seasonality by hand. It has built-in components that detect daily, weekly and yearly seasonality, and it handles holidays and events, so none of that needs feature engineering. Regressors are incorporated through `add_regressor`, which adds each exogenous variable as a linear term using the values you supply for every date. Prophet does not derive lags or moving averages from a regressor, so if you want those temporal dependencies you add them as extra regressor columns yourself. Here's an example: put the exogenous variable in the training dataframe as a column, e.g. `'regressor_1': [5, 8, 7, 9, 10]`, register it with `model.add_regressor('regressor_1')`, and, if its future values are known, set them on the future dataframe with `future['regressor_1'] = [11, 12, 13]`. In my model I specified weekly and yearly seasonality explicitly and left daily seasonality at its default, which leaves it off for daily data.
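  2. No, you don't need to standardize your regressors with z-scores; Prophet does that for you by default (`add_regressor` has `standardize='auto'`, which standardizes every regressor except binary ones).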
  3. No, it currently doesn't have built-in feature selection for regressors; you'll need to do it manually. You could use `SelectKBest` from scikit-learn with `f_regression` scoring and `k=1` to find the single best regressor.
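
To make point 1 concrete, here is a minimal sketch of building lag and moving-average columns with pandas and registering them as regressors. The helper names `add_lag_features` and `fit_with_regressors`, the toy data, and the chosen lags are assumptions for illustration only, not something from the Prophet docs.

```python
import pandas as pd


def add_lag_features(df, column, lags=(1, 7), window=7):
    """Return a copy of df with lagged and rolling-mean versions of `column`."""
    out = df.copy()
    for lag in lags:
        out[f"{column}_lag{lag}"] = out[column].shift(lag)
    # Rolling mean over past values only: shift by 1 so the current row is excluded.
    out[f"{column}_ma{window}"] = out[column].shift(1).rolling(window).mean()
    return out


def fit_with_regressors(train, regressor_names):
    """Fit Prophet with each named column registered as an extra regressor."""
    # Imported here so the feature-building helper runs without Prophet installed.
    from prophet import Prophet

    # Weekly and yearly seasonality set explicitly, daily left at its default.
    model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
    for name in regressor_names:
        model.add_regressor(name)  # standardize="auto" is the default
    model.fit(train)
    return model


# Toy data (made up): 30 daily observations with one exogenous column.
df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=30, freq="D"),
    "y": [float(i % 7 + i * 0.1) for i in range(30)],
    "regressor_1": [float(i) for i in range(30)],
})
features = add_lag_features(df, "regressor_1", lags=(1, 7), window=7)
# The first rows have NaN lags; Prophet rejects NaN in regressor columns, so drop them.
train = features.dropna().reset_index(drop=True)
regressor_names = [c for c in train.columns if c.startswith("regressor_1")]
```

Calling `fit_with_regressors(train, regressor_names)` then fits the model. The dataframe you pass to `predict` needs the same regressor columns filled in for the future dates, which for lagged columns means the lag has to be at least as long as the forecast horizon or the underlying values have to be known in advance.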
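
For point 3, a rough sketch of the manual selection step with `SelectKBest` and `f_regression`. The data is synthetic and the column names are placeholders; keep in mind that `f_regression` scores each regressor by its univariate linear relationship with the target and knows nothing about trend or seasonality, so treat the ranking as a first filter rather than a final answer.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic candidate regressors (placeholders for your own columns).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "regressor_1": rng.normal(size=n),
    "regressor_2": rng.normal(size=n),
    "regressor_3": rng.normal(size=n),
})
# Synthetic target that depends strongly on regressor_2, plus noise.
y = 3.0 * X["regressor_2"] + rng.normal(scale=0.5, size=n)

# Keep the single best regressor according to the univariate F-test.
selector = SelectKBest(score_func=f_regression, k=1).fit(X, y)
selected = list(X.columns[selector.get_support()])
print(selected)
```

The columns in `selected` are the ones you would then pass to `add_regressor`; raising `k` keeps more of them.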