facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.53k stars 4.54k forks

Multi-indexed covariates for Prophet Forecast #2161

Open mcbaron opened 2 years ago

mcbaron commented 2 years ago

Hello all,

I believe I'm trying to do something very similar to issue 951, so maybe @bletham might be able to weigh in on this one. (I actually spoke with @veganveins about this, which is how I got here.)

Let me start with some definitions so that it is clear what I'm asking. I understand from this notebook that we can incorporate what I would call "covariates" to construct a multivariate Prophet model with the .add_regressor functionality. In the NZ bike traffic example, they have temperature, rainfall, wind, etc. as covariates, all indexed by ds, the same as the target time series.

I think that Prophet is not currently set up to enable what I would call multi-channel forecasting, where the same set of covariates gives rise to multiple time series. (My background is audio signal processing, so the mental model I have is akin to the MIMO paradigm.) To do this, one must construct an array of Prophet models, one for each receive "channel".

My question, which I think is related to issue 951, is how to integrate covariates that are indexed by more than just ds: the domain of the covariate is channel-timestamp. Let's get into an example:

# example code for multi-index covariates:
import numpy as np
import pandas as pd

dates = pd.date_range('2019-01-01', periods=52, freq='W')
store_index = pd.Series(["{:02d}".format(x) for x in range(21)])
df_index = pd.MultiIndex.from_product([dates, store_index], names=["DS", "STORE"])
# one row per (DS, STORE) pair and one column per variable, so data must have shape (n_rows, 2)
df = pd.DataFrame(data=np.random.rand(len(df_index), 2), index=df_index, columns=["FEATURE", "REVENUE"])

For this example, I'd like to predict REVENUE by week (summed over all stores), but include FEATURE as a covariate. I can't simply construct a Prophet model and call mdl.add_regressor("FEATURE"), because the domain of this feature is the Cartesian product DS x STORE, not simply DS.

Is the solution to train a separate prophet model for each STORE, and then aggregate across that axis? What is the proper way to architect this?

Thanks!

tcuongd commented 2 years ago

Yeah, I think you either want to model each store separately, or have multiple regressors, one for each store (although I'm guessing this will lead to overfitting).
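
For illustration, here is a rough sketch of both options using the example frame from the question. The helper names (make_example, to_wide, fit_wide, fit_per_store) are made up for this sketch and are not part of Prophet; only Prophet(), add_regressor and fit are library calls. It assumes every store has a FEATURE value for every week. Prophet is imported inside the fitting functions so the pandas reshaping can be tried on its own.

```python
import numpy as np
import pandas as pd


def make_example(seed=0):
    """Rebuild the (DS, STORE)-indexed frame from the question with random data."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2019-01-01", periods=52, freq="W")
    stores = ["{:02d}".format(x) for x in range(21)]
    idx = pd.MultiIndex.from_product([dates, stores], names=["DS", "STORE"])
    return pd.DataFrame(rng.random((len(idx), 2)), index=idx,
                        columns=["FEATURE", "REVENUE"])


def to_wide(df):
    """Option 2: one row per week, total revenue as y, one FEATURE column per store."""
    y = df["REVENUE"].groupby(level="DS").sum()
    feats = (df["FEATURE"].unstack("STORE")      # STORE levels become columns
             .add_prefix("FEATURE_")
             .rename_axis(columns=None))
    return feats.assign(y=y).reset_index().rename(columns={"DS": "ds"})


def fit_wide(wide):
    """Fit a single model on total revenue with one regressor per store."""
    from prophet import Prophet  # lazy import: only needed when fitting
    m = Prophet()
    for col in wide.columns:
        if col.startswith("FEATURE_"):
            m.add_regressor(col)
    m.fit(wide)
    return m


def fit_per_store(df):
    """Option 1: fit one model per store, each with its own FEATURE regressor."""
    from prophet import Prophet  # lazy import: only needed when fitting
    models = {}
    for store, g in df.groupby(level="STORE"):
        train = (g.reset_index()
                 .rename(columns={"DS": "ds", "REVENUE": "y"})
                 [["ds", "y", "FEATURE"]])
        m = Prophet()
        m.add_regressor("FEATURE")
        m.fit(train)
        models[store] = m
    return models
```

With the single wide model, the future dataframe passed to predict needs a value for every FEATURE_xx column. With per-store models, the weekly total would be the sum of each store's yhat; the per-store uncertainty intervals cannot simply be added together, so treat any summed interval with caution.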