Closed bernardoct closed 1 year ago
Hi @bernardoct, thanks for reaching out. I agree that the exception message could be a lot cleaner, and we will take this into consideration for our next release. That said, this is a deliberate design choice. Let me explain.
You have pointed out correctly that the exception comes from the fact that `fut_df` is empty. The underlying reason is that you need to provide future regressor values if using regressors. Take a look at the first diagram here for Greykite behavior: https://github.com/linkedin/greykite/blob/master/docs/pages/stepbystep/0300_input.rst
Internally we compute `train_end_date` as the last date with a non-null value in the `y` column. The regressor needs to be available for at least `train_end_date` + `forecast_horizon`.
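To make the rule concrete, here is a minimal pandas-only sketch of the check described above. It is not Greykite's internal code: the helper name `regressor_coverage` and the column names `ts`, `y` and `temp` are made up for illustration, and it assumes a regularly spaced time column with no gaps.

```python
import pandas as pd


def regressor_coverage(df, time_col, value_col, regressor_cols, forecast_horizon):
    """Hypothetical helper: check whether regressors extend far enough past the data.

    Returns (train_end_date, n_future_rows, is_enough).
    """
    df = df.sort_values(time_col)
    # Last timestamp that has a non-null value in the value column.
    train_end_date = df.loc[df[value_col].notna(), time_col].max()
    # Rows strictly after train_end_date where every regressor is present.
    future = df[df[time_col] > train_end_date]
    n_future_rows = int(future[regressor_cols].notna().all(axis=1).sum())
    return train_end_date, n_future_rows, n_future_rows >= forecast_horizon


dates = pd.date_range("2021-01-01", periods=10, freq="D")

# Case 1: y is filled on every row, so nothing is left to forecast on.
full = pd.DataFrame({"ts": dates, "y": [float(i) for i in range(10)], "temp": range(10, 20)})
full_result = regressor_coverage(full, "ts", "y", ["temp"], forecast_horizon=3)

# Case 2: the last 3 rows have regressors but y is missing.
padded = full.copy()
padded.loc[padded.index[-3:], "y"] = float("nan")
padded_result = regressor_coverage(padded, "ts", "y", ["temp"], forecast_horizon=3)
```

In the first case no rows remain after `train_end_date`, which corresponds to the empty `fut_df` situation discussed in this issue. In the second case three future rows carry regressor values, enough for a horizon of 3.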
In our library, there are 2 ways to facilitate this:
- Set `train_end_date` in `MetadataParam`. All `y` values after this date are ignored.
- Include the future rows in the input data, with `y` as NA.

We do not allow training and forecasting on the same data as part of the `Forecaster`. However, you can:
- Use `backtest` from the `Forecaster` result.
- Take `result.trained_model` from the `Forecaster` result, and predict on an arbitrary data frame, provided it has the appropriate columns.
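As a rough illustration of the "`y` as NA" approach, the sketch below builds an input frame whose last rows carry regressor values but no `y`. It uses only pandas and NumPy and does not call Greykite. The helper name `append_future_rows` and the column names `ts`, `y` and `temp` are assumptions for the example.

```python
import numpy as np
import pandas as pd


def append_future_rows(history, future_regressors, time_col="ts", value_col="y"):
    """Hypothetical helper: stack future regressor rows under the history, with y as NaN."""
    future = future_regressors.copy()
    future[value_col] = np.nan  # the values to be forecast are unknown
    combined = pd.concat([history, future], ignore_index=True)
    return combined.sort_values(time_col).reset_index(drop=True)


# Seven days of observed data with one regressor.
history = pd.DataFrame({
    "ts": pd.date_range("2021-01-01", periods=7, freq="D"),
    "y": [float(i) for i in range(7)],
    "temp": [10.0 + i for i in range(7)],
})

# Three days of known (or forecasted) regressor values past the observed data.
future_regressors = pd.DataFrame({
    "ts": pd.date_range("2021-01-08", periods=3, freq="D"),
    "temp": [17.0, 18.0, 19.0],
})

df = append_future_rows(history, future_regressors)
```

With a frame shaped like this, the last non-null `y` falls on 2021-01-07 and the regressor is available for the three following days, which matches the requirement described above for a horizon of 3.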
I'm running into an issue in which `run_forecast_config` crashes when I add regressors but doesn't crash if I don't. The error is caused by an empty `fut_df`, which I understand to be the data frame with time and regressors against which to run the forecast. `fut_df` is empty because the method `make_future_dataframe` (see call stack below) returns a zero-length data frame on line 349 of `univariate_time_series.py` when there are no missing values in the `y` column. I don't know if this was done by design, but if my assessment is correct it would probably be a good idea to either allow the model to run on train data only or raise an exception explaining that one can't have an input data frame with regressors and no missing values in `y` for forecasting. See my call stack and code below.
Error
Location of code causing `fut_df` to have zero rows:
My code
Test data