JoaquinAmatRodrigo / skforecast

Time series forecasting with machine learning models
https://skforecast.org
BSD 3-Clause "New" or "Revised" License

Questions about using the known exogenous variables to conduct forecasted values #281

Closed kennis222 closed 1 year ago

kennis222 commented 1 year ago

Hi developers,

I am a bit confused about how to use known exogenous variables when forecasting.
First, I use the multi-series forecaster to predict all the exogenous variables and the target output Y simultaneously. Then I would like to use the predicted values of the exogenous variables for direct/recursive forecasting of the target output Y, following the related documentation, such as the example with weather exogenous variables. However, I am not sure where to pass these values: they are known future values, so their format is not consistent with the known training data. How can I combine the two within the skforecast framework?

kennis222 commented 1 year ago

Update:

I think I figured it out, but I am not sure whether I understand it correctly and am using it properly. For example, suppose I use ForecasterAutoregDirect and grid_search_forecaster to train the model:

grid_search_forecaster(forecaster=forecaster_direct, y=output_df, exog=train_exogs, steps=10)

After training, I obtain the forecasts with

predicted_values = forecaster_direct.predict(exog=future_known_exogs)

where future_known_exogs is either forecasted by the multi-series forecaster or reflects the expected behavior of the exogenous variables.
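
On the format question, here is a minimal sketch of how the future exogenous values can be shaped, assuming the training data uses a regular pandas DatetimeIndex. The names train_exogs, future_known_exogs, and steps mirror the snippet above; the dates, the values, and the `temperature` column are made up for illustration. The idea is that the future values keep the same columns as the training exog and carry an index that continues right after the last training timestamp.

```python
import numpy as np
import pandas as pd

steps = 10

# Made-up training data: 50 daily observations of one exogenous variable.
train_index = pd.date_range("2023-01-01", periods=50, freq="D")
train_exogs = pd.DataFrame(
    {"temperature": np.linspace(10.0, 20.0, 50)}, index=train_index
)

# Future values (forecasted or otherwise known), as a plain array for now.
future_values = np.full(steps, 21.0)

# Give them an index that starts one step after the end of the training data.
future_index = pd.date_range(
    start=train_index[-1] + train_index.freq,
    periods=steps,
    freq=train_index.freq,
)
future_known_exogs = pd.DataFrame(
    {"temperature": future_values}, index=future_index
)

# Sanity checks before passing future_known_exogs to predict().
assert list(future_known_exogs.columns) == list(train_exogs.columns)
assert future_known_exogs.index[0] == train_index[-1] + train_index.freq
assert len(future_known_exogs) == steps
```

With the frames shaped this way, the training exog and the future exog never need to be concatenated: one goes to the training call and the other to predict.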

JavierEscobarOrtiz commented 1 year ago

Hello @kennis222,

You understand it correctly. Let's summarize the steps.

  1. Suppose you have 2 series, Y and exog, and you want to predict exog with multi-series forecasting.
forecaster_multiseries = ForecasterAutoregMultiSeries(
    # Declare with proper arguments
)

forecaster_multiseries.fit(series=series)

Here, series must be a pandas DataFrame containing Y and exog (with the original exog values).

  2. Predict n steps of the exog variable:
multiseries_predict_exog = forecaster_multiseries.predict(steps=10, level='exog_column_name')

This returns the next 10 steps of exog, that is, its predicted future values.

  3. Use the new exog values in a direct forecaster:
forecaster_direct = ForecasterAutoregDirect(
    # Declare with proper arguments (steps should match the 10 predicted exog values)
)

y = series['Y']  # Target series; 'Y' is a placeholder column name
exog_train = series[['exog_column_name']]  # Same exog as used in the fit of forecaster_multiseries

forecaster_direct.fit(y=y, exog=exog_train)

forecaster_direct.predict(exog=multiseries_predict_exog)  # Use the predicted exog values
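
To see why predict needs the future exog values, here is a minimal, library-free sketch of the direct strategy with one exogenous variable. It is not skforecast's implementation: the linear least-squares model, the single lag, the synthetic data, and the helper names `fit_direct` and `predict_direct` are all assumptions made for illustration. Each horizon h gets its own model that maps the last observed y and the exog value at the target time to y at t + h.

```python
import numpy as np


def fit_direct(y, exog, steps):
    """Fit one least-squares model per horizon h = 1..steps.

    Model for horizon h: y[t + h] ~ b0 + b1 * y[t] + b2 * exog[t + h]
    """
    coefs = []
    for h in range(1, steps + 1):
        t = np.arange(len(y) - h)  # forecast origins whose target is known
        X = np.column_stack([np.ones(len(t)), y[t], exog[t + h]])
        beta, *_ = np.linalg.lstsq(X, y[t + h], rcond=None)
        coefs.append(beta)
    return coefs


def predict_direct(coefs, last_y, future_exog):
    """Predict len(coefs) steps; future_exog[h] is the exog value at step h + 1."""
    if len(future_exog) < len(coefs):
        raise ValueError("need one future exog value per forecast step")
    return np.array(
        [b[0] + b[1] * last_y + b[2] * future_exog[h] for h, b in enumerate(coefs)]
    )


# Synthetic, noise-free data: y depends on its own previous value and on exog.
rng = np.random.default_rng(0)
n, steps = 200, 3
exog_all = rng.normal(size=n + steps)  # training exog followed by future exog
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 2.0 * exog_all[t]

# Training uses only the first n exog values; the rest play the role of the
# future exog (in the workflow above, the multi-series predictions).
exog_train, future_exog = exog_all[:n], exog_all[n:]

coefs = fit_direct(y, exog_train, steps)
preds = predict_direct(coefs, last_y=y[-1], future_exog=future_exog)
print(preds)
```

In this sketch the training exog and the future exog are never merged: the first is consumed when fitting, the second only when predicting, which mirrors passing exog_train to fit and multiseries_predict_exog to predict. If the future exog values are themselves forecasts, their errors carry over into the predictions of Y.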

Hope it helps!