Closed antoinecarme closed 4 years ago
First specification method: one exogenous dataset shared by all nodes, given as a tuple (dataframe, list of used variables)
import pandas as pd

def create_exog_data(b1):
    # fake exog data based on the date variable
    lDate1 = b1.mPastData['Date']
    lDate2 = b1.mFutureData['Date']  # not needed: exogenous data are simply missing where not available
    # Series.append was removed in pandas 2.0; pd.concat is the replacement
    lDate = pd.concat([lDate1, lDate2])
    lExogenousDataFrame = pd.DataFrame()
    lExogenousDataFrame['Date'] = lDate
    lExogenousDataFrame['Date_second'] = lDate.dt.second
    lExogenousDataFrame['Date_minute'] = lDate.dt.minute
    lExogenousDataFrame['Date_hour'] = lDate.dt.hour
    lExogenousDataFrame['Date_dayofweek'] = lDate.dt.dayofweek
    lExogenousDataFrame['Date_day'] = lDate.dt.day
    lExogenousDataFrame['Date_dayofyear'] = lDate.dt.dayofyear
    lExogenousDataFrame['Date_month'] = lDate.dt.month
    # Series.dt.week was removed in pandas 2.0; isocalendar().week is the replacement
    lExogenousDataFrame['Date_week'] = lDate.dt.isocalendar().week
    # a column in the exog data can be of any type
    lExogenousDataFrame['Date_day_name'] = lDate.dt.day_name()
    lExogenousDataFrame['Date_month_name'] = lDate.dt.month_name()
    # every derived column (prefix 'Date_') is used as an exogenous variable
    lExogenousVariables = [col for col in lExogenousDataFrame.columns if col.startswith('Date_')]
    lExogenousData = (lExogenousDataFrame, lExogenousVariables)
    return lExogenousData
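To see what the shared form looks like on concrete data, here is a minimal sketch that builds the same kind of tuple from a toy date range. The `SimpleNamespace` object is only a hypothetical stand-in for `b1` (it provides nothing beyond `mPastData` and `mFutureData`), `make_shared_exog` is an illustrative name, and only three of the date features are derived to keep it short.

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-in for `b1`: only the two attributes used above exist.
b1 = SimpleNamespace(
    mPastData=pd.DataFrame({"Date": pd.date_range("2020-01-01", periods=4, freq="D")}),
    mFutureData=pd.DataFrame({"Date": pd.date_range("2020-01-05", periods=2, freq="D")}),
)

def make_shared_exog(b):
    # Stack past and future dates into one series with a fresh index.
    dates = pd.concat([b.mPastData["Date"], b.mFutureData["Date"]], ignore_index=True)
    frame = pd.DataFrame({"Date": dates})
    frame["Date_dayofweek"] = dates.dt.dayofweek
    frame["Date_month"] = dates.dt.month
    frame["Date_day_name"] = dates.dt.day_name()  # non-numeric columns are allowed
    # Every derived column (prefix 'Date_') counts as an exogenous variable.
    used = [col for col in frame.columns if col.startswith("Date_")]
    return (frame, used)

exog_frame, exog_vars = make_shared_exog(b1)
print(exog_vars)
print(exog_frame)
```

The result is a single `(dataframe, variables)` tuple, which is the shape the first specification method expects for all nodes.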
Second specification method: per-node exogenous data, given as a dict where lExogenous[signal] = (dataframe, list of used variables)
import pandas as pd

def create_exog_data(b1):
    # fake exog data based on the date variable
    lDate1 = b1.mPastData['Date']
    lDate2 = b1.mFutureData['Date']  # not needed: exogenous data are simply missing where not available
    # Series.append was removed in pandas 2.0; pd.concat is the replacement
    lDate = pd.concat([lDate1, lDate2])
    lExogenousDataFrame = pd.DataFrame()
    lExogenousDataFrame['Date'] = lDate
    lExogenousDataFrame['Date_second'] = lDate.dt.second
    lExogenousDataFrame['Date_minute'] = lDate.dt.minute
    lExogenousDataFrame['Date_hour'] = lDate.dt.hour
    lExogenousDataFrame['Date_dayofweek'] = lDate.dt.dayofweek
    lExogenousDataFrame['Date_day'] = lDate.dt.day
    lExogenousDataFrame['Date_dayofyear'] = lDate.dt.dayofyear
    lExogenousDataFrame['Date_month'] = lDate.dt.month
    # Series.dt.week was removed in pandas 2.0; isocalendar().week is the replacement
    lExogenousDataFrame['Date_week'] = lDate.dt.isocalendar().week
    # a column in the exog data can be of any type
    lExogenousDataFrame['Date_day_name'] = lDate.dt.day_name()
    lExogenousDataFrame['Date_month_name'] = lDate.dt.month_name()
    lExogenousVariables = [col for col in lExogenousDataFrame.columns if col.startswith('Date_')]
    lExogenousData = {}
    # define exog only for three state nodes, each with its own subset of variables
    lExogenousData["NSW_State"] = (lExogenousDataFrame, lExogenousVariables[:3])
    lExogenousData["VIC_State"] = (lExogenousDataFrame, lExogenousVariables[-3:])
    lExogenousData["QLD_State"] = (lExogenousDataFrame, lExogenousVariables)
    return lExogenousData
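The slices in the per-node example decide which variables each state node gets, so it may help to spell them out. The sketch below needs no dataframe: the variable names are listed in the order the function above adds them, and `frame = None` is just a placeholder for the shared dataframe.

```python
# Column names in the order create_exog_data adds them to the frame.
exog_variables = [
    "Date_second", "Date_minute", "Date_hour", "Date_dayofweek",
    "Date_day", "Date_dayofyear", "Date_month", "Date_week",
    "Date_day_name", "Date_month_name",
]

# Placeholder for the shared dataframe; only the variable lists matter here.
frame = None

per_node = {
    "NSW_State": (frame, exog_variables[:3]),   # first three variables
    "VIC_State": (frame, exog_variables[-3:]),  # last three variables
    "QLD_State": (frame, exog_variables),       # all variables
}

for node, (_, used) in per_node.items():
    print(node, used)
```

With this ordering, NSW_State uses the second/minute/hour features, VIC_State uses the week number plus the two name columns, and QLD_State uses all ten. Nodes that are not keys of the dict have no exogenous data defined.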
def get_exogenous_data(self, signal):
    if(self.mExogenousData is None):
        return None
    # A signal is a hierarchy node
    if(type(self.mExogenousData) == tuple):
        # same data for all signals
        return self.mExogenousData
    if(type(self.mExogenousData) == dict):
        # one exogenous data by signal
        return self.mExogenousData.get(signal)
    raise tsutil.PyAF_Error("BAD_EXOGENOUS_DATA_SPECIFICATION")
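The lookup above can be tried outside PyAF with a standalone sketch of the same dispatch. `resolve_exogenous` and `ExogenousSpecError` are hypothetical names used only for illustration (the latter stands in for `tsutil.PyAF_Error`), and the string `"frame"` is a placeholder for a real dataframe.

```python
class ExogenousSpecError(Exception):
    """Hypothetical stand-in for tsutil.PyAF_Error."""

def resolve_exogenous(spec, signal):
    """Return the (dataframe, variables) pair for one hierarchy node, or None."""
    if spec is None:
        return None               # no exogenous data configured at all
    if isinstance(spec, tuple):
        return spec               # shared form: same pair for every node
    if isinstance(spec, dict):
        return spec.get(signal)   # per-node form: None for nodes not listed
    raise ExogenousSpecError("BAD_EXOGENOUS_DATA_SPECIFICATION")

shared = ("frame", ["Date_month"])
per_node = {"NSW_State": ("frame", ["Date_hour"])}

print(resolve_exogenous(shared, "VIC_State"))     # shared pair, whatever the node
print(resolve_exogenous(per_node, "NSW_State"))   # pair defined for this node
print(resolve_exogenous(per_node, "VIC_State"))   # node without exogenous data
```

Using `dict.get` means a node missing from the per-node dict is treated as having no exogenous data rather than raising, while any other specification type is rejected.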
Closing.
This will be officially available in release 2.0.
PyAF does not yet allow using exogenous data (explanatory variables) to enrich the models used in hierarchies.
Expected: the ability either to define a single exogenous dataset shared by all hierarchy nodes, or to set exogenous data per node.