antoinecarme / pyaf

PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
BSD 3-Clause "New" or "Revised" License

Double check exogenous data implementation #106

Closed antoinecarme closed 5 years ago

antoinecarme commented 5 years ago

When playing with an example from the FPP2 book, I got some errors and was unable to build a model.

source : https://otexts.org/fpp2/lagged-predictors.html

import pandas as pd
import pyaf.ForecastEngine as autof

df = pd.read_csv("https://raw.githubusercontent.com/antoinecarme/TimeSeriesData/master/fpp2/insurance.csv")

(lTimeVar , lSigVar , lExogVar) = ("Index", "Quotes" , "TV.advert")
df_sig = df[[lTimeVar , lSigVar]]
df_exog = df[[lTimeVar , lExogVar]] # need time here
H = 4

lEngine = autof.cForecastEngine()
lEngine.mOptions.set_active_autoregressions(['ARX'])

lExogenousData = (df_exog, [lExogVar])
lEngine.train(df_sig, lTimeVar, lSigVar, H, lExogenousData)

lEngine.getModelInfo()

antoinecarme commented 5 years ago

There is an issue when the time variable is a float:

TypeError: Cannot compare type 'Timestamp' with type 'float'

antoinecarme commented 5 years ago

The date comparison in exogenous data worked only when the time was a physical time (date or datetime).

Corrected.
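For readers hitting the same error, here is a minimal sketch of one way to compare time values so that both numeric and physical time columns work. It is not the actual PyAF fix: the helper name `restrict_to_time_range` and the sample values are hypothetical, chosen only to illustrate coercing the bounds to the column's own type before comparing.

```python
import pandas as pd

def restrict_to_time_range(exog_df, time_col, t_min, t_max):
    """Keep the rows whose time value lies in [t_min, t_max].

    Hypothetical helper, not part of PyAF: the bounds are coerced to the
    type of the time column so a float is never compared to a Timestamp.
    """
    times = exog_df[time_col]
    if pd.api.types.is_datetime64_any_dtype(times):
        # Physical time (date or datetime): compare as Timestamps.
        t_min, t_max = pd.Timestamp(t_min), pd.Timestamp(t_max)
    else:
        # Numeric time (e.g. 2002.0, 2002.0833, ...): compare as floats.
        t_min, t_max = float(t_min), float(t_max)
    mask = (times >= t_min) & (times <= t_max)
    return exog_df.loc[mask]

# Made-up sample data, one frame per kind of time column.
numeric = pd.DataFrame({"Index": [2002.0, 2002.5, 2003.0],
                        "TV.advert": [7.2, 9.4, 8.0]})
physical = pd.DataFrame({"Date": pd.to_datetime(["2002-01-01", "2002-07-01", "2003-01-01"]),
                         "TV.advert": [7.2, 9.4, 8.0]})

numeric_subset = restrict_to_time_range(numeric, "Index", 2002.0, 2002.5)
physical_subset = restrict_to_time_range(physical, "Date", "2002-01-01", "2002-07-01")
```

The same call works for both frames because the type check happens once, on the column, rather than on each compared value.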

antoinecarme commented 5 years ago

The script now runs.

New log:

INFO:pyaf.std:START_TRAINING 'Quotes'
INFO:pyaf.std:END_TRAINING_TIME_IN_SECONDS 'Quotes' 1.306643009185791
INFO:pyaf.std:TIME_DETAIL TimeVariable='Index' TimeMin=2002.0 TimeMax=2004.25 TimeDelta=0.08333333333333333 Horizon=4
INFO:pyaf.std:SIGNAL_DETAIL_ORIG SignalVariable='Quotes' Min=8.394680000000001 Max=18.438979999999997  Mean=13.604347 StdDev=2.369165266733412
INFO:pyaf.std:SIGNAL_DETAIL_TRANSFORMED TransformedSignalVariable='_Quotes' Min=8.394680000000001 Max=18.438979999999997  Mean=13.604347 StdDev=2.369165266733412
INFO:pyaf.std:BEST_TRANSOFORMATION_TYPE '_'
INFO:pyaf.std:BEST_DECOMPOSITION  '_Quotes_ConstantTrend_residue_zeroCycle_residue_ARX(10)' [ConstantTrend + NoCycle + ARX]
INFO:pyaf.std:TREND_DETAIL '_Quotes_ConstantTrend' [ConstantTrend]
INFO:pyaf.std:CYCLE_DETAIL '_Quotes_ConstantTrend_residue_zeroCycle' [NoCycle]
INFO:pyaf.std:AUTOREG_DETAIL '_Quotes_ConstantTrend_residue_zeroCycle_residue_ARX(10)' [ARX]
INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0773 MAPE_Forecast=0.0849 MAPE_Test=0.11
INFO:pyaf.std:MODEL_SMAPE SMAPE_Fit=0.075 SMAPE_Forecast=0.0927 SMAPE_Test=0.1191
INFO:pyaf.std:MODEL_MASE MASE_Fit=0.6987 MASE_Forecast=0.8417 MASE_Test=0.7994
INFO:pyaf.std:MODEL_L1 L1_Fit=0.9626864830078556 L1_Forecast=1.1534761793879782 L1_Test=1.8425531526283625
INFO:pyaf.std:MODEL_L2 L2_Fit=1.1670717218220485 L2_Forecast=1.6085552731482886 L2_Test=2.1866032428022977
INFO:pyaf.std:MODEL_COMPLEXITY 7
INFO:pyaf.std:AR_MODEL_DETAIL_START
INFO:pyaf.std:AR_MODEL_COEFF 1 TV.advert_Lag1 -2.1005818786858557
INFO:pyaf.std:AR_MODEL_COEFF 2 _Quotes_ConstantTrend_residue_zeroCycle_residue_Lag1 1.7363372330908362
INFO:pyaf.std:AR_MODEL_COEFF 3 _Quotes_ConstantTrend_residue_zeroCycle_residue_Lag2 -0.2717817063855942
INFO:pyaf.std:AR_MODEL_COEFF 4 TV.advert_Lag9 -0.19440202823219743
INFO:pyaf.std:AR_MODEL_COEFF 5 TV.advert_Lag10 -0.12693071971380282
INFO:pyaf.std:AR_MODEL_COEFF 6 _Quotes_ConstantTrend_residue_zeroCycle_residue_Lag9 0.12521608597557266
INFO:pyaf.std:AR_MODEL_COEFF 7 _Quotes_ConstantTrend_residue_zeroCycle_residue_Lag10 -0.020224752260123477
INFO:pyaf.std:AR_MODEL_DETAIL_END
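The coefficient names in the log (TV.advert_Lag1, TV.advert_Lag9, ...) refer to lagged copies of the exogenous variable and of the residue. As a rough illustration of what such lagged regressors look like, here is a small pandas sketch. The helper `add_lags` and the toy values are hypothetical; only the `<column>_Lag<k>` naming is taken from the log, and whether PyAF builds its lags this way internally is an assumption.

```python
import pandas as pd

def add_lags(df, column, lags):
    """Return a copy of df with one shifted column per requested lag.

    Hypothetical helper: column names follow the '<column>_Lag<k>'
    pattern seen in the log above.
    """
    out = df.copy()
    for k in lags:
        # shift(k) moves values down by k rows, so row t holds the value from t-k.
        out[f"{column}_Lag{k}"] = out[column].shift(k)
    return out

# Toy data, not taken from insurance.csv.
toy = pd.DataFrame({"TV.advert": [1.0, 2.0, 3.0, 4.0]})
lagged = add_lags(toy, "TV.advert", [1, 2])
```

The first k rows of each lagged column are NaN, since no earlier observation exists for them.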

antoinecarme commented 5 years ago

ARX model now OK. Added a test.
SVRX not working. Added a failing test.
XGBX model OK.

antoinecarme commented 5 years ago

ARX Model:

image

antoinecarme commented 5 years ago

SVRX Model :

image

antoinecarme commented 5 years ago

XGBX Model :

image

antoinecarme commented 5 years ago

ARX Model in slow mode (all possible models activated + cross validation)

image

antoinecarme commented 5 years ago

Fixed