wbnicholson / BigVAR

Dimension Reduction Methods for Multivariate Time Series

python README sample: not repeatable #22

Closed: extrospective closed this issue 3 years ago

extrospective commented 3 years ago

The rolling_validate(mod) call fails on some data sets. I first hit this with my own data, then reproduced it with the README sample code.

The sample code does not currently seed the random number generator, so MultVARSim generates a different array on every run and the output varies. Setting the seed explicitly shows that rolling_validate(mod) fails for many seed values.

I wrapped the code in a loop that sets the random seed on each iteration. I initially started from a lower seed, then adjusted the range so that it shows one success at seed=5 followed by a failure at seed=6:

import numpy as np
from BigVAR.BigVARSupportFunctions import MultVARSim, CreateCoefMat
from BigVAR.BigVARClass import BigVAR,rolling_validate

k=3;p=4

# example coefficient matrix
B1=np.array([[.4,-.02,.01],[-.02,.3,.02],[.01,.04,0.3]])
B2=np.array([[.2,0,0],[0,.3,0],[0,0,0.13]])
B=np.concatenate((B1,B2),axis=1)
B=np.concatenate((B,np.zeros((k,2*k))),axis=1)
#print(B)
A=CreateCoefMat(B,p,k)

for sd in np.arange(5,20):
    print(f'Random seed {sd}')
    np.random.seed(sd)
    Y=MultVARSim(A,p,k,0.01*np.identity(3),T=500)
    VARX={}

    # construct BigVAR object:
    # Arguments:
    # Y: T x k multivariate time series
    # p: lag order
    # penalty structure (only Basic and BasicEN supported)
    # granularity (depth of grid and number of gridpoints)
    # T1: Start of rolling validation
    # T2: End of rolling validation
    # alpha: elastic net alpha candidate
    # VARX: VARX specifications as dict with keys k (number of endogenous series), s (lag order of exogenous series)

    mod=BigVAR(Y,p,"Basic",[50,10],50,80,alpha=0.4,VARX=VARX)

    res=rolling_validate(mod)

    # coefficient matrix
    print(res.B)

    # out of sample MSFE
    print(res.oos_msfe)

    # optimal lambda
    print(res.opt_lambda)
    print('-------------')

The failure when seed=6 is:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-f464d9f4ab81> in <module>
     34     mod=BigVAR(Y,p,"Basic",[50,10],50,80,alpha=0.4,VARX=VARX)
     35 
---> 36     res=rolling_validate(mod)
     37 
     38     # coefficient matrix

e:\repo\bigvar\python\BigVAR\BigVARClass.py in rolling_validate(self)
    208     oos_aic = eval_ar(Y, T2, Z1.shape[1], 'aic', p, loss)
    209 
--> 210     oos_bic = eval_ar(Y, T2, Z1.shape[1], 'bic', p, loss)
    211 
    212     oos_mean = eval_mean(Y, T2, Z1.shape[1], loss, p)

e:\repo\bigvar\python\BigVAR\BigVARSupportFunctions.py in eval_ar(Y, T1, T2, ic, p, loss)
    142         mod = var_mod.fit(maxlags=p, ic=ic)
    143         lag_order = mod.k_ar
--> 144         yhat = mod.forecast(trainY[-lag_order:], 1)
    145         MSFE_temp = calc_loss(Y[u+p, :]-yhat, loss)
    146         MSFE.append(MSFE_temp)

d:\Anaconda3\envs\py39\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py in forecast(self, y, steps, exog_future)
   1083         else:
   1084             exog_future = np.column_stack(exogs)
-> 1085         return forecast(y, self.coefs, trend_coefs, steps, exog_future)
   1086 
   1087     # TODO: use `mse` module-level function?

d:\Anaconda3\envs\py39\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py in forecast(y, coefs, trend_coefs, steps, exog)
    227     """
    228     p = len(coefs)
--> 229     k = len(coefs[0])
    230     # initial value
    231     forcs = np.zeros((steps, k))

IndexError: index 0 is out of bounds for axis 0 with size 0

If I restart the kernel and start at seed=6, it fails with the same error on the first iteration, so the failure does not appear to depend on state left over from earlier iterations.
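A seed sweep like the loop above stops at the first failing seed. A minimal sketch of a sweep that records every failing seed instead is shown here. The names find_failing_seeds and toy_run are hypothetical and not part of BigVAR. toy_run is only a stand-in that raises the same kind of IndexError, so the sketch runs without BigVAR installed; in practice the body of the loop above would take its place.

```python
import numpy as np


def find_failing_seeds(run_once, seeds, exc_type=IndexError):
    """Call run_once() under each seed and return {seed: error message}.

    Hypothetical helper: it reseeds NumPy's global generator before every
    call, the same way the loop above does with np.random.seed(sd).
    """
    failures = {}
    for seed in seeds:
        np.random.seed(int(seed))
        try:
            run_once()
        except exc_type as err:
            # Record the failure and keep going instead of aborting the sweep.
            failures[int(seed)] = str(err)
    return failures


def toy_run():
    """Stand-in for the BigVAR pipeline (an assumption, for illustration only).

    It draws a lag order at random; a lag order of 0 gives an empty
    coefficient array, and indexing its first element raises IndexError.
    """
    lag_order = np.random.randint(0, 3)
    coefs = np.zeros((lag_order, 3, 3))
    return len(coefs[0])


failures = find_failing_seeds(toy_run, range(5, 20))
print("failing seeds:", sorted(failures))
```

Because each call is reseeded, a seed that fails in the sweep also fails when run on its own, which is the same check as restarting the kernel at seed=6.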

wbnicholson commented 3 years ago

This failure occurs because the VAR fit selects a lag order of zero, which leaves the coefficient array empty and causes the error in the statsmodels VAR forecast method shown in the traceback. It has been corrected in https://github.com/wbnicholson/BigVAR/commit/8fd3944ea5ac29b6db10a875ec4bbcb7114fa180.
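To illustrate the lag-order-zero case, here is a minimal NumPy-only sketch. It is not the code from the linked commit, and one_step_forecast is a hypothetical name. It assumes the coefficients are stored as a (p, k, k) array. With p = 0, indexing the first lag matrix raises the IndexError from the traceback, while a forecast that loops over the lags falls back to the intercept.

```python
import numpy as np


def one_step_forecast(history, coefs, intercept):
    """One-step VAR forecast that tolerates a lag order of zero.

    history:   (n_obs, k) array, most recent observation last
    coefs:     (p, k, k) array of lag matrices; p may be 0
    intercept: (k,) array
    Hypothetical helper for illustration; not BigVAR or statsmodels code.
    """
    coefs = np.asarray(coefs, dtype=float)
    history = np.asarray(history, dtype=float)
    forecast = np.array(intercept, dtype=float)  # copy, so the input is untouched
    p = coefs.shape[0]
    # With p == 0 this loop never runs and the forecast is the intercept alone.
    for lag in range(1, p + 1):
        forecast += coefs[lag - 1] @ history[-lag]
    return forecast


k = 3
empty_coefs = np.empty((0, k, k))  # what a lag order of zero amounts to

# Indexing the first lag matrix of an empty array reproduces the error type.
try:
    len(empty_coefs[0])
except IndexError as err:
    print("IndexError:", err)

history = np.array([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]])
intercept = np.array([0.1, 0.2, 0.3])
print(one_step_forecast(history, empty_coefs, intercept))
```

Falling back to the intercept is one possible design choice; forcing a minimum lag order of one when fitting is another. Which of these the fix uses is not stated in this thread.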