statsmodels / statsmodels

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
BSD 3-Clause "New" or "Revised" License

VARMAX simulate method ignores `exog` parameter #8944

Closed fcela closed 1 year ago

fcela commented 1 year ago

When I simulate a VARMAX model with exogenous covariates, I get the warning "ValueWarning: Exogenous array provided, but additional data is not required. `exog` argument ignored." This doesn't seem correct: I was expecting behavior similar to the `simulate` method in SARIMAX.

Code snippet to reproduce:

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.varmax import VARMAX

N = 100
x = np.random.normal(size=N)
e1 = np.random.normal(size=N)
e2 = np.random.normal(size=N)

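# Allocate the two endogenous series (index 0 stays at zero).
y1 = np.zeros(N)
y2 = np.zeros(N)

# Each series depends on the exogenous x and on the lagged y1.
for t in range(1, N):
    y1[t] = 0.5*x[t] - 0.1*y1[t-1] + e1[t]/100
    y2[t] = 0.25*x[t] + 0.25*y1[t-1] + e2[t]/100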

train_test = pd.DataFrame({
    "x": x,
    "y1" : y1,
    "y2" : y2
})

model = VARMAX(
    endog = train_test[["y1", "y2"]].head(70), 
    exog = train_test[["x"]].head(70),
    enforce_stationarity = False,
    order=(1, 0), 
    trend='n', 
    error_cov_type="diagonal"
)
fitted_model = model.fit_constrained(constraints={'L1.y2.y1':0, 'L1.y2.y2': 0})
fitted_model.summary()
fitted_model.simulate(30, exog=train_test[["x"]].tail(30))

Output is:

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            6     M =           10
 This problem is unconstrained.

At X0         0 variables are exactly at the bounds

At iterate    0    f= -1.30252D+00    |proj g|=  4.07593D+01

At iterate    5    f= -2.65903D+00    |proj g|=  4.65846D+00

At iterate   10    f= -2.75976D+00    |proj g|=  2.36165D-01

At iterate   15    f= -2.76032D+00    |proj g|=  6.58599D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    6     17     25      1     0     0   1.802D-04  -2.760D+00
  F =  -2.7603183243356941     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
/usr/local/lib/python3.10/site-packages/statsmodels/tsa/statespace/mlemodel.py:1783: ValueWarning: Exogenous array provided, but additional data is not required. `exog` argument ignored.
  warnings.warn('Exogenous array provided, but additional data'
Out[6]: 
          y1        y2
0   0.535014  0.462652
1   0.629268  0.451893
2  -0.159420  0.063097
3  -0.758637 -0.424261
4   0.302065 -0.046738
5   0.217283  0.118468
6  -0.110604  0.019703
7   0.256496 -0.033961
8   0.768549  0.371450
9   0.432466  0.504444
10 -0.380967 -0.029317
11 -0.375560 -0.242707
12 -0.105171 -0.156479
13 -0.581976 -0.298553
14  0.434974  0.033800
15 -0.776099 -0.231594
16 -0.615464 -0.509869
17  0.198799 -0.097703
18  0.196656  0.162539
19  0.022403 -0.031081
20  0.076672  0.028445
21 -0.059608 -0.026798
22 -0.395845 -0.312222
23 -0.665727 -0.465525
24  0.240130 -0.058407
25 -0.534118 -0.168590
26 -0.460821 -0.409220
27  0.744238  0.261753
28 -0.026358  0.163851
29  0.600574  0.227698

INSTALLED VERSIONS
------------------
Python: 3.10.11.final.0
OS: Linux 4.18.0-372.46.1.el8_6.x86_64 #1 SMP Thu Feb 16 13:46:57 EST 2023 x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8

statsmodels
===========

Installed: 0.13.5 (/usr/local/lib/python3.10/site-packages/statsmodels)

Required Dependencies
=====================

cython: 0.29.33 (/usr/local/lib/python3.10/site-packages/Cython)
numpy: 1.23.5 (/usr/local/lib/python3.10/site-packages/numpy)
scipy: 1.10.1 (/usr/local/lib/python3.10/site-packages/scipy)
pandas: 1.5.3 (/usr/local/lib/python3.10/site-packages/pandas)
    dateutil: 2.8.2 (/usr/local/lib/python3.10/site-packages/dateutil)
patsy: 0.5.3 (/usr/local/lib/python3.10/site-packages/patsy)

Optional Dependencies
=====================

matplotlib: 3.7.1 (/usr/local/lib/python3.10/site-packages/matplotlib)
    backend: module://matplotlib_inline.backend_inline 
cvxopt: Not installed
joblib: 1.2.0 (/usr/local/lib/python3.10/site-packages/joblib)

Developer Tools
================

IPython: 7.34.0 (/usr/local/lib/python3.10/site-packages/IPython)
    jinja2: 3.1.2 (/usr/local/lib/python3.10/site-packages/jinja2)
sphinx: Not installed
    pygments: 2.14.0 (/usr/local/lib/python3.10/site-packages/pygments)
pytest: 6.2.5 (/usr/local/lib/python3.10/site-packages/pytest)
virtualenv: 20.21.0 (/usr/local/lib/python3.10/site-packages/virtualenv)
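
fcela commented 1 year ago

Looking into the source code, I see I had forgotten that `anchor` defaults to 0 (the start of the sample), not to the end. With the default anchor, the 30 simulated periods fall inside the 70 in-sample observations, so no additional `exog` data is required and the array I passed is ignored. Replacing the last line with the following gives the desired result: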

fitted_model.simulate(30, anchor='end', exog=train_test[["x"]].tail(30))
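
To make the role of `anchor` concrete, here is a minimal sketch of the bookkeeping involved. The helper `extra_exog_rows_needed` is hypothetical and is not part of statsmodels. It only illustrates the assumption that out-of-sample `exog` rows are needed for whatever portion of the simulation extends past the last in-sample observation, and it assumes a non-negative integer or the strings "start" / "end" for the anchor.

```python
def extra_exog_rows_needed(nobs, nsimulations, anchor="start"):
    """Hypothetical helper (not a statsmodels function).

    Returns how many simulated periods fall beyond the in-sample data,
    i.e. how many additional exog rows a simulation would have to be given.
    """
    if anchor == "start":
        start = 0          # simulate from the first in-sample observation
    elif anchor == "end":
        start = nobs       # simulate from the period right after the sample
    else:
        start = int(anchor)  # assumed: a non-negative integer position
    return max(0, start + nsimulations - nobs)


# Default anchor: all 30 periods lie inside the 70 in-sample observations,
# so no extra exog is needed and a supplied array would be ignored.
in_sample = extra_exog_rows_needed(nobs=70, nsimulations=30)

# anchor="end": all 30 periods are out of sample, so 30 exog rows are needed.
out_of_sample = extra_exog_rows_needed(nobs=70, nsimulations=30, anchor="end")

print(in_sample, out_of_sample)  # 0 30
```

Under these assumptions, the original call needed zero extra rows, which matches the warning, while the `anchor='end'` call needs exactly the 30 rows supplied by `train_test[["x"]].tail(30)`.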