pymc-devs / pymc-experimental

https://pymc-experimental.readthedocs.io
Other
72 stars 46 forks source link

Persistent issue with state disturbance variances versus state disturbance standard deviations #327

Closed rklees closed 2 months ago

rklees commented 2 months ago

I believe there is still an issue in pymc-experimental with the definition of disturbance variances, at least when using pymc-experimental to generate a synthetic dataset. In my test I defined a synthetic dataset consisting of an integrated random walk (smooth trend) with trend disturbance standard deviation of sigma_trend = 0.01, two cycles, and measurement noise. Thereafter, I run statsmodels unobserved components using the same model components. Statsmodels provides a ML estimate of the trend disturbance VARIANCE of sigma2.trend = 0.0135 (very close to the trend disturbance STANDARD DEVIATION I set to generate the synthetic dataset). When I solve a restricted problem with sigma2.trend = 0.0001 (which is sigma_trend^2), I get a much too smooth trend, far off the true trend. The code to generate the synthetic dataset is:

obs_noise = st.MeasurementError(name="obs") IRW = st.LevelTrendComponent(order=2, innovations_order=[0, 1]) cycle1 = st.CycleComponent(name="annual_cycle", cycle_length=12, innovations=True) cycle2 = st.CycleComponent(name="SA_cycle", cycle_length=5.347, innovations=True)

param_dict = { "initial_trend": np.zeros((2,)), "sigma_trend": np.array([1e-2]), "annual_cycle": np.array([0., 10.]), "sigma_annual_cycle": np.array([1e-6]), "SA_cycle": np.array([20., 0.]), "sigma_SA_cycle": np.array([1e-6]), "sigma_obs": np.array([10.]), }

The model is build with using statsmodels:

frequencies = [{'period': 12, 'harmonics': 1}]

model_settings = { 'irregular': True, 'level': True, 'stochastic_level': False, 'trend': True, 'stochastic_trend': True, 'cycle': True, 'damped_cycle': False, 'stochastic_cycle': False, 'cycle_period_bounds': [3., 20.], 'freq_seasonal': frequencies, 'stochastic_freq_seasonal': [False], 'use_exact_diffuse': False, }

model = sm.tsa.UnobservedComponents(data['meas'], **model_settings)

And the restricted model, which gives results very close to the true model, is

with model.fix_params({'sigma2.trend': 0.01}): result_constrained = model.fit()