I believe there is still an issue in pymc-experimental with the definition of disturbance variances, at least when using pymc-experimental to generate a synthetic dataset. In my test I defined a synthetic dataset consisting of an integrated random walk (smooth trend) with trend disturbance standard deviation of sigma_trend = 0.01, two cycles, and measurement noise. Thereafter, I run statsmodels unobserved components using the same model components. Statsmodels provides a ML estimate of the trend disturbance VARIANCE of sigma2.trend = 0.0135 (very close to the trend disturbance STANDARD DEVIATION I set to generate the synthetic dataset). When I solve a restricted problem with sigma2.trend = 0.0001 (which is sigma_trend^2), I get a much too smooth trend, far off the true trend. The code to generate the synthetic dataset is:
I believe there is still an issue in pymc-experimental with the definition of disturbance variances, at least when using pymc-experimental to generate a synthetic dataset. In my test I defined a synthetic dataset consisting of an integrated random walk (smooth trend) with trend disturbance standard deviation of sigma_trend = 0.01, two cycles, and measurement noise. Thereafter, I run statsmodels unobserved components using the same model components. Statsmodels provides a ML estimate of the trend disturbance VARIANCE of sigma2.trend = 0.0135 (very close to the trend disturbance STANDARD DEVIATION I set to generate the synthetic dataset). When I solve a restricted problem with sigma2.trend = 0.0001 (which is sigma_trend^2), I get a much too smooth trend, far off the true trend. The code to generate the synthetic dataset is:
obs_noise = st.MeasurementError(name="obs") IRW = st.LevelTrendComponent(order=2, innovations_order=[0, 1]) cycle1 = st.CycleComponent(name="annual_cycle", cycle_length=12, innovations=True) cycle2 = st.CycleComponent(name="SA_cycle", cycle_length=5.347, innovations=True)
param_dict = { "initial_trend": np.zeros((2,)), "sigma_trend": np.array([1e-2]), "annual_cycle": np.array([0., 10.]), "sigma_annual_cycle": np.array([1e-6]), "SA_cycle": np.array([20., 0.]), "sigma_SA_cycle": np.array([1e-6]), "sigma_obs": np.array([10.]), }
The model is build with using statsmodels:
frequencies = [{'period': 12, 'harmonics': 1}]
model_settings = { 'irregular': True, 'level': True, 'stochastic_level': False, 'trend': True, 'stochastic_trend': True, 'cycle': True, 'damped_cycle': False, 'stochastic_cycle': False, 'cycle_period_bounds': [3., 20.], 'freq_seasonal': frequencies, 'stochastic_freq_seasonal': [False], 'use_exact_diffuse': False, }
model = sm.tsa.UnobservedComponents(data['meas'], **model_settings)
And the restricted model, which gives results very close to the true model, is
with model.fix_params({'sigma2.trend': 0.01}): result_constrained = model.fit()