jmaih / RISE_toolbox

Solution and estimation of Markov Switching Rational Expectations / DSGE Models
BSD 3-Clause "New" or "Revised" License

Different estimation results with the same model #120

Closed · tonyxia324 closed this issue 4 years ago

tonyxia324 commented 5 years ago

Hi Junior,

After estimating a DSGE model multiple times, I found that the estimation results can vary significantly from run to run, even though it is always the same model with the same priors. I have posted two estimation results below, both obtained from the same model. I'm not sure what I can do to improve the consistency of my estimation. Is this a sign that there are not enough observations in my data? Thank you in advance!

Best, Tony

Result 1: MODEL ESTIMATION RESULTS

                    distribution    initval      mode        mode_std 
                    ____________    _______    _________    __________

tau                 BETA             0.62        0.39064         27564
kappa               GAMMA            0.32         3.1692         49604
alpha               BETA             0.13        0.83534          7391
rss                 BETA             0.02      0.0071788        3097.8
rhoq                BETA             0.36       0.021025        5945.7
rhoystar            BETA             0.95         0.8966         13561
rhoPaistar          BETA             0.41        0.20775         27703
rhoz                BETA             0.49        0.50455         66939
rhor                BETA             0.89        0.76227         69530
gammay              GAMMA            0.97      0.0043915         14466
gammaPai            GAMMA               3         1.4486    4.1211e+05
gammae              GAMMA           0.001      8.072e-11        1.3605
sigr                INV_GAMMA       0.002      0.0020311        57.578
sigq                INV_GAMMA        0.02       0.044974        438.65
sigystar            INV_GAMMA       0.005      0.0032994        325.08
sigPaistar          INV_GAMMA       0.005      0.0058258        7.0559
sigz                INV_GAMMA       0.019      0.0057679        46.802
sigy                INV_GAMMA       0.002      0.0030818        118.87
sigPai              INV_GAMMA       0.012       0.046394        4433.8
stderr_obspi        INV_GAMMA       0.012      0.0070002        607.39
stderr_obspistar    INV_GAMMA       0.012      0.0069991        6604.1
stderr_obse         INV_GAMMA       0.012         5.8826    3.0135e+05
stderr_obsq         INV_GAMMA       0.012      0.0069995        1237.8

log-post: -1020.5642    log-lik: -1020.8446    log-prior: 0.2803    log-endog_prior: 0.0000
numberOfActiveInequalities: 0    log-MDD(Laplace): -1098.5573010
estimation sample is: 2004Q1 : 2017Q3 (55 observations)
solution algorithm is: rise_1
estimation algorithm is: fmincon
number of estimated parameters is: 23
number of function evaluations is: 10128

start time: 07-Oct-2019 18:11:29    end time: 07-Oct-2019 18:17:40    total time: 0:6:11

List of issues

none

Result 2: MODEL ESTIMATION RESULTS

                    distribution    initval       mode         mode_std   
                    ____________    _______    __________    _____________

tau                 BETA             0.62         0.18662    0+17241i     
kappa               GAMMA            0.32       0.0098385    0+1.4631e-13i
alpha               BETA             0.13         0.95356    0+3278.2i    
rss                 BETA             0.02       0.0023873    0+95.296i    
rhoq                BETA             0.36         0.12705    0+2580i      
rhoystar            BETA             0.95         0.69956    0+17309i     
rhoPaistar          BETA             0.41         0.01364    0+140.31i    
rhoz                BETA             0.49         0.74528    0+11680i     
rhor                BETA             0.89       0.0002951    0+3946.5i    
gammay              GAMMA            0.97         0.11389    0+18112i     
gammaPai            GAMMA               3         0.99301    0+1.042e+05i 
gammae              GAMMA           0.001      9.5279e-05    0+7.6502i    
sigr                INV_GAMMA       0.002       0.0058349    0+160.96i    
sigq                INV_GAMMA        0.02        0.046963    0+1709.2i    
sigystar            INV_GAMMA       0.005       0.0029245    0+191.66i    
sigPaistar          INV_GAMMA       0.005       0.0057202    0+61.089i    
sigz                INV_GAMMA       0.019       0.0042871    0+515.35i    
sigy                INV_GAMMA       0.002       0.0060382    0+186.88i    
sigPai              INV_GAMMA       0.012        0.045556    0+2762.4i    
stderr_obspi        INV_GAMMA       0.012       0.0041055    0+305.18i    
stderr_obspistar    INV_GAMMA       0.012       0.0062882    0+45.775i    
stderr_obse         INV_GAMMA       0.012          2.3369    0+3.9889e+05i
stderr_obsq         INV_GAMMA       0.012        0.024979    0+374.45i    

log-post: -1247.7500    log-lik: -1259.3472    log-prior: 11.5973    log-endog_prior: 0.0000
numberOfActiveInequalities: 0    log-MDD(Laplace): -1377.5057001
estimation sample is: 2004Q1 : 2017Q3 (55 observations)
solution algorithm is: rise_1
estimation algorithm is: fmincon
number of estimated parameters is: 23
number of function evaluations is: 2068

start time: 07-Oct-2019 18:18:31    end time: 07-Oct-2019 18:19:39    total time: 0:1:9

List of issues

none

jmaih commented 5 years ago

Hi Tony,

This is to be expected, and it shows that (1) the objective function is multi-modal, and (2) the optimizer you are using is not able to find the global peak, at least from your starting values.

One solution is to take the best of the peaks found. An even better solution is to use an optimizer with stronger exploration and exploitation capabilities, so that you stand a chance of finding something that resembles the global peak.
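
To make the "best of the peaks" idea concrete, here is a minimal multi-start sketch in plain MATLAB. This is generic code, not RISE's estimation API; `my_log_posterior`, the bounds `lb`/`ub`, and the number of restarts are hypothetical placeholders.

```matlab
% Multi-start local optimization: run fmincon from several random starting
% points and keep the best peak found across all runs.
rng(0);                                            % reproducible restarts
nstarts = 20;                                      % number of random restarts
lb = zeros(1,23);  ub = ones(1,23);                % placeholder parameter bounds
neg_log_post = @(theta) -my_log_posterior(theta);  % hypothetical objective

best_f = inf;  best_theta = [];
opts = optimoptions('fmincon','Display','off');
for s = 1:nstarts
    theta0 = lb + (ub - lb).*rand(size(lb));       % random start within bounds
    [th,f,flag] = fmincon(neg_log_post,theta0,[],[],[],[],lb,ub,[],opts);
    if flag > 0 && f < best_f                      % keep the highest peak so far
        best_f = f;  best_theta = th;
    end
end
```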

Optimization is not an easy problem in general.

Keep up the good work,

J.


tonyxia324 commented 5 years ago

Hi Junior,

Thank you for your quick response. Do you think bee_gate would be a more 'robust' optimizer? I'm not really concerned with speed right now; my main focus is first to find a way to generate consistent results.

Best, Tony

jmaih commented 5 years ago

Hi Tony,

More robust than what?

Bee_gate has very good exploration capabilities as long as you let it run for a long time. Sometimes it helps to use a derivative-based optimization algorithm to finish off the job after bee_gate has sufficiently explored the landscape.
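
As a rough illustration of that two-stage strategy in plain MATLAB (not RISE's API): here, particleswarm from MATLAB's Global Optimization Toolbox stands in for bee_gate's exploratory stage, and `my_log_posterior`, `lb`, `ub` are placeholders.

```matlab
% Stage 1: exploratory global search (stand-in for bee_gate).
neg_log_post = @(theta) -my_log_posterior(theta);   % hypothetical objective
lb = zeros(1,23);  ub = ones(1,23);                 % placeholder bounds
pso_opts = optimoptions('particleswarm','SwarmSize',200,'MaxIterations',2000);
theta_global = particleswarm(neg_log_post,numel(lb),lb,ub,pso_opts);

% Stage 2: a derivative-based local optimizer finishes off the job.
loc_opts  = optimoptions('fmincon','Algorithm','interior-point');
theta_hat = fmincon(neg_log_post,theta_global,[],[],[],[],lb,ub,[],loc_opts);
```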

Cheers, J.


jkrenz commented 2 years ago

I was wondering what it means that there are imaginary numbers in mode_std? (We have similar issues.)

jmaih commented 2 years ago

I was wondering what it means that there are imaginary numbers in mode_std? (We have similar issues.)

If the optimization does not reach a peak, you won't get a positive definite Hessian. If you do not have a positive definite Hessian, you won't have a positive definite covariance matrix. If you do not have a positive definite covariance matrix, you may get negative variances for at least some parameters. And if a variance is negative, its standard error will be a complex number.
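
A tiny toy illustration of that chain in plain MATLAB (made-up numbers, unrelated to the model above):

```matlab
H = [4 0; 0 -2];      % "Hessian" at the candidate mode: not positive definite
V = inv(H);           % implied covariance matrix: diag(V) = [0.25, -0.5]
se = sqrt(diag(V));   % sqrt of a negative variance -> complex standard error
disp(se)              % prints 0.5000 and 0 + 0.7071i, like the mode_std above
```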

Sometimes the landscape you optimize over is just too rough, with thin ridges and winding paths, unexpected cliffs, etc. Sometimes, some elements of the final parameter vector may lie up against the boundaries of the parameter space, pushing the standard deviation toward zero.

There are various ways to cure the problem, for instance using more informative priors, which naturally comes at the expense of exploring potentially interesting areas of the parameter space. One could also change the optimizer, but there is never a guarantee that this by itself will yield well-behaved standard errors.

From a Bayesian perspective, the real question is not whether you have complex standard deviations or not. The real question is what you need the covariance matrix for. In a Bayesian analysis you never present standard errors (as you do in a frequentist analysis), except perhaps when you have computed them from a simulation of the posterior distribution. But then, in order to simulate the posterior distribution, some algorithms, like Metropolis-Hastings, require a well-behaved covariance matrix.

The good news is that the theory does not say that the covariance matrix used in that exercise has to be the one obtained by maximizing the posterior distribution. In fact, many Bayesians never maximize the posterior distribution before drawing samples from it. They do it the other way around: they sample first, and then, if needed, the (found) mode of the distribution is simply the parameter vector with the highest posterior value.

So if you want to simulate the posterior distribution (by e.g. Metropolis-Hastings), strictly speaking, all you need is a good covariance matrix. Other samplers, like the slice sampler, do not even require a covariance matrix.
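
For instance, here is a toy sketch of the sample-first approach using MATLAB's slicesample from the Statistics and Machine Learning Toolbox (the log-posterior here is a stand-in, not a DSGE posterior):

```matlab
% Slice sampling needs only a log-posterior, no covariance matrix.
logpost = @(theta) -0.5*sum(theta.^2);          % toy log-posterior (2-D normal)
draws   = slicesample([0 0],5000,'logpdf',logpost);

% The "mode" is then simply the draw with the highest posterior value.
lp = arrayfun(@(i) logpost(draws(i,:)), (1:size(draws,1))');
[~,imax]   = max(lp);
theta_mode = draws(imax,:);
```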