issue with the EC parameterization

profkenm commented 4 years ago

Hello,

I am very interested in using your ARDL package, but I am having a problem with it. I must be missing something, and I would kindly ask for your assistance.

Please note that my pastes from R and Stata are unreadable, so I have attached this entire message as a PDF with readable output: note on github re problem w ardl function.pdf

When I replicate an ardl model from your ARDL package in Stata, the results are the same in both. However, when I replicate the same model in error correction form, the results are different. In particular, although the model fit, intercept, and error correction terms are the same in both, the long-run and short-run coefficients are different.

Consider the following ARDL model, run in R and, separately, in Stata. Parameterized as an ARDL, the results are identical.

In R:

> library(ARDL)
> data(denmark)
> ardl_3132 <- ardl(LRM ~ LRY + IBO + IDE, data = denmark, order = c(3,1,3,2))
> summary(ardl_3132)

Time series regression with "zooreg" data:
Start = 1974 Q4, End = 1987 Q3

Call:
dynlm::dynlm(formula = full_formula, data = data, start = start, end = end)

Residuals:
      Min        1Q    Median        3Q       Max
-0.029939 -0.008856 -0.002562  0.008190  0.072577

Coefficients:
            Estimate Std. Error t value   Pr(>|t|)
(Intercept)   2.6202     0.5678   4.615 0.00004187 ***
L(LRM, 1)     0.3192     0.1367   2.336   0.024735 *
L(LRM, 2)     0.5326     0.1324   4.024   0.000255 ***
L(LRM, 3)    -0.2687     0.1021  -2.631   0.012143 *
LRY           0.6728     0.1312   5.129 0.00000832 ***
L(LRY, 1)    -0.2574     0.1472  -1.749   0.088146 .
IBO          -1.0785     0.3217  -3.353   0.001790 **
L(IBO, 1)    -0.1062     0.5858  -0.181   0.857081
L(IBO, 2)     0.2877     0.5691   0.505   0.616067
L(IBO, 3)    -0.9947     0.3925  -2.534   0.015401 *
IDE           0.1255     0.5545   0.226   0.822161
L(IDE, 1)    -0.3280     0.7213  -0.455   0.651847
L(IDE, 2)     1.4079     0.5520   2.550   0.014803 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0191 on 39 degrees of freedom
Multiple R-squared:  0.988,  Adjusted R-squared:  0.9843
F-statistic: 266.8 on 12 and 39 DF,  p-value: < 0.00000000000000022

In Stata:

. ardl LRM LRY IBO IDE , lags(3,1,3,2)

ARDL(3,1,3,2) regression

Sample: 4 - 55
Number of obs  =         52
F( 12, 39)     =     266.82
Prob > F       =     0.0000
R-squared      =     0.9880
Adj R-squared  =     0.9843
Log likelihood =  139.51294
Root MSE       =     0.0191

------------------------------------------------------------------------------
         LRM |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         LRM |
         L1. |   .3192077   .1366567     2.34   0.025     .0427934    .5956219
         L2. |   .5326063    .132361     4.02   0.000     .2648809    .8003317
         L3. |  -.2686663   .1021345    -2.63   0.012    -.4752529   -.0620798
             |
         LRY |
         --. |   .6727993   .1311638     5.13   0.000     .4074955     .938103
         L1. |  -.2574193   .1471752    -1.75   0.088    -.5551092    .0402706
             |
         IBO |
         --. |  -1.078518   .3217011    -3.35   0.002     -1.72922   -.4278161
         L1. |  -.1061973   .5857973    -0.18   0.857    -1.291084     1.07869
         L2. |   .2876689   .5691013     0.51   0.616    -.8634472    1.438785
         L3. |  -.9946781   .3925147    -2.53   0.015    -1.788614   -.2007421
             |
         IDE |
         --. |   .1254643   .5544522     0.23   0.822    -.9960211     1.24695
         L1. |  -.3279847   .7213227    -0.45   0.652    -1.786998    1.131028
         L2. |   1.407857   .5520352     2.55   0.015     .2912608    2.524454
             |
       _cons |   2.620192   .5677679     4.61   0.000     1.471773    3.768611
------------------------------------------------------------------------------

However, when using the error correction parameterization of the same model, the model fit, the intercept, and the adjustment parameter are the same in both, but the long-run and short-run coefficients are very different.

EC in R

> uecm_3132 <- uecm(ardl_3132, case = 3)
> summary(uecm_3132)

Time series regression with "zooreg" data:
Start = 1974 Q4, End = 1987 Q3

Call:
dynlm::dynlm(formula = full_formula, data = data, start = start, end = end)

Residuals:
      Min        1Q    Median        3Q       Max
-0.029939 -0.008856 -0.002562  0.008190  0.072577

Coefficients:
             Estimate Std. Error t value   Pr(>|t|)
(Intercept)   2.62019    0.56777   4.615 0.00004187 ***
L(LRM, 1)    -0.41685    0.09166  -4.548 0.00005154 ***
L(LRY, 1)     0.41538    0.11761   3.532    0.00108 **
L(IBO, 1)    -1.89172    0.39111  -4.837 0.00002093 ***
L(IDE, 1)     1.20534    0.44690   2.697    0.01028 *
d(L(LRM, 1)) -0.26394    0.10192  -2.590    0.01343 *
d(L(LRM, 2))  0.26867    0.10213   2.631    0.01214 *
d(LRY)        0.67280    0.13116   5.129 0.00000832 ***
d(IBO)       -1.07852    0.32170  -3.353    0.00179 **
d(L(IBO, 1))  0.70701    0.46874   1.508    0.13953
d(L(IBO, 2))  0.99468    0.39251   2.534    0.01540 *
d(IDE)        0.12546    0.55445   0.226    0.82216
d(L(IDE, 1)) -1.40786    0.55204  -2.550    0.01480 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0191 on 39 degrees of freedom
Multiple R-squared:  0.7458,  Adjusted R-squared:  0.6676
F-statistic: 9.537 on 12 and 39 DF,  p-value: 0.00000003001

EC in Stata

. ardl LRM LRY IBO IDE , lags(3,1,3,2) ec

ARDL(3,1,3,2) regression

Sample: 4 - 55
Number of obs  =         52
R-squared      =     0.7458
Adj R-squared  =     0.6676
Log likelihood =  139.51294
Root MSE       =     0.0191

------------------------------------------------------------------------------
       D.LRM |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ          |
         LRM |
         L1. |  -.4168524   .0916574    -4.55   0.000    -.6022471   -.2314577
-------------+----------------------------------------------------------------
LR           |
         LRY |   .9964676    .123931     8.04   0.000     .7457935    1.247142
         IBO |  -4.538116   .5202961    -8.72   0.000    -5.590514   -3.485718
         IDE |    2.89152   .9950853     2.91   0.006     .8787701     4.90427
-------------+----------------------------------------------------------------
SR           |
         LRM |
         LD. |  -.2639399   .1019171    -2.59   0.013    -.4700868   -.0577931
        L2D. |   .2686663   .1021345     2.63   0.012     .0620798    .4752529
             |
         LRY |
         D1. |   .2574193   .1471752     1.75   0.088    -.0402706    .5551092
             |
         IBO |
         D1. |   .8132065   .4838924     1.68   0.101    -.1655583    1.791971
         LD. |   .7070092   .4687392     1.51   0.140    -.2411053    1.655124
        L2D. |   .9946781   .3925147     2.53   0.015     .2007421    1.788614
             |
         IDE |
         D1. |  -1.079873    .565982    -1.91   0.064    -2.224679     .064934
         LD. |  -1.407857   .5520352    -2.55   0.015    -2.524454   -.2912608
             |
       _cons |   2.620192   .5677679     4.61   0.000     1.471773    3.768611
------------------------------------------------------------------------------

People in my field have been using the Stata version. Any comments or suggestions about what is going on here would be warmly appreciated.

Thank you, and thanks for the ARDL package.

Ken

Natsiopoulos commented 4 years ago

Hi Ken,

Thank you very much for your interest in the ARDL package and for the detailed question. I apologize in advance for the long answer; I just want to be as clear as possible.

In a nutshell:

If you are interested in the EC regression results, the ARDL R package result is the right one.
The Stata results are not nonsense, but they are not what I personally expect when I type "ardl (model specification) ec" in Stata.

What I mean: almost the same variable names appear in the Stata and in the R results. Because the variables in the Stata output look so similar to those in the EC model (except that in Stata the independent variables appear in levels rather than lagged once), the user is led to believe that the Stata table shows the regression results.

We can confirm that the results of the ARDL package are the correct ones by simply running the regression directly. First, extract the model formula (following your code):

> uecm_3132$full_formula
d(LRM) ~ L(LRM, 1) + L(LRY, 1) + L(IBO, 1) + L(IDE, 1) + d(L(LRM, 1)) + d(L(LRM, 2)) + d(LRY) + d(IBO) + d(L(IBO, 1)) + d(L(IBO, 2)) + d(IDE) + d(L(IDE, 1))

Then copy and paste it and run the regression:

> library(dynlm)
> uecm_lm <- dynlm(d(LRM) ~ L(LRM, 1) + L(LRY, 1) + L(IBO, 1) + L(IDE, 1) + d(L(LRM, 1)) + d(L(LRM, 2)) + d(LRY) + d(IBO) + d(L(IBO, 1)) + d(L(IBO, 2)) + d(IDE) + d(L(IDE, 1)), data = denmark)

Now we can check that the EC model from the ARDL package is exactly this regression:

> identical(summary(uecm_3132)$coef, summary(uecm_lm)$coef)
[1] TRUE
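The reason the fit, intercept, and adjustment coefficient agree in both programs is that the levels (ARDL) form and the unrestricted EC form are the same regression written in two ways. Here is a minimal base R sketch of that on simulated data. It assumes a hypothetical ARDL(1,1) with made-up variable names (y, x) and made-up parameter values, and uses plain lm() instead of the ARDL package.

```r
# Hypothetical ARDL(1,1) data: the names y, x and every parameter value are made up.
set.seed(1)
n <- 200
x <- cumsum(rnorm(n))                       # random-walk regressor
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 1 + 0.6 * y[t - 1] + 0.5 * x[t] - 0.2 * x[t - 1] + rnorm(1, sd = 0.1)
}

# Build lags and first differences by hand (observations 2..n).
d <- data.frame(y = y[-1], y_l1 = y[-n], x = x[-1], x_l1 = x[-n])
d$dy <- d$y - d$y_l1
d$dx <- d$x - d$x_l1

ardl_fit <- lm(y  ~ y_l1 + x + x_l1, data = d)   # levels (ARDL) form
uecm_fit <- lm(dy ~ y_l1 + x_l1 + dx, data = d)  # unrestricted EC form

a <- coef(ardl_fit)
u <- coef(uecm_fit)

# Same residuals and intercept; EC coefficients are linear combinations of the ARDL ones.
same_residuals <- isTRUE(all.equal(unname(resid(ardl_fit)), unname(resid(uecm_fit))))
same_intercept <- isTRUE(all.equal(unname(a["(Intercept)"]), unname(u["(Intercept)"])))
adj_matches    <- isTRUE(all.equal(unname(a["y_l1"] - 1), unname(u["y_l1"])))
level_matches  <- isTRUE(all.equal(unname(a["x"] + a["x_l1"]), unname(u["x_l1"])))
impact_matches <- isTRUE(all.equal(unname(a["x"]), unname(u["dx"])))

# R-squared is the one fit statistic that changes, since the dependent variable differs.
r2_differs <- summary(ardl_fit)$r.squared != summary(uecm_fit)$r.squared
```

This mirrors what the outputs in this thread show: the residual standard error is 0.0191 in both forms, while R-squared drops from 0.988 to 0.7458 only because d(LRM) replaces LRM on the left-hand side.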

About the Stata results, I don't know the underlying logic of the algorithm, but from what I see I conclude the following:

The _cons and the ADJ come from the regression, as do the lagged differences of the other variables. (A quick note: the coefficient of the lagged dependent variable in the unrestricted ECM happens to be the same as the coefficient of the adjustment factor in the restricted ECM, but the standard errors in the unrestricted and the restricted EC forms are not the same. So, strictly speaking, what appears in Stata as the standard error of the ADJ is the standard error of the lagged dependent variable, and all the associated statistics (t, p-value, CI) are wrong too.)
The LR coefficients (independent variables in levels) are actually correct! They are estimated as functions of the regression coefficients, but they are not themselves part of the regression. You can see the mathematical form of the calculations in the help file of the multipliers function (section "Mathematical Formula", under "As derived from an Unrestricted ECM"): https://www.rdocumentation.org/packages/ARDL/versions/0.1.0/topics/multipliers
See the corresponding ECM notation in https://www.rdocumentation.org/packages/ARDL/versions/0.1.0/topics/uecm
In R we can get the long-run multipliers with:

> multipliers(uecm_3132)

         term   estimate std.error t.statistic      p.value
1 (Intercept)  6.2856579 0.7719160    8.142930 6.107445e-10
2         LRY  0.9964676 0.1239310    8.040503 8.358472e-10
3         IBO -4.5381160 0.5202961   -8.722180 1.058619e-10
4         IDE  2.8915201 0.9950853    2.905801 6.009239e-03
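As a rough check on where these numbers come from, the point estimates can be reproduced by hand from the rounded UECM coefficients posted earlier in this thread: each long-run multiplier is the level coefficient divided by minus the coefficient of the lagged dependent variable. The sketch below is plain arithmetic on copied values, not a call into the ARDL package, so it agrees with the table only up to rounding and says nothing about the standard errors.

```r
# Rounded estimates copied from the summary(uecm_3132) output in this thread.
uecm_coef <- c(
  "(Intercept)" =  2.62019,
  "L(LRM, 1)"   = -0.41685,
  "L(LRY, 1)"   =  0.41538,
  "L(IBO, 1)"   = -1.89172,
  "L(IDE, 1)"   =  1.20534
)

# Coefficient of the lagged dependent variable (the adjustment coefficient).
adj <- unname(uecm_coef["L(LRM, 1)"])

# Long-run multiplier = level coefficient / (-adjustment coefficient).
lr <- uecm_coef[names(uecm_coef) != "L(LRM, 1)"] / (-adj)
names(lr) <- c("(Intercept)", "LRY", "IBO", "IDE")

# Point estimates reported by multipliers(uecm_3132) above, for comparison.
reported <- c("(Intercept)" = 6.2856579, LRY = 0.9964676, IBO = -4.5381160, IDE = 2.8915201)

print(round(cbind(by_hand = lr, reported = reported), 4))
```

The LRY, IBO and IDE values are also the ones Stata lists under LR.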

The differenced and lagged LRM terms (dependent variable) also seem to come from the regression results.
The differences of all the independent variables are indeed the short-run multipliers! Again, they are not a direct part of the regression but a function of the estimated parameters. This functionality will be added to the ARDL package in the next update and will be available through:

> multipliers(uecm_3132, type = "sr")
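One way to see how the Stata D1. values relate to the R regression, using only the numbers posted in this thread: if the level of a regressor is dated t instead of t-1, then pi * x[t-1] + w * dx[t] becomes pi * x[t] + (w - pi) * dx[t], so the coefficient on the first difference becomes w - pi. The sketch below applies that to the rounded R estimates and compares the result with the Stata output. It is an observation about these particular numbers, not a statement about how Stata computes them internally.

```r
# Rounded estimates copied from the summary(uecm_3132) output in this thread.
level_coef  <- c(LRY = 0.41538, IBO = -1.89172, IDE = 1.20534)  # L(x, 1) terms (pi)
impact_coef <- c(LRY = 0.67280, IBO = -1.07852, IDE = 0.12546)  # d(x) terms (w)

# Coefficient on d(x) once the level term is dated t rather than t-1: w - pi.
sr_levels_at_t <- impact_coef - level_coef

# "D1." coefficients copied from the SR block of the Stata `ec` output in this thread.
stata_d1 <- c(LRY = 0.2574193, IBO = 0.8132065, IDE = -1.079873)

print(round(cbind(derived = sr_levels_at_t, stata = stata_d1), 4))
```

The lagged differences (LD., L2D.) need no such adjustment: they already coincide with the d(L(., 1)) and d(L(., 2)) estimates in the R output.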

I hope my answer was not too confusing. Thank you once again for your interest, I hope you find the package helpful and intuitive. I would be glad to receive some feedback or suggestions about the package.

Best,

Kleanthis

profkenm commented 4 years ago

Thank you Kleanthis for your thorough explanation. Much appreciated.

Ken

Natsiopoulos / ARDL

issue with the EC parameterization #1