mpiktas / midasr

R package for mixed frequency time series data analysis.
http://mpiktas.github.io/midasr/

Understanding the forecasts calculated by `select_and_forecast` #37

Closed DataMinerR closed 10 years ago

DataMinerR commented 10 years ago

Pages 23-24 of the user's guide demonstrate the function select_and_forecast. Among other things, this function calculates forecasts according to the supplied specifications. The example in the guide calculates one-, two- and three-step-ahead out-of-sample forecasts 50 times.

To check my understanding, I tried to calculate the forecasts "manually" using the suggested models (in each case the first of the suggested models for that horizon).

I managed to reproduce the first forecast value for each horizon:

Preparation of the data for the first forecasts:

yy<-y[1:200]
ttrend<- trend[1:200]
xx<-x[1:800]
zz<-z[1:2400]

Calculate the one-step-ahead forecast by hand with cbfc$bestlist[[1]][[1]]:

m<-midas_r(yy ~ ttrend + mls(xx, 4:18, 4, nealmon) + mls(zz, 12:25, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(m, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[1]]$forecast[1,1],8)
TRUE

Calculate the two-step-ahead forecast by hand with cbfc$bestlist[[2]][[1]]:

mm<-midas_r(yy ~ ttrend + mls(xx, 8:21, 4, nealmon) + mls(zz, 24:38, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(mm, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[2]]$forecast[1,1],8)
TRUE

Calculate the three-step-ahead forecast by hand with cbfc$bestlist[[3]][[1]]:

mmm<-midas_r(yy ~ ttrend + mls(xx, 12:25, 4, nealmon) + mls(zz, 36:46, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(mmm, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[3]]$forecast[1,1],8)
TRUE

Expand the data by one low-frequency period in order to calculate the next forecasts:

yye<-y[1:201]
ttrende<- trend[1:201]
xxe<-x[1:804]
zze<-z[1:2412]

Let's try to calculate the second three-step-ahead forecast using the expanded data:

mmme<-midas_r(yye ~ ttrende + mls(xxe, 12:25, 4, nealmon) + mls(zze, 36:46, 12, nealmon),start=list(xxe=rep(1,3),zze=rep(1,3)))
forecast(mmme, newdata = list(xxe = rep(NA, 4), zze = rep(NA, 12),ttrende = 202))
21.94055

As can be seen, the last command returns 21.94055, but select_and_forecast gives cbfc$forecasts[[3]]$forecast[2,1] = 21.96188.

What am I doing wrong here?

One more question:

In the case of three-step-ahead forecasts, does select_and_forecast calculate forecasts for the periods 201-250 or for the periods 203:252? The first forecast is calculated using the first 200 observations, so I would expect the first three-step-ahead forecast to be for period 203. If the forecasts are instead for 201-250, the forecasting procedure would have to start at period 198, wouldn't it?

Thank you in advance!

vzemlys commented 10 years ago

What data are you using?

DataMinerR commented 10 years ago

I am using the simulated data from the guide on page 12, just exactly as given in the guide. So I only replicated the results on page 23...

vzemlys commented 10 years ago

Which version of the package are you using? You can see the version with the sessionInfo() command at the R prompt.

DataMinerR commented 10 years ago

It is: midasr_0.1

vzemlys commented 10 years ago

Ok, will look into that. I suspect that there might be an issue with either the starting values or the sample used in estimation. Note that the difference is not that large.

DataMinerR commented 10 years ago

Right, I have also noticed that the difference is small. The function updates the parameters every time it generates a new out-of-sample forecast, doesn't it? I had also wondered whether the function keeps using the same set of parameters that were estimated at the beginning, as shown in the bestlist...

vzemlys commented 10 years ago

Yes, the parameters are updated. To be more precise, when forecasting, the parameters from the fitted model are used as starting values, and the model is fitted again with the in-sample provided. I will investigate more.

vzemlys commented 10 years ago

Since you are increasing the in-sample, you are doing a recursive forecast. The default option is the fixed forecast, i.e. the coefficients estimated on the same in-sample are used for all the forecasts. Use ftype="recursive" and you will get the desired behaviour, i.e. the forecasts will match.
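
The difference between the two schemes can be sketched in base R. This is only an illustration under simple assumptions: it uses lm on hypothetical toy data rather than midas_r on the guide's data, and the names d, insample, outsample, fc_fixed and fc_recursive are made up for the example. It shows why the first forecast agrees under both schemes while later ones drift apart slightly.

```r
# Toy data: a linear trend plus noise (hypothetical, not the guide's data).
set.seed(1)
n <- 60
d <- data.frame(t = 1:n)
d$y <- 2 + 0.1 * d$t + rnorm(n)

insample  <- 1:50
outsample <- 51:60

# Fixed scheme: estimate once on the in-sample and reuse the coefficients
# for every out-of-sample forecast.
fit_fixed <- lm(y ~ t, data = d[insample, ])
fc_fixed  <- unname(predict(fit_fixed, newdata = d[outsample, ]))

# Recursive scheme: before each forecast, re-estimate on all observations
# that precede the forecast period (an expanding window).
fc_recursive <- sapply(outsample, function(i) {
  fit_i <- lm(y ~ t, data = d[1:(i - 1), ])
  unname(predict(fit_i, newdata = d[i, ]))
})

# The first forecast uses the same estimation sample in both schemes,
# so it coincides; the later forecasts generally do not.
all.equal(fc_fixed[1], fc_recursive[1])
max(abs(fc_fixed[-1] - fc_recursive[-1]))
```

Extending the data by one period and refitting, as in the manual calculation above, corresponds to the second scheme in this sketch.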

vzemlys commented 10 years ago

Please close the issue, if you are able to replicate the behaviour with ftype="recursive".

DataMinerR commented 10 years ago

Great, it works now! Concerning my last question, does the function calculate three-step forecasts for the periods 201-250, or for the periods 203:252?

Thanks a lot!

vzemlys commented 10 years ago

The forecasts are for the periods 201-250, since otherwise it would not be possible to calculate out-of-sample fit statistics, i.e. there are no values of y available for periods 251:253.
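
The timing can be checked with a little index arithmetic. The sketch below assumes the alignment convention described in the user's guide, where lag 0 of a high-frequency variable is its last observation within low-frequency period t, so lag k sits at position m * t - k. The helper hf_index is hypothetical and not part of midasr.

```r
# Position in the high-frequency series that lag k refers to when the
# target is low-frequency period t and there are m high-frequency
# observations per low-frequency period (position m * t is lag 0).
hf_index <- function(t, k, m) m * t - k

# Three-step-ahead specification from the thread: mls(xx, 12:25, 4, nealmon)
t_target <- 201
idx <- hf_index(t_target, k = 12:25, m = 4)

range(idx)             # 779 792
ceiling(max(idx) / 4)  # 198
```

Under this assumption, the three-step-ahead forecast for period 201 only uses x up to position 792, which is the end of low-frequency period 198. The horizon is therefore built into the lag structure of the model rather than into a shift of the target periods.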