mpiktas / midasr

R package for mixed frequency time series data analysis.
http://mpiktas.github.io/midasr/

Expand Stochastic Optimization Methods #53

Closed jarser closed 1 year ago

jarser commented 7 years ago

Simulated Annealing or Metropolis-derived methods shouldn't be too hard to implement. I'm aware you can use Ofunction = "optim", method = "SANN", but in my testing that yields a radically different result every time I run it due to a lack of customization options. Would you consider expanding the stochastic optimization options in the future?

vzemlys commented 7 years ago

Are you aware that you can pass all the customisation options for optim in the call to midas_r?
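For instance, the method and control arguments are forwarded to optim, so a SANN run can be tuned (and made reproducible by fixing the seed) directly in the midas_r call. A minimal sketch, assuming the simulated y, x and theta_h0 from the examples further down this thread; the control values are arbitrary illustrations:

set.seed(123)  # SANN is stochastic, fix the seed for reproducible results
mr_sann <- midas_r(y ~ fmls(x, 4*12 - 1, 12, theta_h0) - 1,
                   list(y = y, x = x),
                   start = list(x = c(-0.1, 10, -10, -10)),
                   Ofunction = "optim", method = "SANN",
                   control = list(maxit = 50000, temp = 10))
coef(mr_sann)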

jarser commented 7 years ago

Are you aware that you can pass all the customisation options for optim in the call to midas_r?

Sure!

But what I mean is that there's only one stochastic optimization algorithm available, compared to a plethora of deterministic methods. For example, GenSA could yield much better results than Ofunction = "optim", method = "SANN" and allows for more customization (paper).

Considering that most MIDAS regression problems involve very few parameters, implementing a robust stochastic optimization algorithm with lots of simulations should add very little computation time in this context.

vzemlys commented 7 years ago

The midasr package is written in a way that lets you use any optimisation method you want. Here is an example of how to use GenSA.

library(midasr)
library(GenSA)

#The parameter function and simulated data (same setup as in the example further below)
theta_h0 <- function(p, dk) {
  i <- (1:dk - 1)/100
  pol <- p[3]*i + p[4]*i^2
  (p[1] + p[2]*i)*exp(pol)
}
theta0 <- theta_h0(c(-0.1, 10, -10, -10), 4*12)
xx <- ts(arima.sim(model = list(ar = 0.6), 600 * 12), frequency = 12)

y <- midas_sim(500, xx, theta0)
x <- window(xx, start = start(y))

#We need some sort of fit to generate the necessary data
mr <- midas_r(y ~ fmls(x, 4*12 - 1, 12, theta_h0) - 1, list(y = y, x = x),
              start = list(x = c(-0.1, 10, -10, -10)))

#Now use GenSA (four parameters, so four bounds)
res <- GenSA(mr$start_opt, mr$fn0,
             lower = rep(-Inf, 4), upper = rep(Inf, 4))

#Compare the fits
mr$fn0(coef(mr))
mr$fn0(res$par)

The object returned by midas_r contains the objective function of the MIDAS non-linear optimisation problem (fn0), which you can then use with any optimisation package you want. If you look at the midas_r.fit code you can see that it is very easy to add additional optimisation functions.

I could add an option for GenSA in the code, but then I have a question. GenSA has mandatory arguments lower and upper, while MIDAS is formulated as an unconstrained optimisation problem. So the question is whether supplying rep(-Inf, length(coef(mr))) and rep(Inf, length(coef(mr))) by default would be acceptable or not?
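In code, that default would amount to something like the following (illustration only, reusing the mr object from the example above; res_default is just an illustrative name):

res_default <- GenSA(mr$start_opt, mr$fn0,
                     lower = rep(-Inf, length(coef(mr))),
                     upper = rep(Inf, length(coef(mr))))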

jarser commented 7 years ago

I could add an option for GenSA in the code, but then I have a question. GenSA has mandatory arguments lower and upper, while MIDAS is formulated as an unconstrained optimisation problem. So the question is whether supplying rep(-Inf, length(coef(mr))) and rep(Inf, length(coef(mr))) by default would be acceptable or not?

Sorry for the long delay in answering this.

I have been testing a variation of the code above and, in general, if I set rep(-Inf, length(coef(mr))) and rep(Inf, length(coef(mr))) as the lower and upper limits, GenSA needs an absurd number of iterations to produce meaningful results. However, if you have no idea what kind of restriction or initial values to use, but can provide some "reasonable" yet fairly wide bounds and enough iterations, you can quickly zero in on some really nice values.

See example below:

remove(list = ls(all = TRUE))

library(midasr)
library(GenSA)

##The parameter function
theta_h0 <- function(p, dk) {
  i <- (1:dk-1)/100
  pol <- p[3]*i + p[4]*i^2
  (p[1] + p[2]*i)*exp(pol)
}

##Generate coefficients
theta0 <- theta_h0(c(-0.1,10,-10,-10),4*12)

##Plot the coefficients
plot(theta0)

##Generate the predictor variable, leave 4 low frequency lags of data for burn-in.
xx <- ts(arima.sim(model = list(ar = 0.6), 600 * 12), frequency = 12)

##Simulate the response variable
y <- midas_sim(500, xx, theta0)
x <- window(xx, start=start(y))

#We need some sort of fit to generate necessary data
mr <- midas_r(y~fmls(x,4*12-1,12,nbeta)-1, list(y=y,x=x), start=list(x=c(-1,1,1))) 

#Now use GenSA
res <- GenSA(mr$start_opt, mr$fn0, lower=c(-50,-50,-50), upper=c(50,50,50), control=list(maxit=10000))

#Compare the fits

mr$fn0(coef(mr))
mr$fn0(res$par)

mr$coefficients
res$par

sv_GenSA <- list(x=res$par)
mr1 <- midas_r(y~fmls(x,4*12-1,12,nbeta)-1, list(y=y,x=x), start=sv_GenSA)
summary(mr)
summary(mr1)

plot(mr$midas_coefficients)
plot(mr1$midas_coefficients)

hAh_test(mr)
hAh_test(mr1)

hAhr_test(mr)
hAhr_test(mr1)

Note that if you replace start=list(x=c(-1,1,1)) with start=list(x=c(1,1,1)), you should be able to reproduce the GenSA results using BFGS. But in general, a deterministic algorithm won't be able to recover from really bad starting points.
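For reference, the fit with those better starting values would look like this (a sketch reusing the data from the example above; mr2 is just an illustrative name):

mr2 <- midas_r(y ~ fmls(x, 4*12 - 1, 12, nbeta) - 1, list(y = y, x = x),
               start = list(x = c(1, 1, 1)))
mr2$fn0(coef(mr2))   # should be comparable to mr$fn0(res$par)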