stan-dev / rstan

RStan, the R interface to Stan
https://mc-stan.org
1.02k stars 264 forks source link

problem in fitting SV models using vb in rstan #1075

Closed cabantovalle closed 1 year ago

cabantovalle commented 1 year ago

[edit: formatted code, removed lines of #]

I am trying to fit the basic SV model using the vb function in rstan. I simulate a SV model using an r function

simsv<-function(mu,phi,sigma,gdim){

x=array(0,dim=gdim) 
y=array(0,dim=gdim) 

x[1]=mu+(sigma/sqrt(1-phi^2))*rnorm(1)
y[1]=exp(0.5*x[1])*rnorm(1)
for (j in 2:gdim){
x[j]=mu+phi*(x[j-1]-mu)+sigma*rnorm(1)
y[j]=exp(0.5*x[j])*rnorm(1)
}

  return (list(y=y,x=x))
}

simsv1=simsv(mu=0.005,phi=0.98,sigma=0.2,gdim=500)

This is stan model saved as sv.stan

data {
  int<lower=0> t_max;   // # time points (equally spaced)
  vector[t_max] y;      // mean corrected return at time t
}
parameters {
  real mu;                     // mean log volatility
  real<lower=-1,upper=1> phi;  // persistence of volatility
  real<lower=0> sigma;         // white noise shock scale
  vector[t_max] x;                 // log volatility at time t
}
model {
  mu ~ normal(0, 10);
  sigma ~ inv_gamma(5, 0.05);
  .5*(1+phi) ~ beta(20.0, 1.5);

  x[1] ~ normal(mu, sigma / sqrt(1 - phi * phi));
  for (t in 2:t_max)
    x[t] ~ normal(mu + phi * (x[t - 1] -  mu), sigma);
  for (t in 1:t_max)
    y[t] ~ normal(0, exp(x[t] / 2));
}

Then this is the fit using rstan

require(rstan)
modsv <- stan_model("sv.stan")
t_max=500
t_max=500
svdata<- list(t_max=t_max, y = simsv1$y)
fit_vb_sv <-vb(modsv,data = svdata,iter=10000)

mean(extract(fit_vb_sv,"mu")$mu)
mean(extract(fit_vb_sv,"phi")$phi)
mean(extract(fit_vb_sv,"sigma")$sigma)

As said before the true values for mu=0.005, phi=0.98 and sigma=0.2 and the following output:

Warning: Pareto k diagnostic value is 1.7. Resampling is disabled. Decreasing tol_rel_obj may help if variational algorithm has terminated prematurely. Otherwise consider using sampling instead.
> mean(extract(fit_vb_sv,"mu")$mu)
[1] -0.03620665
> mean(extract(fit_vb_sv,"phi")$phi)
[1] 0.1489165
> mean(extract(fit_vb_sv,"sigma")$sigma)
[1] 1.089539

Any idea why the code does not give me the correct results?

Best regards,

Carlos Abanto

bob-carpenter commented 1 year ago

There are two potential problems.

The k-hat value of 1.7 is telling you that the normal approximation is not a good fit to your posterior.

Two things typically go wrong:

  1. The posterior is not roughly normal and the variational approximation is poor. This is what Pareto k being 1.7 is telling you. This can be because of high correlation in the case of a "mean field" (diagonal) approximation.

  2. The optimization doesn't find the best variational fit.

You can reduce step size and increase number of steps to test if (2) is the problem.

This isn't a bug report or issue, so I'm going to close this issue.

SteveBronder commented 1 year ago

@cabantovalle for this type of help I'd suggest posting on the Stan discourse

https://discourse.mc-stan.org/