SMAC-Group / gmwm

Generalized Method of Wavelet Moments (GMWM) is an estimation technique for the parameters of time series models. It uses the wavelet variance in a moment matching approach that makes it particularly suitable for the estimation of certain state-space models.
Other
28 stars 14 forks source link

Robust WV CI #91

Open stephaneguerrier opened 9 years ago

stephaneguerrier commented 9 years ago

The CI for the WV are off. If the efficiency is 60% the robust CI at the 95% confidence level should be about 30% (100_( diff(c(-2,2)_sqrt(1/0.6))/diff(c(-2,2)) - 1)) wider then the classical. In our code, it is about 2 times wider for the first scale... Example:

x = rnorm(1000) wv1 = wvar(x) wv2 = wvar(x, robust = T) (wv2$ci_high - wv2$ci_low)/(wv1$ci_high - wv1$ci_low) [1] 2.019610 1.788926 1.540864 1.394738 1.257274 1.232275 1.156168 1.320823 1.350617

coatless commented 9 years ago

This is a theory issue that @robertomolinari will need to address as the current implementation was built during the 24 hour stretch for the ASA competition.

This is also a duplicate of #67...

So, I'm closing this out. Please keep everything in that issue.

stephaneguerrier commented 9 years ago

There is no theory issue here. This should be fixed quickly. Here is a simple code that should do the trick:

length.CI = function(mu, down, up, alpha = 0.05, eff = 0.6){
    q1 = qnorm(1-alpha/2)
    dist.up = up - mu
    dist.dw = mu - down 
    coef = diff(c(-q1,q1)*sqrt(1/eff))/diff(c(-q1,q1)) 
    down.new = mu - dist.dw*coef
    up.new = mu + dist.up*coef
    c(down.new, up.new)
}

Example:

x = rnorm(100)
wv = wvar(x)
wv
     Variance      Low CI    High CI
2  0.44268253 0.309408409  0.6857198
4  0.18659132 0.114018232  0.3596781
8  0.07397425 0.037704516  0.2056720
16 0.04126040 0.016425970  0.2301369
32 0.02590542 0.007252052  0.8143190
64 0.02163869 0.004307161 22.0337713
length.CI(0.44268253, 0.309408409, 0.6857198)
[1] 0.2706264 0.7564423
coatless commented 9 years ago

This seems like a hack that doesn't address the underlying issue with the code. From #67, the code that is problematic is:

eff = sqrt(sqrt(eff));
double eta3 = std::max(dims(i)/pow(2,i+1),1.0);
out(i,1) = eff * eta3 * y(i)/(R::qchisq(1-alpha_ov_2, eta3, 1, 0)); // Lower CI
out(i,2) = eta3 * y(i)/(eff*R::qchisq(alpha_ov_2, eta3, 1, 0)); // Upper CI
arma::vec wav_coef = sort(signal_modwt_bw(i));
y(i) = sig_rob_bw(wav_coef, eff);

Sorting the elements in the brickwall modwt is done by ascending values e.g. A to Z or from 0 to 9 not descending e.g. Z to A or from 9 to 0

sig_rob_bw obtains the robust WV at that specific point. (These match Roberto's supplied code).

coatless commented 9 years ago

See 519307b for implementation.

coatless commented 9 years ago

Test code:


length.CI = function(wv_robust, wv_class, down, up, alpha = 0.05, eff = 0.6){
  q1 = qnorm(1-alpha/2)
  dist.up = (up - wv_class)/wv_class
  dist.dw = (wv_class - down)/wv_class 
  coef = diff(c(-q1,q1)*sqrt(1/eff))/diff(c(-q1,q1)) 
  down.new = wv_robust - dist.dw*coef*wv_robust
  up.new = wv_robust + dist.up*coef*wv_robust
  c(down.new, up.new)
}

adj.ci = function(wvar.cl, wvar.rob){

  low = wvar.cl$ci_low
  high = wvar.cl$ci_high

  mu = wvar.rob$variance
  mu2 = wvar.cl$variance
  alpha = wvar.rob$alpha
  eff = wvar.rob$eff

  n = length(mu)

  low_rob = rep(NA,n)
  high_rob = low_rob

  for (i in 1:n){
    inter = length.CI(mu[i], mu2[i], low[i], high[i], alpha = alpha, eff = eff)

    if (inter[1] < 0){
      low_rob[i] = mean(c(0,low[i]))
    }else{
      low_rob[i] = inter[1]
    }

    high_rob[i] = inter[2]
  }
  out = list(low = low_rob, high = high_rob)
  out
}

N1 = 1000

data.ar = gen.gts(AR1(phi = .92, sigma2=0.01), N1)

alpha= 0.05
data.ar$data[sample(1:N1,round(N1*alpha))] = rnorm(round(N1*alpha),0,0.78)

wvar1 = wvar(data.ar)

wvar2 = wvar(data.ar, robust = T)

wvar3 = wvar2

interval = adj.ci(wvar1,wvar2)
wvar3$ci_low = interval$low
wvar3$ci_high = interval$high
compare.wvar(wvar1,wvar3, split = F)
compare.wvar(wvar1,wvar2, split = F)

wvar3
wvar2
coatless commented 8 years ago

Changed:

lci = wv_ri - lci*coef*wv_ri;
if (inter[1] < 0){
  low_rob[i] = lci
} else {
  low_rob[i] = 2.2204460492503131e-16 # no longer uses wv_ci_low / 2
}

This problem should fix the point outside of the Robust WV.

Change approved by @robertomolinari .

stephaneguerrier commented 8 years ago

Hi James,

It might be a bad idea to use low_rob[I] = 0 as we use a log-log plot to represent the data… What about low_rob[I] = wvar[I]/2? Any other ideas?

Best,

Stef

coatless commented 8 years ago

0 in this case is the smallest double possible DBL_EPSILON = 2.2204460492503131e-16 (e.g. how we code zero for parameter estimates to avoid initialization issues ).

Zeroing under this method encloses the robust WV estimate (unlike previously).

coatless commented 8 years ago

Robust WV is still not ideal under above change as it elongates the graph (repro by @Wenchao-Yang ):

library("gmwm")
data(imu6, package="imudata")
test1 = imu(imu6, gyros = 1:3, accels = 4:6, axis = c('X', 'Y', 'Z'), freq = 100)
wv1 = wvar(test1)

test2 = imu(imu6, gyros = 1, accels = 3:4, axis = c('X','X','Y'), freq = 100)
wv2 = wvar(test2, robust = T)
compare.wvar(wv1, wv2)

screen shot 2016-05-19 at 3 40 06 pm

Possible fix:

Hide the last scale on the graph? (Apparently it is customary in WV literature to only show J-1 scales. CC: @robertomolinari )

coatless commented 8 years ago

How would you folks feel if we brought back:

double eff_mod = sqrt(sqrt(eff));
double eta3 = std::max(dims(i)/pow(2,i+1),1.0); 
double lci = eff_mod * eta3 * wv/(R::qchisq(1-alpha_ov_2, eta3, 1, 0)); // Lower CI

If the LCI is less than 0?

The proposal is basically to bring back the original modified chi-squared intervals for that LCI < 0.

coatless commented 8 years ago

New approach: 755079b

Note: We aim to remove the last scale from estimation in a future build. Hence, there will be a J-1 scales...