Pipe-Vash opened 2 years ago
I got roughly the same coverage. I came here thinking I had done something wrong.
I agree with the comments above.
@Pipe-Vash If you want to verify that you get the correct asymptotic coverage without having to generate millions of samples ($N \approx 10^6$), you can sample $Y$ from a Gaussian with smaller variance $\sigma^2 < 1$. Then check coverage by testing whether the bootstrap intervals contain the true skewness of a $\text{LogNormal}(\mu=0, \sigma^2)$ distribution, as in the sketch below.
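Concretely, something like this (a minimal sketch: `lognormal_skewness` is a name I made up, and $\sigma^2 = 0.25$ is just one possible choice):

```python
import numpy as np

def lognormal_skewness(sigma2):
    """True skewness of LogNormal(mu, sigma^2); it does not depend on mu."""
    s = np.exp(sigma2)
    return (s + 2.0) * np.sqrt(s - 1.0)

# With a smaller sigma^2 the tail of X = e^Y is lighter, so the sample
# skewness stabilises at much smaller sample sizes than with sigma^2 = 1.
sigma2 = 0.25
rng = np.random.default_rng(0)
y = rng.normal(0.0, np.sqrt(sigma2), size=50)
x = np.exp(y)                      # X = e^Y ~ LogNormal(0, sigma2)
theta_true = lognormal_skewness(sigma2)
print(f"true skewness for sigma^2={sigma2}: {theta_true:.4f}")
```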
1- I don't think that the "fourth method" the exercise asks for is the parametric bootstrap, but rather the jackknife confidence interval that is explained in the appendix of the chapter.
2- The second part of the exercise asks for "the true coverage", i.e. the percentage of intervals that contain the true value of the statistic (in this case the skewness parameter). My way of solving this is the following (excuse me if my code is not very fluent); see the sketch after the derivation below.
According to theory, the skewness parameter of $X=e^Y$ is $\theta=T(F)=\int (x-\mu_X)^{3}\,{\rm d}F(x)/\sigma_X^3=\frac{E[X-E(X)]^3}{\left(E[X-E(X)]^2\right)^{3/2}}=\frac{e^{3/2}(2-3e+e^3)}{e^{3/2}(e-1)^{3/2}}=\frac{2-3e+e^3}{(e-1)^{3/2}} \approx 6.18$. This is the value we expect to fall within our confidence intervals (not only for each method, but also for each simulated random vector). The "true coverage" estimation then proceeds as follows:
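In outline, my simulation is the following (a minimal sketch: the helper names and the sizes `n_sim` and `B` are illustrative choices of mine; the jackknife interval uses the standard jackknife variance estimate):

```python
import numpy as np
from scipy.stats import skew, norm

# True skewness of LogNormal(0, 1); matches (2 - 3e + e^3)/(e - 1)^{3/2} ~ 6.18.
TRUE_SKEW = (np.e + 2) * np.sqrt(np.e - 1)

def bootstrap_cis(x, B=1000, alpha=0.05):
    """Normal, percentile, and pivotal bootstrap CIs for the skewness."""
    n = len(x)
    theta_hat = skew(x)
    rng = np.random.default_rng()
    boot = np.array([skew(rng.choice(x, size=n, replace=True)) for _ in range(B)])
    se = boot.std(ddof=1)
    z = norm.ppf(1 - alpha / 2)
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return {
        "normal":     (theta_hat - z * se, theta_hat + z * se),
        "percentile": (lo, hi),
        "pivotal":    (2 * theta_hat - hi, 2 * theta_hat - lo),
    }

def jackknife_ci(x, alpha=0.05):
    """Jackknife CI: theta_hat +/- z * se_jack, with the usual jackknife variance."""
    n = len(x)
    theta_hat = skew(x)
    loo = np.array([skew(np.delete(x, i)) for i in range(n)])   # leave-one-out
    se_jack = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    z = norm.ppf(1 - alpha / 2)
    return (theta_hat - z * se_jack, theta_hat + z * se_jack)

def coverage(n=50, n_sim=500, B=1000):
    """Fraction of simulated datasets whose interval contains TRUE_SKEW."""
    hits = {m: 0 for m in ("normal", "percentile", "pivotal", "jackknife")}
    rng = np.random.default_rng(0)
    for _ in range(n_sim):
        x = np.exp(rng.normal(0.0, 1.0, size=n))   # X = e^Y, Y ~ N(0, 1)
        cis = bootstrap_cis(x, B=B)
        cis["jackknife"] = jackknife_ci(x)
        for m, (lo, hi) in cis.items():
            hits[m] += lo <= TRUE_SKEW <= hi
    return {m: h / n_sim for m, h in hits.items()}

print(coverage(n=50))
```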
The output of the simulation: this means that our $n=50$ data points are not enough for the bootstrap 95% confidence intervals to reach the nominal coverage...
With $n=1000$ data points the coverage gets better:
With $n=100000$ data points the coverage gets closer to the expected 95%.
Assuming that my procedure is correct, I simply need more data points (which is more computationally expensive, which is why I stop reporting results here)...
I would appreciate some feedback to compare against.