Sorry, this issue came in while I was away and slipped down my list.
Regarding your question: intervals 1 & 2 focus on the sampling variability of the mean, whereas intervals 3 & 4 focus on the overall distribution. There isn't a "correct" one; it just depends on what you want to show.
I should mention that I only provided the uncertainty regions to give users an idea of how many bootstraps were needed. I'm not sure I would report them directly in a manuscript.
I was thinking about using the uncertainty from bootstrapping as an alternative to the deterministic way of getting the standard error for the estimated alpha (formulas 3.2 and 3.6 in http://arxiv.org/pdf/0706.1062.pdf). Why would you rather not do that? And if I stick to the deterministic way, do you know how I can call the generalized zeta function in R?
Formulas 3.2 and 3.6 in the paper relate to the case of a given xmin. The bootstrap procedure takes into account the uncertainty in both alpha and xmin.

The VGAM package has a zeta function.
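For what it's worth, something along these lines should reproduce the deterministic standard error for a fixed xmin. Treat it as a sketch: I'm assuming VGAM::zeta's shift argument gives the Hurwitz (generalized) zeta, the derivatives with respect to alpha are taken numerically rather than from the package, and the step size h and the example numbers at the end are made up.

```r
## Sketch: standard error of alpha for a *fixed* xmin, following my reading
## of eqs. (3.2) and (3.6) in Clauset, Shalizi & Newman (2009).
## Assumes VGAM::zeta(s, shift = q) is the Hurwitz zeta; derivatives with
## respect to alpha are approximated by finite differences (tune h if needed).
library(VGAM)

## Continuous case, eq. (3.2): sigma = (alpha - 1) / sqrt(n)
se_continuous <- function(alpha, n) (alpha - 1) / sqrt(n)

## Discrete case, eq. (3.6); n is the number of observations >= xmin
se_discrete <- function(alpha, xmin, n, h = 1e-3) {
  z   <- zeta(alpha, shift = xmin)
  zp  <- (zeta(alpha + h, shift = xmin) -
          zeta(alpha - h, shift = xmin)) / (2 * h)      # first derivative
  zpp <- (zeta(alpha + h, shift = xmin) - 2 * z +
          zeta(alpha - h, shift = xmin)) / h^2          # second derivative
  1 / sqrt(n * (zpp / z - (zp / z)^2))
}

## Example with made-up numbers (alpha = 2.64, xmin = 1, 500 tail observations)
se_discrete(2.64, xmin = 1, n = 500)
```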
I've noticed that the estimated alpha is often above the mean or median of the bootstrapped alphas. Why is that? Is it because drawing from the original data with replacement tends to miss the extreme events?
Also, there are several ways to present the uncertainty of alpha after bootstrapping. For example, for one of my distributions with alpha = 2.64 (P>0.1), there are the following options for 1000 bootstraps:
- mean and standard error: 2.27 ± 0.00
- mean and 95% confidence interval: 2.27 ± 0.01
- mean and standard deviation: 2.27 ± 0.12
- median and [0.025, 0.975] percentiles: 2.25 [2.15, 2.63]
Is there a standard, or should one be preferred for statistical reasons?

If you ask me, it's a judgement call...
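For concreteness, something like the following should compute those summaries from a poweRlaw bootstrap run, and also lets you compare the point estimate with the bootstrap centre. It's only a sketch: x stands in for your data vector, and I'm assuming the point estimate and the bootstrapped exponents sit in est$pars and bs$bootstraps$pars respectively.

```r
## Sketch: summarising bootstrapped alphas from poweRlaw.
## x is assumed to be your (discrete) data vector.
library(poweRlaw)

m   <- displ$new(x)                    # discrete power-law object
est <- estimate_xmin(m)                # joint estimate of xmin and alpha
m$setXmin(est)
bs  <- bootstrap(m, no_of_sims = 1000) # bootstraps both xmin and alpha

alpha_hat  <- est$pars                 # point estimate of alpha
boot_alpha <- bs$bootstraps$pars       # bootstrapped alphas

## Point estimate vs. bootstrap centre
c(estimate = alpha_hat, mean = mean(boot_alpha), median = median(boot_alpha))

## The four presentation options
se <- sd(boot_alpha) / sqrt(length(boot_alpha))
c(mean = mean(boot_alpha), se = se)              # mean and standard error
mean(boot_alpha) + c(-1, 1) * 1.96 * se          # 95% CI of the mean
c(mean = mean(boot_alpha), sd = sd(boot_alpha))  # mean and standard deviation
quantile(boot_alpha, c(0.025, 0.5, 0.975))       # median and percentiles
```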