Open emitra17 opened 5 years ago
The best reference I've found for this is Chapter 9 of Givens and Hoeting, Computational Statistics (2013). They call it "Bootstrapping Regression" (Section 9.2.3).
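Briefly, here is how it works: you fit the model once to the original data. Each point then has a best-fit estimate y_i_hat and an associated residual epsilon_i_hat = y_i - y_i_hat. A resampled data point is y_i_resampled = y_i_hat + epsilon_j_hat, where epsilon_j_hat is the residual of a point j drawn at random with replacement (so j can differ from i). You are "resampling the errors" while the x values stay fixed, and you refit on each resampled data set. Note that this scheme is usually called the residual bootstrap; the wild bootstrap proper keeps each residual at its own point and multiplies it by a random weight (for example ±1).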
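The residual resampling described above can be sketched roughly as follows. This is only an illustration under assumptions, not our implementation: it assumes a straight-line model fitted by ordinary least squares, and the names `fit_line` and `residual_bootstrap` are hypothetical helpers made up for this example.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residual_bootstrap(x, y, n_boot=1000, seed=0):
    """Return n_boot refitted (a, b) pairs; x is held fixed throughout."""
    rng = random.Random(seed)

    # Fit once on the original data to get fitted values and residuals.
    a, b = fit_line(x, y)
    y_hat = [a + b * xi for xi in x]
    resid = [yi - yh for yi, yh in zip(y, y_hat)]

    fits = []
    for _ in range(n_boot):
        # Draw residuals with replacement and attach them to the fitted values.
        eps = rng.choices(resid, k=len(resid))
        y_star = [yh + e for yh, e in zip(y_hat, eps)]
        # Refit on the synthetic responses (the x values never change).
        fits.append(fit_line(x, y_star))
    return fits
```

The spread of the returned slopes and intercepts then serves as the uncertainty estimate. Because x is never resampled, every synthetic data set keeps the original design points, which is the main difference from resampling (x, y) pairs.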
Our current bootstrapping implementation is non-parametric bootstrapping of the data points themselves (the normal kind), which works under the assumption that the (x, y) pairs are drawn independently from a common distribution.
A colleague suggested adding an option for the "wild bootstrap", which assumes that the independent variable sits at fixed values and that only the dependent variable is drawn from a distribution.