Open bschneidr opened 2 years ago
Would also be good to include a vignette on choosing among different replication methods, and choosing the number of bootstrap replicates to use.
For the generalized bootstrap based on Beaumont & Patak (2012), the basic code is super simple:
```r
gen_boot_factors <- function(B, Sigma) {
  n <- nrow(Sigma)
  if (!isSymmetric.matrix(Sigma)) {
    stop("`Sigma` must be a symmetric matrix.")
  }
  # Draw B vectors of factors with mean 1 and covariance `Sigma`;
  # transpose so that rows are units and columns are replicates
  replicate_factors <- t(
    MASS::mvrnorm(n = B,
                  mu = rep(1, times = n),
                  Sigma = Sigma,
                  empirical = FALSE)
  )
  # If any factor is negative, shrink all factors toward 1
  if (any(replicate_factors < 0)) {
    rescaling_constant <- max(1 - replicate_factors)
    rescaled_replicate_factors <- (replicate_factors + (rescaling_constant - 1)) / rescaling_constant
  } else {
    rescaling_constant <- 1
    rescaled_replicate_factors <- replicate_factors
  }
  # Keep the rescaling constant so variance estimates can be adjusted later
  attr(rescaled_replicate_factors, 'tau') <- rescaling_constant
  return(rescaled_replicate_factors)
}
```
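The `tau` attribute matters downstream: rescaling shrinks each factor's deviation from 1 by a factor of `tau`, so a replicate variance estimate computed from the rescaled factors has to be multiplied by `tau^2` to compensate. Here is a minimal sketch of that identity for an estimated total. The data, weights, and factor matrix are made-up numbers chosen for illustration, not output from `gen_boot_factors()`.

```r
# Toy inputs (assumed for illustration): 4 sampled units, 3 replicates
y <- c(10, 20, 30, 40)  # survey variable
w <- c(2, 2, 3, 3)      # full-sample weights

# Rows are units, columns are replicates, matching gen_boot_factors()
factors <- matrix(c( 1.4, 0.9, 0.7,
                     0.6, 1.1, 1.3,
                    -0.2, 1.5, 1.7,
                     2.2, 0.5, 0.3),
                  nrow = 4, byrow = TRUE)

# Same rescaling as in the function above
tau <- max(1 - factors)
rescaled <- (factors + (tau - 1)) / tau

# Replicate estimates of the total under each set of factors
full_total <- sum(w * y)
rep_totals_raw <- colSums(w * y * factors)
rep_totals_rescaled <- colSums(w * y * rescaled)

# Variance estimates: the rescaled version needs the tau^2 multiplier
var_raw <- mean((rep_totals_raw - full_total)^2)
var_rescaled <- tau^2 * mean((rep_totals_rescaled - full_total)^2)

all.equal(var_raw, var_rescaled)
```

For a linear estimator like a total, the two variance estimates agree apart from rounding. Note that the smallest rescaled factor is zero only up to floating-point error, so it can come out as a tiny negative number.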
All of the bootstrap methods are looking good. Test coverage is back to 92%.
Only two issues to iron out:
`make_rwyb_bootstrap_weights()` may need a new argument, such as `inclusion_indicator`. It doesn't seem like `svydesign()` can accommodate all the information needed when a stage of the survey has nonresponse. Maybe a better interface would be something like the following:

```r
specify_design(
  sampling_stage(type = "PPSWOR", prob = "PSU_PROB", id = "PSU_ID", stratum = "FIRST_STAGE_STRATUM"),
  sampling_stage(type = "Poisson", prob = "PSU_RESP_PROB", id = "PSU_ID", response_indicator = "IS_RESPONDENT"),
  sampling_stage(type = "SRSWOR", prob = "SSU_PROB", id = "SSU_ID", stratum = "SECOND_STAGE_STRATUM")
)
```
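To make the proposed interface concrete, here is a rough sketch of how the two constructors might store what they are given. Neither `sampling_stage()` nor `specify_design()` exists in 'survey' or 'svrep'. The class names, the restriction to three stage types, and the validation rules are all assumptions made for illustration.

```r
# Hypothetical constructor for one stage; only records column names
sampling_stage <- function(type, prob, id, stratum = NULL,
                           response_indicator = NULL) {
  # Assumed set of supported stage types (taken from the example call)
  type <- match.arg(type, c("PPSWOR", "Poisson", "SRSWOR"))
  structure(
    list(type = type, prob = prob, id = id, stratum = stratum,
         response_indicator = response_indicator),
    class = "sampling_stage"
  )
}

# Hypothetical container: an ordered list of stages
specify_design <- function(...) {
  stages <- list(...)
  is_stage <- vapply(stages, inherits, logical(1), what = "sampling_stage")
  if (length(stages) == 0 || !all(is_stage)) {
    stop("All arguments must be created by `sampling_stage()`.")
  }
  structure(list(stages = stages), class = "design_spec")
}

design_spec <- specify_design(
  sampling_stage(type = "PPSWOR", prob = "PSU_PROB", id = "PSU_ID",
                 stratum = "FIRST_STAGE_STRATUM"),
  sampling_stage(type = "Poisson", prob = "PSU_RESP_PROB", id = "PSU_ID",
                 response_indicator = "IS_RESPONDENT"),
  sampling_stage(type = "SRSWOR", prob = "SSU_PROB", id = "SSU_ID",
                 stratum = "SECOND_STAGE_STRATUM")
)

# Flag the stages that model nonresponse rather than actual sampling
is_nonresponse_stage <- vapply(
  design_spec$stages,
  function(stage) !is.null(stage$response_indicator),
  logical(1)
)
```

Keeping nonresponse as its own stage object would let a bootstrap function treat it like any other Poisson stage, which is the information that seems hard to pass through `svydesign()`.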
Need to check that `make_rwyb_bootstrap_weights()` works correctly for three or more stages. Need to go through the Beaumont & Emond (2022) paper and make sure that the section on multistage sampling (three or more stages) is being implemented correctly in the package.

Some future updates that would be nice for the bootstrap methods:
Some other replication methods worth supporting:
Some ideas for replicate designs to support, not already supported by the 'survey' package.
To create these, could use an interface such as the following:
This way, if the user wants more details on a specific replication method, they can look at a function specific to that method (e.g., by calling `?successive_differences()` or `?pseudo_pop_boot()`). The actual replicate creation could be handled by helpers such as `create_success_differences_reps()`.
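One way this pattern could be wired together is sketched below. Everything here is hypothetical: the front-end name `make_replicates()`, the `rep_method` class, the `replicates` argument, and the second helper `create_pseudo_pop_boot_reps()` are invented for illustration. The helpers are placeholders that only report what they were asked to do; they do not build replicate weights.

```r
# Hypothetical method-specification functions. Their help pages would
# document each method; each object just names the helper that does the work.
successive_differences <- function() {
  structure(list(helper = "create_success_differences_reps"),
            class = "rep_method")
}
pseudo_pop_boot <- function(replicates = 100) {
  structure(list(helper = "create_pseudo_pop_boot_reps",
                 replicates = replicates),
            class = "rep_method")
}

# Placeholder helpers: a real version would create the replicate weights
create_success_differences_reps <- function(data, method) {
  list(method = "successive differences", n_units = nrow(data))
}
create_pseudo_pop_boot_reps <- function(data, method) {
  list(method = "pseudo-population bootstrap", n_units = nrow(data),
       replicates = method$replicates)
}

# Hypothetical front end: calls the helper named in the method object
make_replicates <- function(data, method) {
  stopifnot(inherits(method, "rep_method"))
  do.call(method$helper, list(data = data, method = method))
}

toy_data <- data.frame(id = 1:5, y = c(3, 1, 4, 1, 5))
result <- make_replicates(toy_data, method = pseudo_pop_boot(replicates = 50))
```

With this split, method-specific arguments live on the method function, so the front end's signature stays the same as new replication methods are added.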