Open edvinf opened 3 years ago
@edvinf A somewhat similar issue was raised at WKRDB-EST2
In simpler cases, the inclusion probabilities needed for variance estimation can be calculated from sample size and population size. However, more complex joint inclusion probabilities are required to estimate variance for some designs, e.g., unequal probability designs. These are not currently incorporated into the RDBES format. They take the form of matrices of joint inclusion probabilities for units within a sample and so are not easy to incorporate in the model.
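To illustrate the simple case, here is a minimal sketch assuming simple random sampling without replacement (SRSWOR) of n units from a population of N. The design and the function names are assumptions for illustration only and are not part of the RDBES. Under that assumption both the first-order and the pairwise joint inclusion probabilities follow directly from the two sizes:

```python
def srswor_inclusion_prob(n, N):
    """First-order inclusion probability pi_i = n / N (SRSWOR assumed)."""
    if not 0 < n <= N:
        raise ValueError("require 0 < n <= N")
    return n / N


def srswor_joint_inclusion_matrix(n, N):
    """n x n matrix of joint inclusion probabilities for the sampled units.

    Diagonal entries hold pi_i = n / N; off-diagonal entries hold
    pi_ij = n (n - 1) / (N (N - 1)). Both formulas are valid only for SRSWOR.
    """
    pi = srswor_inclusion_prob(n, N)
    pij = n * (n - 1) / (N * (N - 1)) if N > 1 else 0.0
    return [[pi if i == j else pij for j in range(n)] for i in range(n)]


# Hypothetical example: 3 units sampled from a population of 10.
matrix = srswor_joint_inclusion_matrix(n=3, N=10)
```

For unequal probability designs no such closed form in n and N is generally available, which is why the matrix would have to be supplied or computed separately.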
The WKRDB-EST2 discussion went along the lines of proposing that
such complex joint inclusion probabilities are not, for now, incorporated into the RDBES data model. Rather, institutes that require these more complicated analyses should be advised to import the probabilities into R in a separate format for the estimation, or to calculate them from other imported information.
The above is something you can consider in the intersessional work on selection method.
Analysis done at WKRDB-EST2 indicates documentation to be added
We have decided to pass this to the selection method subgroup. It would be ideal but not totally necessary to resolve this before the data call. (In general joint inclusion probabilities will need to be dealt with separately in the estimation stage.)
The algorithms for calculating joint inclusion probabilities differ between probabilistic sampling schemes that are not differentiated by the current codes for selectionMethod. Pairwise joint inclusion probabilities are needed for variance estimation with Horvitz-Thompson estimators.
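To show where the pairwise probabilities enter, the following is a minimal sketch of the standard Horvitz-Thompson variance estimator for an estimated total. It assumes a without-replacement design with all joint inclusion probabilities strictly positive; the function names and the input layout (a square matrix with first-order probabilities on the diagonal) are hypothetical and not an RDBES format.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate of the population total."""
    return sum(yi / p for yi, p in zip(y, pi))


def ht_variance_estimate(y, joint):
    """Horvitz-Thompson variance estimate of the estimated total.

    y     : observed values for the sampled units
    joint : square matrix with joint[i][j] = pi_ij and joint[i][i] = pi_i
    """
    n = len(y)
    pi = [joint[i][i] for i in range(n)]
    var = 0.0
    for i in range(n):
        for j in range(n):
            pij = joint[i][j]
            if pij <= 0:
                raise ValueError("joint inclusion probabilities must be > 0")
            # Each pair contributes (pi_ij - pi_i pi_j) / pi_ij, scaled by
            # the inverse-probability-weighted values of the two units.
            var += (pij - pi[i] * pi[j]) / pij * (y[i] / pi[i]) * (y[j] / pi[j])
    return var


# Hypothetical example: SRSWOR with n = 2 from N = 4,
# so pi_i = 1/2 and pi_ij = 2 * 1 / (4 * 3) = 1/6.
joint = [[0.5, 1 / 6], [1 / 6, 0.5]]
y = [1.0, 3.0]
total = ht_total(y, [joint[0][0], joint[1][1]])
estimate = ht_variance_estimate(y, joint)
```

In this example the estimate agrees with the textbook SRSWOR formula N²(1 − n/N)s²/n = 8, but the function itself only needs the matrix, which is the quantity that depends on the sampling scheme.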
We have previously discussed a related issue when considering whether inclusion probabilities could be calculated from selection probabilities or vice versa. We solved that by including both sampling probabilities, but this is impractical for pairwise joint inclusion probabilities, which require a vector for each sampling unit (or a matrix for a set of sampling units).
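As an illustration of why the conversion depends on the scheme, here is a minimal sketch assuming n independent draws with replacement, where a unit has the same selection probability p at every draw. This design is an assumption for the example, not one identified in this thread, and the function names are hypothetical.

```python
def inclusion_from_selection(p, n):
    """pi = 1 - (1 - p)^n: probability of being drawn at least once."""
    return 1.0 - (1.0 - p) ** n


def selection_from_inclusion(pi, n):
    """Inverse of the relation above; valid only under the same assumption."""
    return 1.0 - (1.0 - pi) ** (1.0 / n)


def joint_inclusion_with_replacement(p_i, p_j, n):
    """pi_ij for two distinct units, by inclusion-exclusion over n draws."""
    return 1.0 - (1.0 - p_i) ** n - (1.0 - p_j) ** n + (1.0 - p_i - p_j) ** n


# Hypothetical example: two draws, selection probabilities 0.1 and 0.2.
pi_a = inclusion_from_selection(0.1, 2)
pi_ab = joint_inclusion_with_replacement(0.1, 0.2, 2)
```

Under a different scheme, for example unequal probability sampling without replacement, these formulas do not apply, so the same recorded selection probabilities would imply different joint inclusion probabilities.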
Two sampling strategies in use are identified, which have different algorithms for calculation of joint inclusion probabilities:
A possible solution would be to tighten the definition for the probabilistic selection methods so that they are restricted to case 1., and then add corresponding codes that are restricted to case 2.
This could also be considered in the intersessional work of WGCATCH, which is currently reviewing non-probabilistic selection methods.