AngusMcLure / PoolPoweR

Power and sample size calculations for surveys using pool testing (AKA group testing)
GNU General Public License v3.0

Input argument definitions #10

Closed fredjaya closed 7 months ago

fredjaya commented 8 months ago

Reference issue for documenting all input arguments in fisher_information.R and optimise_prevalence.R

| Argument name | Greek/Math notation (local) | Definition | Accepted inputs |
| --- | --- | --- | --- |
| pool_size | s | The number of units per pool. Must be a numeric value greater than or equal to 0. | int >= 0 |
| pool_number | N | The number of pools per cluster. Must be a numeric value greater than or equal to 0. | int >= 0 |
| prevalence | theta | The proportion of units that carry the marker of interest (i.e. true positive). Must be a numeric value between 0 and 1, inclusive of both. | num [0-1] |
| cost_unit | c_u | The cost to process a single unit. Must be a numeric value greater than or equal to 0. | num >= 0 |
| cost_pool | c_p | The cost to process a single pool. Must be a numeric value greater than or equal to 0. | num >= 0 |
| cost_cluster | c_c | The cost to process a cluster. Must be a numeric value greater than or equal to 0. [For optimise_s_prevalence only, ignored if correlation is NA] | num >= 0 or NULL/NA? |
| correlation | rho | The correlation between test results within a single cluster (units in different clusters are assumed to be uncorrelated). Must be a numeric value between 0 and 1, inclusive of both. A value of 1 indicates that units within clusters are perfectly correlated (there are no differences between units within a single cluster). A value of 0 indicates that units within clusters are no more correlated than units in different clusters. [optimise_s_prevalence only: if NA, assumes that the survey uses a simple random sample and not cluster sampling] | num [0-1] or NA |
| sensitivity | varphi | The probability that the test correctly identifies a true positive. Must be a numeric value between 0 and 1, inclusive of both. A value of 1 indicates that the test can perfectly identify all true positives. | num [0-1] |
| specificity | psi | The probability that the test correctly identifies a true negative. Must be a numeric value between 0 and 1, inclusive of both. A value of 1 indicates that the test can perfectly identify all true negatives. | num [0-1] |
| form | | Form of the distribution used to model the cluster-level prevalence and the correlation of units within a cluster. [For optimise_s_prevalence only, ignored if correlation is NA]. See details. | c(beta, logitnorm, cloglognorm) |
| real_scale | | (Ignored unless form %in% c(logitnorm, cloglognorm)) Should Fisher information be returned for the parameters of the logitnorm/cloglognorm distributions on the real scale (i.e. mu and sigma)? If FALSE (the default), Fisher information is returned for prevalence (theta) and correlation (rho) instead. | TRUE or FALSE |
| max.s | | The maximum number of units per pool (pool size). | int >= 1 |
| max.N | | The maximum number of pools per cluster (pool number). | int >= 1 |
| interval | | Range of near-optimal designs to consider. If interval == 0 (the default), only the optimal design is returned. If interval > 0, the function identifies the range of designs with cost less than the optimal cost * (1 + interval). | numeric >= 0 |
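
For reference, here is a rough sketch of how the "Accepted inputs" column could translate into checks in R. This is only an illustration under assumptions: `in_range()`, `is_whole()` and `check_design_inputs()` are hypothetical helper names, not functions in PoolPoweR, and the default values in the signature are placeholders rather than the package's actual defaults.

```r
# Hypothetical helpers (not part of PoolPoweR) illustrating the
# "Accepted inputs" column of the table.

# TRUE if x is a single finite number in [lower, upper]
in_range <- function(x, lower = 0, upper = 1) {
  is.numeric(x) && length(x) == 1 && is.finite(x) && x >= lower && x <= upper
}

# TRUE if x is a single whole number >= lower
is_whole <- function(x, lower = 0) {
  in_range(x, lower, Inf) && x == round(x)
}

# Stops with a message naming every argument that fails its check;
# returns TRUE invisibly if all checks pass.
# Default values here are placeholders for the sketch only.
check_design_inputs <- function(pool_size, pool_number, prevalence,
                                sensitivity = 1, specificity = 1,
                                correlation = NA, interval = 0) {
  # correlation may be NA (table: "num [0-1] or NA")
  corr_ok <- (length(correlation) == 1 && is.na(correlation)) ||
    in_range(correlation)

  checks <- c(
    pool_size   = is_whole(pool_size),            # int >= 0
    pool_number = is_whole(pool_number),          # int >= 0
    prevalence  = in_range(prevalence),           # num [0-1]
    sensitivity = in_range(sensitivity),          # num [0-1]
    specificity = in_range(specificity),          # num [0-1]
    correlation = corr_ok,                        # num [0-1] or NA
    interval    = in_range(interval, 0, Inf)      # numeric >= 0
  )

  if (!all(checks)) {
    stop("Invalid input(s): ",
         paste(names(checks)[!checks], collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}
```

With this sketch, `check_design_inputs(pool_size = 10, pool_number = 5, prevalence = 0.01)` passes silently, while a call with `prevalence = 1.2` stops with an error that names `prevalence`. Collecting all failures before stopping is a design choice for the example, so that a user sees every bad argument at once.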

Some questions and notes:

AngusMcLure commented 8 months ago

The table looks good! I've updated the above comment rather than reproduce the somewhat edited table in a response.

fredjaya commented 8 months ago

Thanks, super helpful!

fredjaya commented 7 months ago

Implemented and documented as of 42d4611.

Moved renaming of interval to a separate issue, #13.