Input argument definitions

fredjaya commented 8 months ago

Reference issue for documenting all input arguments in fisher_information.R and optimise_prevalence.R

| Argument name | Greek/Math notation (local) | Definition | Accepted inputs |
| --- | --- | --- | --- |
| pool_size | s | The number of units per pool. Must be a numeric value greater than or equal to 0. | int >= 0 |
| pool_number | N | The number of pools per cluster. Must be a numeric value greater than or equal to 0. | int >= 0 |
| prevalence | theta | The proportion of units that carry the marker of interest (i.e. true positive). Must be a numeric value between 0 and 1, inclusive of both. | num [0-1] |
| cost_unit | c_u | The cost to process a single unit. Must be a numeric value greater than or equal to 0. | num >=0 |
| cost_pool | c_p | The cost to process a single pool. Must be a numeric value greater than or equal to 0. | num >=0 |
| cost_cluster | c_c | The cost to process a cluster. Must be a numeric value greater than or equal to 0. [For optimise_s_prevalence only, ignored if correlation is NA] | num >=0 or NULL/NA? |
| correlation | rho | The correlation between test results within a single cluster (units in different clusters are assumed to be uncorrelated). Must be a numeric value between 0 and 1, inclusive of both. A value of `1` indicates that units within clusters are perfectly correlated (there are no differences between units within a single cluster). A value of `0` indicates that units within clusters are no more correlated than units in different clusters. [optimise_s_prevalence only: If NA, assumes that the survey uses a simple random sample and not cluster sampling] | num [0-1] or NA |
| sensitivity | varphi | The probability that the test correctly identifies a true positive. Must be a numeric value between 0 and 1, inclusive of both. A value of `1` indicates that the test can perfectly identify all true positives. | num [0-1] |
| specificity | psi | The probability that the test correctly identifies a true negative. Must be a numeric value between 0 and 1, inclusive of both. A value of `1` indicates that the test can perfectly identify all true negatives. | num [0-1] |
| form | | Form of the distribution used to model the cluster-level prevalence and correlation of units within a cluster. [For optimise_s_prevalence only, ignored if correlation is NA]. See details. | c(beta, logitnorm, cloglognorm) |
| real_scale | | (Ignored unless form %in% c(logitnorm, cloglognorm)) Should Fisher information be returned for the parameters of the logitnorm/cloglognorm distributions on the real scale (i.e. mu and sigma)? If FALSE (the default), Fisher information is returned for prevalence (theta) and correlation (rho) instead. | TRUE or FALSE |
| max.s | | The maximum number of units per pool (pool size). | int >= 1 |
| max.N | | The maximum number of pools per cluster (pool number). | int >= 1 |
| interval | | Range of near-optimal designs to consider. If interval == 0 (the default), only the optimal design is returned. If interval > 0, the function identifies the range of designs with cost less than the optimal cost * (1 + interval). | numeric >= 0 |
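
For illustration, the "Accepted inputs" column above could be checked with a few small helpers along these lines. This is only a rough sketch: `check_count()`, `check_proportion()`, `check_cost()` and `check_form()` are hypothetical names made up for this example, not functions that exist in PoolPoweR, and the error messages are placeholders.

```r
# Hypothetical validation helpers mirroring the "Accepted inputs" column.
# None of these names come from PoolPoweR; they are assumptions for illustration.

# Whole number >= min (pool_size, pool_number use min = 0; max.s, max.N use min = 1)
check_count <- function(x, name, min = 0) {
  ok <- is.numeric(x) && length(x) == 1 && is.finite(x) &&
    x >= min && x == round(x)
  if (!ok) stop(sprintf("%s must be a whole number >= %s", name, min), call. = FALSE)
  invisible(x)
}

# Number in [0, 1] (prevalence, sensitivity, specificity);
# allow_na = TRUE permits NA, as the table describes for correlation
check_proportion <- function(x, name, allow_na = FALSE) {
  if (allow_na && length(x) == 1 && is.na(x)) return(invisible(x))
  ok <- is.numeric(x) && length(x) == 1 && is.finite(x) && x >= 0 && x <= 1
  if (!ok) stop(sprintf("%s must be a number between 0 and 1", name), call. = FALSE)
  invisible(x)
}

# Number >= 0 (cost_unit, cost_pool, interval)
check_cost <- function(x, name) {
  ok <- is.numeric(x) && length(x) == 1 && is.finite(x) && x >= 0
  if (!ok) stop(sprintf("%s must be a number >= 0", name), call. = FALSE)
  invisible(x)
}

# One of the three distribution forms listed in the table
check_form <- function(form) {
  forms <- c("beta", "logitnorm", "cloglognorm")
  ok <- is.character(form) && length(form) == 1 && form %in% forms
  if (!ok) stop("form must be one of: ", paste(forms, collapse = ", "), call. = FALSE)
  invisible(form)
}

# Example calls; each returns its input invisibly when the value is acceptable
check_count(10, "pool_size")
check_proportion(0.01, "prevalence")
check_proportion(NA, "correlation", allow_na = TRUE)
check_cost(4, "cost_pool")
check_form("logitnorm")
```

Whether cost_cluster should also accept NULL/NA is still an open question in the table, so the sketch leaves that case out.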

Some questions and notes:

~is N the total number of pools across all clusters, or the number of pools per cluster?~
~^ same with rho~
max.s, max.N, interval to be renamed
~Angus could you please provide descriptions for: form, "interval", real_scale~
Any comments or things to amend in the definition or accepted inputs?

AngusMcLure commented 8 months ago

The table looks good! I've updated the above comment rather than reproduce the somewhat edited table in a response.

fredjaya commented 8 months ago

Thanks, super helpful!

fredjaya commented 7 months ago

Implemented and documented as of 42d4611.

Moved the renaming of interval to a separate issue, #13.

AngusMcLure / PoolPoweR

Input argument definitions #10