AngusMcLure / PoolPoweR

Power and sample size calculations for surveys using pool testing (AKA group testing)
GNU General Public License v3.0

`optimise(sample_design)` refactor #43

Open fredjaya opened 1 week ago

fredjaya commented 1 week ago

Related to #40.

Refactor to methods

Now that the sample_design classes are implemented, refactor the optimise... family of functions into methods of fixed_design and variable_design. This removes the need for *random* in function names.

Method dispatch

Once refactored to methods, add new subclasses so that dispatch depends on which sample_design attributes are NULL.

Example usage:

fd <- fixed_design(...) # Doesn't matter which pool vars are NULL
optimise_prevalence(fd) # Will dispatch according to NULL pool vars 

Method dispatch will eliminate the individual random/s/sN functions. Add sample_design subclasses according to which variables are NULL and dispatch on them, i.e. opt_s, opt_N, opt_sN.

When no variables are NULL, do not add a subclass; instead return a message that all values are already optimised. This is the state in which to calculate the internal attributes total_pools and total_units; otherwise leave them NULL.
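
A minimal sketch of the dispatch idea, not the package code. `sketch_fixed_design()` is a hypothetical stand-in for the real `fixed_design()` constructor (which takes more arguments), and the method bodies are placeholders that only report which branch was chosen. It assumes opt_s means "pool_size is NULL", opt_N means "pool_number is NULL", and opt_sN means both are NULL.

```r
# Hypothetical stand-in for fixed_design(); only the two pool variables are modelled.
sketch_fixed_design <- function(pool_size = NULL, pool_number = NULL) {
  x <- structure(
    list(pool_size = pool_size, pool_number = pool_number),
    class = c("fixed_design", "sample_design")
  )
  # Pick a subclass naming the variables that still need optimising.
  sub <- if (is.null(pool_size) && is.null(pool_number)) {
    "opt_sN"
  } else if (is.null(pool_size)) {
    "opt_s"
  } else if (is.null(pool_number)) {
    "opt_N"
  } else {
    NULL # nothing to optimise, so no subclass is added
  }
  class(x) <- c(sub, class(x))
  x
}

optimise_prevalence <- function(x, ...) UseMethod("optimise_prevalence")

# Placeholder methods: the real ones would run the optimisation.
optimise_prevalence.opt_s  <- function(x, ...) "optimise s"
optimise_prevalence.opt_N  <- function(x, ...) "optimise N"
optimise_prevalence.opt_sN <- function(x, ...) "optimise s and N"

# Fallback when no subclass was added (no NULL pool variables).
optimise_prevalence.fixed_design <- function(x, ...) {
  message("All pool variables are already set; nothing to optimise.")
  invisible(x)
}

optimise_prevalence(sketch_fixed_design(pool_number = 4)) # dispatches to opt_s
optimise_prevalence(sketch_fixed_design(5, 4))            # falls through to fixed_design
```

Putting the subclass first in the class vector means the fallback method on fixed_design is only reached when nothing is left to optimise.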

Return sample_design output

Streamline the optimise() to power()/sample_design() pipeline by changing the optimise* functions to return a sample_design. Implement all of the above, including this change, for optimise_sN_prevalence() first.

Current output is:

> optimise_sN_prevalence(prevalence = 0.01, cost_unit = 5, cost_pool = 10, cost_cluster = 100, correlation = 0.05)
$s
[1] 5

$cost
[1] 0.2473822

$catch
[1] 20

$N
[1] 4

Change so it outputs:

fixed_design(pool_size = s, pool_number = N, sensitivity = sensitivity, specificity = specificity)
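
A rough sketch of that repackaging step, using the values from the current output above. `as_design_sketch()` is a hypothetical helper, it builds a plain classed list rather than calling the real `fixed_design()`, and the sensitivity/specificity defaults of 1 are assumptions made only for illustration.

```r
# Values copied from the current optimise_sN_prevalence() output shown above.
opt <- list(s = 5, cost = 0.2473822, catch = 20, N = 4)

# Hypothetical helper: repackage an optimise_* result as a design-like object.
# The real change would call fixed_design() instead of structure().
as_design_sketch <- function(opt, sensitivity = 1, specificity = 1) {
  structure(
    list(
      pool_size = opt$s,
      pool_number = opt$N,
      total_units = opt$s * opt$N, # units per site = pool size x pool number
      sensitivity = sensitivity,
      specificity = specificity
    ),
    class = c("fixed_design", "sample_design")
  )
}

design <- as_design_sketch(opt)
design$total_units == opt$catch # the old `catch` value is recoverable as s * N
```

Note that `cost` has no slot in this sketch, so the returned design would need another way to carry it if downstream code still needs it.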

Consider adding a general pretty-print method that can be propagated directly to PoolTools. Hold off on this for now, though, as the returned fixed_design will be used by downstream functions such as power().

The current PoolTools output for the above is:

For the given inputs, the optimal design is to sample 20 units per collection site, across 4 pools with 5 units each pool.
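
If the pretty-print idea is picked up later, one possible shape is a small formatting function that rebuilds the PoolTools sentence from the design fields. This is only a sketch: `design_summary_sketch()` and the field names on the list are hypothetical, not existing package API.

```r
# Hypothetical formatter that reproduces the PoolTools sentence from a design-like list.
design_summary_sketch <- function(x) {
  sprintf(
    paste0(
      "For the given inputs, the optimal design is to sample %d units ",
      "per collection site, across %d pools with %d units each pool."
    ),
    as.integer(x$pool_size * x$pool_number), # total units per site
    as.integer(x$pool_number),
    as.integer(x$pool_size)
  )
}

# Stand-in design using the values from the example above.
d <- list(pool_size = 5, pool_number = 4)
cat(design_summary_sketch(d), "\n")
```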

fredjaya commented 1 week ago

Need to work out how optimise_prevalence.fixed_sN() (previously optimise_sN_prevalence) should handle correlation = NA or 0.

  if (is.na(correlation) || correlation == 0) {
    # Delegate to optimise_s_prevalence() with a single pool.
    opt <- optimise_s_prevalence(
      pool_number = 1, prevalence, cost_unit, cost_pool, cost_cluster,
      correlation = NA, x$sensitivity, x$specificity, max_s, form
    )

    # pool_number and total_units are NA when correlation is NA, Inf when it is 0.
    # correlation is a scalar here, so plain if/else is enough.
    na_or_inf <- if (is.na(correlation)) NA else Inf

    out <- fixed_design(
      pool_size = opt$s,
      pool_number = na_or_inf,
      total_units = na_or_inf,
      sensitivity = x$sensitivity,
      specificity = x$specificity
    )

    return(out)
  }

If correlation = NA then sample_design will need to accept numeric, NULL, AND NA.
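
One way the relaxed check might look, as a sketch only. `is_valid_pool_var()` is a hypothetical helper, not an existing function in the package, and the rule that numeric values must be positive (with Inf allowed, as in the snippet above) is an assumption.

```r
# Hypothetical validator for a single pool variable such as pool_number.
is_valid_pool_var <- function(x) {
  if (is.null(x)) return(TRUE)        # NULL: still to be optimised
  if (length(x) != 1) return(FALSE)   # only scalars are accepted
  if (is.numeric(x)) {
    # NA_real_ is accepted, NaN is not; Inf passes the positivity check.
    return(!is.nan(x) && (is.na(x) || x > 0))
  }
  is.logical(x) && is.na(x)           # plain NA is logical in R
}

is_valid_pool_var(NULL) # TRUE
is_valid_pool_var(NA)   # TRUE
is_valid_pool_var(Inf)  # TRUE
is_valid_pool_var(4)    # TRUE
is_valid_pool_var(-1)   # FALSE
is_valid_pool_var("4")  # FALSE
```

The separate logical branch matters because a bare `NA` is logical rather than numeric, so an `is.numeric()` check alone would reject it.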