Closed fredjaya closed 7 months ago
Angus, could you please update/add to the above for function documentation? Hopefully I'm not too far off here...
Adding a reminder to avoid using @import/@importFrom
Hi Fred, I'll do most of the editing on the first post, but just pointing out some things here.
This kind of pool-testing setup is usually used when prevalence is low. Though there's no hard limit, I would keep the prevalence <10% in all the examples, and a good example value might be 1%.
Similarly, cluster surveys only make sense if the clustering is fairly low. Again, 0.2 is probably a high correlation and 0.75 is very, very high. A nice default might be 0.05. Even this is perhaps high, but I actually need to do some more theoretical/empirical work to figure out anything like a reasonable range for MX surveys. Though I don't think we should have a default value for now, users might need some guidance as to what value to put in here if they have no idea, and that's on my long-term to-do list. It won't be easy, as it will probably require looking at lots of different datasets from different countries.
optimise_s_prevalence(prevalence = 0.01, cost_unit = 5, cost_pool = 10)
optimise_sN_prevalence(prevalence = 0.01, cost_unit = 5, cost_pool = 10, cost_cluster = 100, correlation = 0.05)
Also, make sure that if a function supports both cluster and non-cluster surveys, you give an example for each. Usually this can be achieved by providing/not providing an input for correlation, I think.
I've edited the main post with some rewords/expansion here and there. Didn't take long, so they were pretty close!
Awesome, thanks!
Based on this, I'm thinking of moving design_effect to its own .R file - any objections?
Sounds fine to me. I take it the idea is to keep the docs nice and clean?
Upcoming push has updated unit tests based on these examples. Separate issue for further examples created.
optimise_prevalence.R
Title: Optimising the pool size and number for estimating prevalence.
Description: These functions determine cost-effective pooling strategies for estimating the prevalence of a marker in a population. Both functions attempt to choose survey designs that maximise the Fisher Information for a given cost or effort.
optimise_s_prevalence() calculates the optimal single pool size that balances cost and accuracy for a given marker prevalence, test sensitivity, and specificity, and works for simple random surveys or cluster surveys.
optimise_sN_prevalence() also attempts to identify the optimal number of pools per cluster (cluster surveys only).
Examples
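To make the cost/information trade-off concrete, here is a minimal sketch for the simple random survey case only. It is not the package's implementation: fi_one_pool and best_pool_size are hypothetical helpers, "cost-effective" is assumed to mean the most Fisher information per unit cost, and the cap of 200 on pool size is arbitrary.

```r
# Hypothetical helper (not package code): Fisher information about prevalence
# from a single pool of `s` units under simple random sampling.
fi_one_pool <- function(s, p, sens = 1, spec = 1) {
  all_neg <- (1 - p)^s                                  # P(no positive unit in pool)
  p_pos   <- sens * (1 - all_neg) + (1 - spec) * all_neg # P(pool tests positive)
  slope   <- (sens + spec - 1) * s * (1 - p)^(s - 1)     # d p_pos / d p
  slope^2 / (p_pos * (1 - p_pos))
}

# Hypothetical helper: grid search for the pool size with the most
# information per unit cost (assumed objective; max_s is an arbitrary cap).
best_pool_size <- function(p, cost_unit, cost_pool, max_s = 200) {
  s <- seq_len(max_s)
  info_per_cost <- fi_one_pool(s, p) / (cost_unit * s + cost_pool)
  s[which.max(info_per_cost)]
}

# Same inputs as the suggested example values in this thread.
s_opt <- best_pool_size(p = 0.01, cost_unit = 5, cost_pool = 10)
print(s_opt)
```

With these inputs the sketch picks a pool size well above 1, because at 1% prevalence most of the cost of individual testing goes on negative units.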
fisher_information.R
Title: Calculate the Fisher Information of a pooled-survey design for estimating population prevalence.
Description: fi_pool and fi_pool_cluster calculate the Fisher Information for pool testing strategies for a given number and size of pools, where the sensitivity and specificity of the test are known.
fi_pool calculates the Fisher information for the prevalence for simple random surveys.
fi_pool_cluster calculates the two-by-two Fisher information matrix for prevalence and within-cluster correlation for cluster survey designs.
Examples
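For anyone wanting to sanity-check numbers by hand, below is a rough sketch of the simple random survey calculation, assuming independent pools and a binary test result per pool. fi_srs_sketch is a hypothetical stand-in written for this thread, not the package's fi_pool, and its argument names are illustrative. The two-by-two cluster matrix is not attempted here.

```r
# Hypothetical stand-in for the simple random survey case (not package code).
# Information from independent pools adds up, hence the factor pool_number.
fi_srs_sketch <- function(pool_size, pool_number, prevalence,
                          sensitivity = 1, specificity = 1) {
  all_neg <- (1 - prevalence)^pool_size
  # Probability that a pool tests positive, allowing for test error
  p_pos <- sensitivity * (1 - all_neg) + (1 - specificity) * all_neg
  # Derivative of p_pos with respect to prevalence
  slope <- (sensitivity + specificity - 1) * pool_size *
    (1 - prevalence)^(pool_size - 1)
  pool_number * slope^2 / (p_pos * (1 - p_pos))
}

# One individual test: reduces to 1 / (p * (1 - p))
fi_individual <- fi_srs_sketch(pool_size = 1, pool_number = 1, prevalence = 0.01)
# One pool of 10 units with a perfect test
fi_pooled <- fi_srs_sketch(pool_size = 10, pool_number = 1, prevalence = 0.01)
# Same pool with an imperfect test (illustrative values)
fi_imperfect <- fi_srs_sketch(10, 1, 0.01, sensitivity = 0.9, specificity = 0.99)
```

A single pool of 10 carries more information than a single individual test, and an imperfect test carries less than a perfect one for the same pool.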
design_effect()
Title: Calculate the design effect for pooled testing.
Description: This function calculates the design effect (D) for survey designs using pool testing compared to a simple random survey with individual tests of the same number of units. This allows the comparison of the Fisher Information per unit sampled across different pooling and sampling strategies. A design effect D>1 (D<1) indicates that the pooling/sampling strategy reduces (increases) the Fisher information per unit; the total sample size will have to be multiplied by a factor of D to achieve the same degree of precision in estimating prevalence as a simple random survey with individual tests. Supports both cluster and simple random sampling with perfect or imperfect tests.
Examples
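As a quick illustration of the D>1 versus D<1 cases, here is a sketch for simple random sampling only (no clustering). It assumes the reference individual-test survey uses the same sensitivity and specificity as the pooled survey; that is an assumption on my part, and fi_pool_sketch and design_effect_sketch are hypothetical helpers rather than the package functions.

```r
# Hypothetical helper (not package code): Fisher information about prevalence
# from one pool of `s` units under simple random sampling.
fi_pool_sketch <- function(s, p, sens = 1, spec = 1) {
  all_neg <- (1 - p)^s
  p_pos   <- sens * (1 - all_neg) + (1 - spec) * all_neg
  slope   <- (sens + spec - 1) * s * (1 - p)^(s - 1)
  slope^2 / (p_pos * (1 - p_pos))
}

# Design effect = information per unit with individual tests divided by
# information per unit with pools of size s (assumed same test accuracy).
design_effect_sketch <- function(s, p, sens = 1, spec = 1) {
  fi_pool_sketch(1, p, sens, spec) / (fi_pool_sketch(s, p, sens, spec) / s)
}

d_individual <- design_effect_sketch(1, 0.01)    # no pooling, so D = 1
d_perfect    <- design_effect_sketch(10, 0.01)   # perfect test
d_imperfect  <- design_effect_sketch(10, 0.01, sens = 0.95, spec = 0.95)
```

With a perfect test, pooling loses a little information per unit (D slightly above 1). With the illustrative imperfect test, pooling gains information per unit (D below 1), since false positives do much more damage to individual tests when prevalence is low.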