Add `simulate_scenarios()` to easily calculate outbreak size distribution

joshwlambert commented 3 months ago

This PR adds the simulate_scenarios() function which enables the exploration of parameter space of $R$ and $k$ values for different offspring distributions and bins the resulting transmission chain statistics into user-specified intervals to enable them to be easily plotted.

Note: This is a work-in-progress feature (hence the draft pull request). Before merging we need to:

[ ] generalise the ability to pass different distribution parameters to simulate_chain_stats()
[ ] consider if more input checking is required (I think if we can align the function signatures of simulate_scenarios() with simulate_chain_stats() then most, if not all, of the input checking can be deferred to that function).

codecov-commenter commented 3 months ago

Codecov Report

Attention: Patch coverage is 98.30508% with 1 line in your changes missing coverage. Please review.

Project coverage is 99.87%. Comparing base (dc53c2c) to head (9348ed7). Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
R/simulate.R	98.30%	1 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #275 +/- ## =========================================== - Coverage 100.00% 99.87% -0.13% =========================================== Files 8 8 Lines 755 814 +59 =========================================== + Hits 755 813 +58 - Misses 0 1 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

sbfnk commented 2 months ago

Thanks @joshwlambert, this looks like useful functionality!

I think there's potentially a broader question (to which I don't know the answer) of whether this kind of functionality should be offered by a package as it's essentially a loop over package function calls. In {socialmixr} we made a decision the other way round, i.e. to remove the ability to create replicates: https://github.com/epiforecasts/socialmixr/pull/63

joshwlambert commented 2 months ago

I think there's potentially a broader question (to which I don't know the answer) of whether this kind of functionality should be offered by a package as it's essentially a loop over package function calls. In {socialmixr} we made a decision the other way round, i.e. to remove the ability to create replicates: https://github.com/epiforecasts/socialmixr/pull/63

I think before continuing development of this function it would be good to decide if we want this functionality as a function, or via documentation. I think the code in the function body could be used to write a vignette demonstrating how to simulate scenarios and then group them. This also relates to @jamesmbaazam's point:

This only fulfills a single use case and we may want to not do this here and leave it to the user. This function could just be about simulating the outbreak over a grid of scenarios and returning the results for downstream analyses/post-processing.

To me half the utility of this function comes from it wrapping cut() instead of requiring the user to know how to use it. But at the same time, a vignette is likely easier to maintain and might be more flexible to a variety of use cases we haven't foreseen.

As I'm just contributing, I'm happy to go with what you both (@jamesmbaazam & @sbfnk) prefer and then I'm happy to work on it.

joshwlambert commented 1 month ago

I think this function should be converted into a vignette. @jamesmbaazam & @sbfnk please let me know if you agree and how you'd like to proceed. I'm happy to draft the vignette and then can assign you both to review, edit and then merge if deemed suitable.

sbfnk commented 1 month ago

Thanks @joshwlambert, that sounds like a good idea to me.

jamesmbaazam commented 1 month ago

@joshwlambert I believe it's a good idea, but I would recommend pivoting the focus to highlight the application of epichains for estimating outbreak sizes based on offspring distribution scenarios. This way, the emphasis will be on the application rather than the function itself. The multi-simulation pattern is already outlined in the COVID-19 vignette, so we can build on that pattern as an application. We can in fact combine the idea here with that outlined in #77. The same simulation can be used to achieve both outcomes.

Let me know what you think.

joshwlambert commented 1 month ago

I would recommend pivoting the focus to highlight the application of epichains for estimating outbreak sizes based on offspring distribution scenarios. This way, the emphasis will be on the application rather than the function itself.

Yes, I agree, I plan to remove the simulate_scenarios() functions and instead use the code from the function body as a script to explain to the reader how to explore distributions of outbreak size and length.

I will make the changes in a new branch and open a new PR and then this PR can be closed without merging.

jamesmbaazam commented 1 month ago

I would recommend pivoting the focus to highlight the application of epichains for estimating outbreak sizes based on offspring distribution scenarios. This way, the emphasis will be on the application rather than the function itself.

Yes, I agree, I plan to remove the simulate_scenarios() functions and instead use the code from the function body as a script to explain to the reader how to explore distributions of outbreak size and length.

I will make the changes in a new branch and open a new PR and then this PR can be closed without merging.

Perfect!

epiverse-trace / epichains

Add `simulate_scenarios()` to easily calculate outbreak size distribution #275

Codecov Report