epiverse-trace / simulist

An R package for simulating line lists
https://epiverse-trace.github.io/simulist/
Other
4 stars 0 forks source link

can we use `as.function()` to convert object to the `generate()` instead of `density()`? #107

Closed avallecam closed 2 months ago

avallecam commented 2 months ago

I just noticed that {simulist} internally uses as.function() to convert it to the density() function. Is this functionality only possibly to generate the density function or also to the quantile(), thinking in {EpiNow2}, or generate(), thinking in {epichains}? (related comment https://github.com/epiverse-trace/tutorials-middle/pull/29#discussion_r1559310192)

Also, given the internal usage of as.function(), {simulist} looks strongly dependent on {epiparameter}, while compared to {cfr}, this connection is optional. Is this dependency desirable to keep for ecosystem interoperability?

possibly already discussed in the past, so happy to read any previous discussion on this regard.

here is a reference reprex of the {cfr} optional interface for epiparameter using as.function()

library(epiparameter)
#> Warning: package 'epiparameter' was built under R version 4.3.3
library(cfr)

# Load the Ebola 1976 data provided with the {cfr} package
data("ebola1976")

# Get delay distribution
onset_to_death_ebola <-
  epiparameter::epidist_db(
    disease = "Ebola",
    epi_dist = "onset_to_death",
    single_epidist = TRUE
  )
#> Using WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.. 
#> To retrieve the citation use the 'get_citation' function

# instead of
cfr_static(
  data = ebola1976[1:30, ],
  delay_density = function(x) density(onset_to_death_ebola, x)
)
#>   severity_mean severity_low severity_high
#> 1         0.955         0.74             1

# using as.function()
cfr_static(
  data = ebola1976[1:30, ],
  delay_density = as.function(onset_to_death_ebola)
)
#>   severity_mean severity_low severity_high
#> 1         0.955         0.74             1

Created on 2024-04-30 with reprex v2.1.0

joshwlambert commented 2 months ago

I just noticed that {simulist} internally uses as.function() to convert it to the density() function. Is this functionality only possibly to generate the density function or also to the quantile(), thinking in {EpiNow2}, or generate(), thinking in {epichains}? (related comment https://github.com/epiverse-trace/tutorials-middle/pull/29#discussion_r1559310192)

as.function() is a base R S3 generic, and {epiparameter} exports an as.function() S3 method for <epidist> objects. See ?epiparameter::as.function.epidist for help. In the documentation you'll see there is a second argument called func_type, this defaults to "density", but can also take "cdf", "generate", and "quantile".

Also, given the internal usage of as.function(), {simulist} looks strongly dependent on {epiparameter}, while compared to {cfr}, this connection is optional. Is this dependency desirable to keep for ecosystem interoperability?

The {simulist} package is dependent on {epiparameter}. This is to enable to most functionality and interoperability between the two packages and has benefitted both packages during development. I am not a core developer on {cfr} so I have not been involved with their decision not to have the same dependency-interoperability setup with that package.

For the reprex you show, if you have {epiparameter} loaded feel free to use the as.function() approach (second example), they are functionally identical, so it is personal preference which is used.

avallecam commented 2 months ago

as.function() is a base R S3 generic, and {epiparameter} exports an as.function() S3 method for <epidist> objects. See ?epiparameter::as.function.epidist for help. In the documentation you'll see there is a second argument called func_type, this defaults to "density", but can also take "cdf", "generate", and "quantile".

ok, this functionality is cool. I think I'm understanding the benefit of making a package to export generic functions S3 methods for specific objects generated by the same package. Possibly through tutorials an howto's we can give to them more visibility.

Sharing reprex as example for epichains (sorry, if unrelated with simulist)

library(epichains)
library(epiparameter)
#> Warning: package 'epiparameter' was built under R version 4.3.3

mers_offspring <- c(mean = 0.60, dispersion = 0.02)

serial_inverval <- epidist_db(
  disease = "mers",
  epi_dist = "serial",
  single_epidist = TRUE
)
#> Using Cowling B, Park M, Fang V, Wu P, Leung G, Wu J (2015). "Preliminary
#> epidemiological assessment of MERS-CoV outbreak in South Korea, May to
#> June 2016." _Eurosurveillance_.
#> doi:10.2807/1560-7917.es2015.20.25.21163
#> <https://doi.org/10.2807/1560-7917.es2015.20.25.21163>.. 
#> To retrieve the citation use the 'get_citation' function

set.seed(32)

sim_chains <- simulate_chains(
  # simulation controls
  index_cases = 5,
  statistic = "size",
  # offspring
  offspring_dist = rnbinom,
  mu = mers_offspring["mean"],
  size = mers_offspring["dispersion"],
  # generation
  generation_time = as.function(serial_inverval, func_type = "generate")
)

head(sim_chains)
#>    infectee_id sim_id infector_id generation     time
#> 6            1      2           1          2 13.30683
#> 7            4      2           1          2 11.01682
#> 8            1      3           1          2 14.42131
#> 9            4      3           1          2 10.64722
#> 10           1      4           1          2 14.84852
#> 11           4      4           1          2 11.58797

Created on 2024-04-30 with reprex v2.1.0

avallecam commented 2 months ago

The {simulist} package is dependent on {epiparameter}. This is to enable to most functionality and interoperability between the two packages and has benefitted both packages during development. I am not a core developer on {cfr} so I have not been involved with their decision not to have the same dependency-interoperability setup with that package.

In understand, possibly different packages may require different degrees of dependency to get the most of the benefit. I thought in API consistency, like making {simulist} to read objects like the pseudocode below, and generalizing the {cfr} vignette on distributions:

sim_linelist(
  contact_distribution = as.function(contact_distribution),
  contact_interval = as.function(contact_interval),
  prob_infect = 0.5,
  onset_to_hosp = as.function(onset_to_hosp),
  onset_to_death = as.function(onset_to_death)
)
joshwlambert commented 2 months ago

Possibly through tutorials an howto's we can give to them more visibility.

I'm happy to collaborate on a tutorial or howto to illustrate this functionality.

sim_linelist(
  contact_distribution = as.function(contact_distribution),
  contact_interval = as.function(contact_interval),
  prob_infect = 0.5,
  onset_to_hosp = as.function(onset_to_hosp),
  onset_to_death = as.function(onset_to_death)
)

I prefer to keep the current function arguments that can accept <epidist> objects or anonymous functions (produced by as.function() or not). By allowing a user to pass an <epidist> directly to sim_linelist(), sim_outbreak() or sim_contacts() it takes the burden off the user to understand what type of distribution function is required.

For example the code chunk pasted above has a mistake, as the contact_interval, onset_to_hosp and onset_to_death argument should specify as.function(..., func_type = "generate"). By letting the user supply an <epidist> this is handled correctly internally.

See https://epiverse-trace.github.io/simulist/articles/simulist.html#using-functions-for-distributions-instead-of-epidist for documentation on this.

joshwlambert commented 2 months ago

As the as.function() method is exported from {epiparameter} I think we should close this issue, and if there is further discussion we can start a new issue on the {epiparameter} repository.

avallecam commented 2 months ago

See https://epiverse-trace.github.io/simulist/articles/simulist.html#using-functions-for-distributions-instead-of-epidist for documentation on this.

thank you for sharing this section. this basically solves my main doubt regarding simulist. {simulist} and {cfr} accept both anonymous and episdist but prioritize them in the documentation in a different order. {simulist} gives visibility first to epidist, while {cfr} to anonymous functions.