epiverse-trace / epiparameter

R package with library of epidemiological parameters for infectious diseases and functions and classes for working with parameters
https://epiverse-trace.github.io/epiparameter

Add `aggregate()` method for `<epiparameter>` #388

Closed joshwlambert closed 1 month ago

joshwlambert commented 1 month ago

This PR addresses #271 by adding an aggregate() method for the <multi_epiparameter> class. This is a feature that has been repeatedly requested and was worked on in a working group at an Epiverse-Imperial hackathon earlier this year.

That hackathon led to PR #270, which was sketched by @Bisaloo and others in the working group and subsequently developed by me. However, its approach to aggregating <epiparameter> objects raised several concerns. One was that several aggregation methods output numerical vectors rather than parameterised distributions, which led to #372.

After discussing this feature with @adamkucharski, @chartgerink and @Bisaloo, we agreed to take a different approach that offers similar functionality, which has led to this PR.

This PR adds an aggregate() method to combine a list of <epiparameter> objects and output a parameterised mixture distribution, leveraging functionality and classes exported by {distributional}. The output of aggregate.multi_epiparameter() is an <epiparameter> object, with class methods updated in this PR to work seamlessly with mixture distributions.

A get_parameters.multi_epiparameter() method is added. The .get_mixture_family() internal function is added to easily access the individual distribution names from a mixture distribution (used for printing).

Unit tests have been added for the aggregate() method.

joshwlambert commented 1 month ago

Here is an example of the new functionality:

library(epiparameter)
ebola_si <- epiparameter_db(epi_dist = "serial interval", disease = "ebola")
#> Returning 4 results that match the criteria (4 are parameterised). 
#> Use subset to filter by entry variables or single_epiparameter to return a single entry. 
#> To retrieve the citation for each use the 'get_citation' function
ebola_si
#> # List of 4 <epiparameter> objects
#> Number of diseases: 1
#> ❯ Ebola Virus Disease
#> Number of epi distributions: 1
#> ❯ serial interval
#> [[1]]
#> Disease: Ebola Virus Disease
#> Pathogen: Ebola Virus
#> Epi Distribution: serial interval
#> Study: WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.
#> Distribution: gamma
#> Parameters:
#>   shape: 2.188
#>   scale: 6.490
#> 
#> [[2]]
#> Disease: Ebola Virus Disease
#> Pathogen: Ebola Virus
#> Epi Distribution: serial interval
#> Study: WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.
#> Distribution: gamma
#> Parameters:
#>   shape: 4.903
#>   scale: 3.161
#> 
#> [[3]]
#> Disease: Ebola Virus Disease
#> Pathogen: Ebola Virus
#> Epi Distribution: serial interval
#> Study: WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.
#> Distribution: gamma
#> Parameters:
#>   shape: 2.068
#>   scale: 7.301
#> 
#> [[4]]
#> Disease: Ebola Virus Disease
#> Pathogen: Ebola Virus
#> Epi Distribution: serial interval
#> Study: WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.
#> Distribution: gamma
#> Parameters:
#>   shape: 1.898
#>   scale: 6.532
#> 
#> # ℹ Use `parameter_tbl()` to see a summary table of the parameters.
#> # ℹ Explore database online at: https://epiverse-trace.github.io/epiparameter/articles/database.html
consensus_ebola_si <- aggregate(ebola_si)
density(consensus_ebola_si, at = 2)
#> [1] 0.02318493
plot(consensus_ebola_si)

Created on 2024-10-02 with reprex v2.1.0
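
As a cross-check of the density value in the reprex above, the mixture density can be computed by hand in base R as the weighted sum of the four gamma densities. This is a minimal sketch, not the package's implementation: `mixture_density()` is a hypothetical helper, the parameters are copied from the rounded printed output, and equal weights are assumed (consistent with the weights printed in the incubation period example later in this thread).

```r
# Shape and scale of the four gamma serial-interval entries, as printed
# above (rounded to three decimals).
shape <- c(2.188, 4.903, 2.068, 1.898)
scale <- c(6.490, 3.161, 7.301, 6.532)

# Assumption: every component gets the same weight.
weights <- rep(1 / length(shape), length(shape))

# Hypothetical helper: mixture density as the weighted sum of the
# component densities, evaluated at each element of x.
mixture_density <- function(x) {
  vapply(
    x,
    function(xi) sum(weights * dgamma(xi, shape = shape, scale = scale)),
    numeric(1)
  )
}

# Should be close to the density(consensus_ebola_si, at = 2) value above,
# up to the rounding of the printed parameters.
mixture_density(2)
```

Because the printed parameters are rounded, agreement to roughly three significant figures is the most that should be expected here.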

joshwlambert commented 1 month ago

The different probability distributions in the <multi_epiparameter> can be checked using: sapply(ebola_incub, family) (or lapply() or vapply() etc.).

Below I have pasted an example using the incubation periods for Ebola showing that different types of distributions can be included in the mixture distribution. In this case there is a lognormal distribution and four gamma distributions.

library(epiparameter)
ebola_incub <- epiparameter_db(epi_dist = "incubation", disease = "ebola")
#> Returning 5 results that match the criteria (5 are parameterised). 
#> Use subset to filter by entry variables or single_epiparameter to return a single entry. 
#> To retrieve the citation for each use the 'get_citation' function

# get probability distribution of each <epiparameter>
sapply(ebola_incub, family)
#> [1] "lnorm" "gamma" "gamma" "gamma" "gamma"

# remove subtype info for pathogens to match
for (i in seq_along(ebola_incub)) {
  ebola_incub[[i]]$pathogen <- "Ebola Virus"
}

ebola_incub_mix <- aggregate(ebola_incub)
ebola_incub_mix
#> Disease: Ebola Virus Disease
#> Pathogen: Ebola Virus
#> Epi Distribution: incubation period
#> Study: Eichner M, Dowell S, Firese N (2011). "Incubation period of ebola
#> hemorrhagic virus subtype zaire." _Osong Public Health and Research
#> Perspectives_. doi:10.1016/j.phrp.2011.04.001
#> <https://doi.org/10.1016/j.phrp.2011.04.001>.
#> Study: WHO Ebola Response Team, Agua-Agum J, Ariyarajah A, Aylward B, Blake I,
#> Brennan R, Cori A, Donnelly C, Dorigatti I, Dye C, Eckmanns T, Ferguson
#> N, Formenty P, Fraser C, Garcia E, Garske T, Hinsley W, Holmes D,
#> Hugonnet S, Iyengar S, Jombart T, Krishnan R, Meijers S, Mills H,
#> Mohamed Y, Nedjati-Gilani G, Newton E, Nouvellet P, Pelletier L,
#> Perkins D, Riley S, Sagrado M, Schnitzler J, Schumacher D, Shah A, Van
#> Kerkhove M, Varsaneux O, Kannangarage N (2015). "West African Ebola
#> Epidemic after One Year — Slowing but Not Yet under Control." _The New
#> England Journal of Medicine_. doi:10.1056/NEJMc1414992
#> <https://doi.org/10.1056/NEJMc1414992>.
#> Distribution: mixture: lnorm, gamma, gamma, gamma, gamma
#> Parameters:
#>   dist.mu: 2.487
#>   dist.sigma: 0.330
#>   dist.shape: 1.578
#>   dist.rate: 0.153
#>   dist.shape: 0.925
#>   dist.rate: 0.073
#>   dist.shape: 1.731
#>   dist.rate: 0.173
#>   dist.shape: 1.462
#>   dist.rate: 0.141
#>   w1: 0.200
#>   w2: 0.200
#>   w3: 0.200
#>   w4: 0.200
#>   w5: 0.200

Created on 2024-10-04 with reprex v2.1.0
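
For readers who want to see what such a mixture looks like outside of {epiparameter}, the printed parameters above can be used to build a comparable object directly with {distributional}. This is a sketch under assumptions rather than what `aggregate()` does internally: it assumes the printed parameters map to the components in the order shown, uses the rounded values, and `ebola_incub_sketch` is a hypothetical name.

```r
library(distributional)

# One lognormal and four gamma components, copied from the printed
# parameters above (rounded to three decimals), with equal weights.
ebola_incub_sketch <- dist_mixture(
  dist_lognormal(mu = 2.487, sigma = 0.330),
  dist_gamma(shape = 1.578, rate = 0.153),
  dist_gamma(shape = 0.925, rate = 0.073),
  dist_gamma(shape = 1.731, rate = 0.173),
  dist_gamma(shape = 1.462, rate = 0.141),
  weights = rep(0.2, 5)
)

# Density and cumulative probability of the mixture at a value of 10
density(ebola_incub_sketch, at = 10)
cdf(ebola_incub_sketch, q = 10)
```

Since the mixture is a weighted sum of its components, the density here should equal 0.2 times the sum of `dlnorm()` and the four `dgamma()` densities at the same point.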