mrc-ide / epireview

https://mrc-ide.github.io/epireview/
GNU General Public License v3.0

What is "Mean sd" in Ebola parameters? #85

Open joshwlambert opened 1 month ago


Looking through the Ebola data set, several parameter entries give $distribution_par2_type as "Mean sd". When I first read this I assumed it meant the standard deviation of the mean. However, I have read two of the papers where this is reported, Rosello et al. (2015) and Chan and Nishiura (2020), and in both cases my reading is that they report the mean and the standard deviation of the distribution itself.

A further source of confusion is that some Lassa parameters use "Mean" and "Standard deviation" for $distribution_par1_type and $distribution_par2_type, respectively. If the Ebola entries are also the mean and standard deviation of the distribution, it would be good to standardise the labels across pathogens.

Below are reproducible examples showing which entries I am referring to.

Ebola data

ebola_data <- epireview::load_epidata("ebola")
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> Warning in load_epidata_raw(pathogen, "outbreak"): No data found for ebola
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> Warning in epireview::load_epidata("ebola"): No outbreaks information found for
#> ebola
#> Data loaded for ebola
ebola_params <- ebola_data$params
which(ebola_params$distribution_par1_type == "Mean sd")
#> integer(0)
which(ebola_params$distribution_par2_type == "Mean sd")
#>  [1] 364 365 366 367 368 371 374 375 376 377 378 381 382 383 384 385 388 389 390
#> [20] 391 461 468 952 953 956
ebola_params[which(ebola_params$distribution_par2_type == "Mean sd"), ]
#> # A tibble: 25 × 77
#>    id     parameter_data_id covidence_id pathogen parameter_type parameter_value
#>    <chr>  <chr>                    <int> <chr>    <chr>                    <dbl>
#>  1 a9b0b… e0d452d5aea72b33…         2594 Ebola v… Human delay -…           12.9 
#>  2 a9b0b… 9baf758dc2bc1fe3…         2594 Ebola v… Human delay -…            5.02
#>  3 a9b0b… c70a61d876605ae6…         2594 Ebola v… Human delay -…            9.47
#>  4 a9b0b… f0d612742167f390…         2594 Ebola v… Human delay -…            5.72
#>  5 a9b0b… 79b99cad7ce7f90f…         2594 Ebola v… Human delay -…            4.5 
#>  6 a9b0b… 4704e8616bdf9d2c…         2594 Ebola v… Human delay -…            0   
#>  7 a9b0b… d79a599fd2882bfa…         2594 Ebola v… Human delay -…           10.0 
#>  8 a9b0b… 4b9e9266adfda812…         2594 Ebola v… Human delay -…            0   
#>  9 a9b0b… 365cd27a1c10c648…         2594 Ebola v… Human delay -…            7.62
#> 10 a9b0b… 7cffbd19447c4390…         2594 Ebola v… Human delay -…            1.5 
#> # ℹ 15 more rows
#> # ℹ 71 more variables: exponent <dbl>, parameter_unit <chr>,
#> #   parameter_lower_bound <dbl>, parameter_upper_bound <dbl>,
#> #   parameter_value_type <chr>, parameter_uncertainty_single_value <dbl>,
#> #   parameter_uncertainty_singe_type <chr>,
#> #   parameter_uncertainty_lower_value <dbl>,
#> #   parameter_uncertainty_upper_value <dbl>, …
ebola_params[which(ebola_params$distribution_par2_type == "Mean sd"), ]$article_label
#>  [1] "Rosello 2015 (1)" "Rosello 2015 (1)" "Rosello 2015 (1)" "Rosello 2015 (1)"
#>  [5] "Rosello 2015 (1)" "Rosello 2015 (2)" "Rosello 2015 (3)" "Rosello 2015 (2)"
#>  [9] "Rosello 2015 (2)" "Rosello 2015 (2)" "Rosello 2015 (2)" "Rosello 2015 (4)"
#> [13] "Rosello 2015 (3)" "Rosello 2015 (3)" "Rosello 2015 (3)" "Rosello 2015 (3)"
#> [17] "Rosello 2015 (5)" "Rosello 2015 (4)" "Rosello 2015 (4)" "Rosello 2015 (4)"
#> [21] "Lau 2017 (a)"     "Lau 2017 (b)"     "Chan 2020"        "Chan 2020 (1)"   
#> [25] "Chan 2020 (2)"

Created on 2024-06-14 with reprex v2.1.0

Lassa data

lassa_data <- epireview::load_epidata("lassa")
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> Data loaded for lassa
lassa_params <- lassa_data$params
which(!is.na(lassa_params$distribution_par1_type))
#> [1] 261 262
which(!is.na(lassa_params$distribution_par2_type))
#> [1] 261 262
lassa_params[which(!is.na(lassa_params$distribution_par1_type)), ]$distribution_par1_type
#> [1] "Mean" "Mean"
lassa_params[which(!is.na(lassa_params$distribution_par1_type)), ]$distribution_par2_type
#> [1] "Standard deviation" "Standard deviation"

Created on 2024-06-14 with reprex v2.1.0
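
Possible standardisation

If "Mean sd" does turn out to mean the standard deviation of the distribution (that is only my reading of the papers, not something confirmed by the maintainers), the relabelling could look roughly like the sketch below. It runs on a small invented data frame rather than the real epireview tables, so that it is self-contained; only the two column names are taken from the data set. The helper standardise_par_type() is hypothetical and not part of epireview.

```r
# Toy stand-in for the `params` tables. The column names follow the
# epireview data; the rows are invented purely for illustration.
params <- data.frame(
  pathogen = c("Ebola", "Ebola", "Lassa"),
  distribution_par1_type = c("Mean", "Shape", "Mean"),
  distribution_par2_type = c("Mean sd", "Scale", "Standard deviation"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (NOT an epireview function): replace the ambiguous
# "Mean sd" label with the label already used in the Lassa data.
# ASSUMPTION: "Mean sd" denotes the standard deviation of the distribution.
standardise_par_type <- function(x) {
  x[!is.na(x) & x == "Mean sd"] <- "Standard deviation"
  x
}

params$distribution_par2_type <-
  standardise_par_type(params$distribution_par2_type)

# Tabulate the labels per pathogen to check that "Mean sd" is gone.
table(params$pathogen, params$distribution_par2_type, useNA = "ifany")
```

Missing values are left untouched, so entries without a reported distribution are unaffected by the relabelling.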