Closed ntorresd closed 7 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 70.32%. Comparing base (
ff8b90e
) to head (be7828b
).:exclamation: Current head be7828b differs from pull request most recent head aefa582. Consider uploading reports for the commit aefa582 to get more accurate results
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@ntorresd - in this line: https://github.com/epiverse-trace/serofoi/blob/bc1902e2cda3fcc69666627eebecd2c70d22c8d9/R/seroprevalence_data.R#L145
we can just use 'age' instead of 'exposure_age' which might imply the age at which you got exposed to the infection ...
@ntorresd - I could simulate data with seroreversion rate being > 0, i.e., assuming there's seroreversion. Have you been unable to do this?
The solution you have in the code works both when mu = 0 and when mu != 0
@ntorresd, @ben18785 - for this probability function: https://github.com/epiverse-trace/serofoi/blob/bc1902e2cda3fcc69666627eebecd2c70d22c8d9/R/seroprevalence_data.R#L151
you can check or test it against the solution that assumes constant FOI with seroreversion:
constant_FOI_solution <- function(age, foi, mu) {
#foi * (1 - exp(-a * (foi + mu))) / (foi + mu) ## this line works the same as below ..
(foi/(foi + mu)) * (1 - exp(-a * (foi + mu)))
}
I have checked, and the constant FOI function works with and w/out seroreversion, i.e., mu == 0 and mu != 0, and gives the same output/simulations as the age FOI function when foi is a single fixed value ...
let me know if this makes sense - assuming that's what we were discussing ...
Adding the R code I was using to test the function(s) - might be helpful:
library(tidyverse)
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = 4)
simulate_foi_age_exact <- function(age, fois, chunks, mu) {
I <- 0
# solves ODE exactly within pieces
for(i in 1:age) {
lambda <- fois[chunks[i]]
I <- (1 / (lambda + mu)) * (lambda + exp(-(lambda + mu)) * (I * (lambda + mu) - lambda))
}
I
}
simulate_foi_age_exact2 <- function(age, fois, chunks, mu) {
I <- 0
# solves ODE exactly within pieces
for(i in 1:age) {
lambda <- fois[chunks[i]]
I <- (1 / (lambda + mu)) * exp(-(lambda + mu)) * (lambda * (exp(lambda + mu) - 1) + I * (lambda + mu))
}
I
}
# the above two functions seem to work the same - give the same output!
ages <- seq(1, 40, 1)
#-----------------------------------------------------------------------------------------------
# (1) Model without seroreversion
# generate synthetic data assuming we have data for all individuals aged between 1-n
mu <- 0.0 # no seroreversion
fois <- c(0.01, 0.05, 0.1, 0.03)
chunks <- unlist(map(seq(1, 4, 1), ~rep(., 10))) # how many FOIs to estimate
prob_infected_simulated <- map_dbl(ages, ~simulate_foi_age_exact(., fois, chunks, mu))
prob_infected_simulated2 <- map_dbl(ages, ~simulate_foi_age_exact2(., fois, chunks, mu))
tibble(age=ages, R_function=prob_infected_simulated, R_function2=prob_infected_simulated2) %>%
pivot_longer(-age) %>%
ggplot(aes(x=age, y=value, colour=name)) +
geom_line() + geom_point() + labs(y = "Probability infected") + guides(colour = "none") +
facet_wrap(.~ name)
# checking with constant FOI
constant_solution <- function(a, foi, mu) {
#foi * (1 - exp(-a * (foi + mu))) / (foi + mu) # Ben's solution ..
(foi/(foi + mu)) * (1 - exp(-a * (foi + mu))) # what's usually published
}
prob_infected_cst <- map_dbl(ages, ~constant_solution(., fois[1], mu))
fois <- rep(fois[1], length(ages)) # constant - for each age!
prob_infected_simulated <- map_dbl(ages, ~simulate_foi_age_exact(., fois, chunks, mu))
prob_infected_simulated2 <- map_dbl(ages, ~simulate_foi_age_exact2(., fois, chunks, mu))
tibble(age=ages, R_function=prob_infected_simulated, R_function2=prob_infected_simulated2,
cst_function=prob_infected_cst) %>%
pivot_longer(-age) %>%
ggplot(aes(x=age, y=value, colour=name)) +
geom_line() + geom_point() + labs(y = "Probability infected") + guides(colour = "none") +
facet_wrap(.~ name)
#-----------------------------------------------------------------------------------------------
#2) With seroreversion:
mu <- 0.1 # seroreversion
fois <- c(0.01, 0.05, 0.1, 0.03)
chunks <- unlist(map(seq(1, 4, 1), ~rep(., 10))) # how many FOIs to estimate
prob_infected_simulated <- map_dbl(ages, ~simulate_foi_age_exact(., fois, chunks, mu))
prob_infected_simulated2 <- map_dbl(ages, ~simulate_foi_age_exact2(., fois, chunks, mu))
tibble(age=ages, R_function=prob_infected_simulated, R_function2=prob_infected_simulated2) %>%
pivot_longer(-age) %>%
ggplot(aes(x=age, y=value, colour=name)) +
geom_line() + geom_point() + labs(y = "Probability infected") + guides(colour = "none") +
facet_wrap(.~ name)
# checking with constant FOI
constant_solution <- function(a, foi, mu) {
#foi * (1 - exp(-a * (foi + mu))) / (foi + mu) # Ben's solution ..
(foi/(foi + mu)) * (1 - exp(-a * (foi + mu))) # what's usually published
}
prob_infected_cst <- map_dbl(ages, ~constant_solution(., fois[1], mu))
fois <- rep(fois[1], length(ages)) # constant - for each age!
prob_infected_simulated <- map_dbl(ages, ~simulate_foi_age_exact(., fois, chunks, mu))
prob_infected_simulated2 <- map_dbl(ages, ~simulate_foi_age_exact2(., fois, chunks, mu))
tibble(age=ages, R_function=prob_infected_simulated, R_function2=prob_infected_simulated2,
cst_function=prob_infected_cst) %>%
pivot_longer(-age) %>%
ggplot(aes(x=age, y=value, colour=name)) +
geom_line() + geom_point() + labs(y = "Probability infected") + guides(colour = "none") +
facet_wrap(.~ name)
@ntorresd, @ben18785 - for this probability function:
you can check or test it against the solution that assumes constant FOI with seroreversion:
constant_FOI_solution <- function(age, foi, mu) { #foi * (1 - exp(-a * (foi + mu))) / (foi + mu) ## this line works the same as below .. (foi/(foi + mu)) * (1 - exp(-a * (foi + mu))) }
I have checked, and the constant FOI function works with and w/out seroreversion, i.e., mu == 0 and mu != 0, and gives the same output/simulations as the age FOI function when foi is a single fixed value ...
let me know if this makes sense - assuming that's what we were discussing ...
You're right, the functions work pretty well!
I added a valued based test based on your R code for large sample sizes. I also corrected the way we were testing all the data simulation functions, so I recommend you take a look at https://github.com/epiverse-trace/serofoi/pull/158/commits/5c72282c26de2397c5359dc43ad790decfd738e8 @ben18785 and @ekamau.
Right now I'm choosing the tolerance to compare the simulated seroprevalence values with the exact probabilities in a rather arbitrary way... I tried doing something similar to what I did for the modelling module (see https://github.com/epiverse-trace/serofoi/pull/153/commits/22244db2e6c5b7873112f33aab6bd3a2997f1c76), where I use as a tolerance half the size of the credible interval; I tried to use here the binomial confidence interval as a criteria, but weirdly enough the tests always passed for arbitrary sample size when I do it (which is nice, but is not very useful for testing), so I chose a value (tolerance = 10^-3
) that gives good results for a large sample size (10^7) and fails for small sample size (~100).
@ntorresd - thanks for making the changes. it looks good; no further comments at this stage.
This PR modifies
get_sim_probability
to allow for data simulation from age-varying FoI with seroreversion and includes seroreversion to the time-varying FoI option (@ben18785 please check if I'm computing this correctly).is_time_varying
andis_age_varying
can be used to specify whether age-varying or time-varying FoI is being used, with the time varying option beingTRUE
as default.TRUE
, the user is warned about the lack of time and age-varying data simulation functionalities and the age varying option is selected.FALSE
, the user is warned and the time-varying option is selected.mu
corresponds to the seroreversion rate. If set to 0 (default), there's no seroreversion.This PR closes #74 (I'll move the discussion about the age-varying models to a separate issue).