epiverse-trace / quickfit

[SUSPENDED] Toolbox of model fitting helper functions
https://epiverse-trace.github.io/quickfit/
Other
2 stars 1 forks source link

Passing uncertainty between functions/packages #10

Open adamkucharski opened 1 year ago

adamkucharski commented 1 year ago

As well as calculating a profile likelihood, it would also be possible to numerically calculate the posterior probability for the 1 or 2 parameter likelihoods considered in quickfit.

A couple of potential use cases:

Then there's question of how best to store such a posterior, e.g. whether as a numerical vector, or perhaps as a fitted kernel density() object?

Some example code of an implementation that could align with existing functions in quickfit (Poisson has a conjugate prior, of course, so could make calculation simpler for this specific example, but idea is to show how might work for a general likelihood function and prior):

# Define parameter range
a <- 0
b <- 6
step_size <- 0.01

# Simulate Poisson data
set.seed(1)
sim_data <- rpois(10, 3)

# Define prior
prior_p1 <- function(x) { dnorm(x, 5, 1, log = F) } # normal distribution for lambda
prior_p2 <- function(x) { 1 } # uniform distribution

# Define likelihood
log_l <- function(x) {dpois(sim_data, x, log = TRUE) |> sum() }

# Construct posterior sample:
xx_grid <- seq(a,b,step_size)
pp_out <- sapply(xx_grid,log_l)

# Plot likelihood profile
plot(xx_grid,pp_out,ylab="log likelihood",xlab="lambda",type="l")

# Calculate priors numerically, normalised to give pdf
prior_1 <- prior_p1(xx_grid)/sum(prior_p1(xx_grid))
prior_2 <- prior_p2(xx_grid)/sum(prior_p2(xx_grid))

# Calculate posteriors
# P(x | data) = P(data | x)*P(x)/P(data)

posterior_p1 <- exp(pp_out)*prior_1; posterior_p1 <- posterior_p1/sum(posterior_p1)
posterior_p2 <- exp(pp_out)*prior_2; posterior_p2 <- posterior_p2/sum(posterior_p2)

# Plot posteriors
plot(xx_grid,posterior_p1,ylab="P(x | data)",xlab="lambda",type="l",col="orange")
lines(xx_grid,posterior_p2,col="blue")
lines(xx_grid,prior_p1(xx_grid)/sum(prior_p1(xx_grid)),lty=2,col="orange") # plot prior

# Confidence interval
ci_out <- xx_grid[which(pp_out>(max(pp_out)-1.92))]
ci_out <- c(xx_grid[which.max(pp_out)],min(ci_out),max(ci_out))
ci_out

# Credible interval
cr_out <- cumsum(posterior_p2)
c(xx_grid[which.max(posterior_p2)],max(xx_grid[which(cr_out<0.025)]),min(xx_grid[which(cr_out>0.975)]))

# Note that as simulation size increases, CI and CrI converge for uniform prior
adamkucharski commented 1 year ago

To give a tangible case study of where this kind of requirement has been needed in reality, early COVID scenario modelling collated a distribution of R0 from published estimates (see below), then simulated trajectories over a distribution of values for R0 (based on a summary parametric distribution) to provide uncertainty bounds for projections.

r0_estimates