epinowcast / epidist

An R package for estimating epidemiological delay distributions
http://epidist.epinowcast.org/
Other
9 stars 3 forks source link

Think more cafefully about custom families #131

Open athowes opened 1 week ago

athowes commented 1 week ago

Currently epidist_family for the epidist_latent_individual is defined as follows

#' @method epidist_family epidist_latent_individual
#' @family latent_individual
#' @export
epidist_family.epidist_latent_individual <- function(data, family = "lognormal",
                                                     ...) {
  brms::custom_family(
    paste0("latent_", family),
    dpars = c("mu", "sigma"),
    links = c("identity", "log"),
    lb = c(NA, 0),
    ub = c(NA, NA),
    type = "real",
    vars = c("pwindow", "swindow", "vreal1"),
    loop = FALSE
  )
}

In working on #114 I've been trying to extract out the posteriors for the gamma distribution so that they match the input simulation settings. I've not found this immediately easy to do and I am suspecting there may be complications we are not currently taking into account. By not find this easy I mean I've not been able to find a transformation of the samples which gets the simulated settings back. This could either be that the model isn't getting the right answer or that there is some issue with link functions etc.

I think that both the parameters of the gamma distribution have domain $\mathbb{R}^+$ i.e. non-negative In this code we hard code in that the link functions are links = c("identity", "log"). This seems questionable (to have identity link for something non-negative. I suspect this is causing problems with the sampler. As a solution we might want to enable users to have more control of the link functions.

While I'm unsure about this, I'm going to remove the gamma parameter recovery integration test from #114:

test_that("epidist.epidist_latent_individual recovers the simulation settings for the delay distribution in the gamma delay case", { # nolint: line_length_linter.
  fit_gamma <- epidist(
    data = prep_obs_gamma,
    family = epidist_family(prep_obs_gamma, family = "gamma"),
  )
  draws <- posterior::as_draws_df(fit_gamma$fit)
  # draws$Intercept
  # draws$Intercept_sigma
})

As a part of a solution to this issue, I think we should add it back (passing).

athowes commented 1 week ago

I think here I also need to understand the parameterisation of the gamma distribution that brms uses.

Sam suggests it's what Stan does.

athowes commented 1 week ago

Furthermore, this test fails with 10% divergent transitions:

test_that("epidist.epidist_latent_individual fits and the MCMC converges for a gamma delay distribution", { # nolint: line_length_linter.
  fit_gamma <- epidist(
    data = prep_obs_gamma,
    family = epidist_family(prep_obs_gamma, family = "gamma"),
  )
  expect_s3_class(fit_gamma, "brmsfit")
  expect_s3_class(fit_gamma, "epidist_fit")
  expect_convergence(fit_gamma)
})

Likely pointing at a bigger issue with the gamma implementation that I think should be fixed here.

athowes commented 1 week ago