bayesiandemography / bage

https://bayesiandemography.github.io/bage/
Other
2 stars 0 forks source link

Clarify report_sim() subscript error #51

Closed Olliepike closed 1 month ago

Olliepike commented 1 month ago

Hi John, I written some report_sim() examples below that demonstrate the 'subscript out of bounds' error that I encounter. It is very possible that I'm misusing the function, but there potentially could be an issue. Please could you clarify if I am using the function incorrectly?

Working example found in package documentation

library(bage)
data <- data.frame(region = c("A", "B", "C", "D", "E"),
                   population = c(100, 200, 300, 400, 500),
                   deaths = NA)

## simulation with estimation model same as data-generating model
mod_est <- mod_pois(deaths ~ region,
                    data = data,
                    exposure = population) |>
  set_prior(`(Intercept)` ~ Known(0))

report_sim(mod_est = mod_est,
           n_sim = 10) 

Issue example

I understand that report_sim() is an extension of replicate_data(). It can simulate data from the prior predictive distribution of an unfitted model, and also simulate data from the posterior predictive distribution of a fitted model. The documentation only provides an example of an unfitted model simulation so my understanding could be wrong.

A simple example below works for an unfitted model, but fails when simulating data from the fitted model.

Using an unfitted model - works

I understand this to be a way carrying out Prior Predictive Checks - ie assessing whether a model and prior assumptions are reasonable before fitted to data.

data(deaths)

mod_deaths_unfitted <- mod_pois(deaths ~ age,
                                data = deaths,
                                exposure = ~ ifelse(deaths > 0 & popn == 0, 0.5, popn)) %>%
  set_prior(age ~ RW(s = 1))

report_sim(mod_est = mod_deaths_unfitted, n_sim = 10)

Using a fitted model - fails

Using a fitted model fails for n_sim > 1, resulting in a 'subscript out of bounds' error. This result occurs whether set_n_draw() is omitted or included.

mod_deaths_fitted <- mod_deaths_unfitted %>%
  #set_n_draw(10) %>%
  fit()

report_sim(mod_est = mod_deaths_fitted, n_sim = 10)

Using a 'step1.R' fitted model - fails

You previously shared a 'step1.r' script showcasing the replicate_data() function. Using the births model from that script as an example, the model works for replicate_data(), but likewise fails with report_sim(), resulting in the same 'subscript out of bounds' error.

# Obtain raw data

data("synthetic_greater_manchester_data")

la_mapping <- synthetic_greater_manchester_data$mye %>%
  select(la_code = region, la = laname23) %>%
  unique()

spd <- synthetic_greater_manchester_data$spd %>%
  mutate(age = set_age_open(age, lower = 85),
         age = reformat_age(age)) %>%
  count(age, sex, la = laname23, time, wt = sim_count, name = "spd")

births <- synthetic_greater_manchester_data$births %>%
  inner_join(la_mapping, by = c(region = "la_code")) %>%
  mutate(age = reformat_age(age)) %>%
  count(age, sex, la, time, wt = count, name = "births")

# Births

popn_births <- spd %>%
  filter(age %in% births$age,
         sex == "Female") %>%
  mutate(sex = NULL)

data_births <- inner_join(births, popn_births, by = c("age", "la", "time"))

mod_births <- mod_pois(births ~ age * la + sex + la * time,
                       data = data_births,
                       exposure = spd) %>%
  set_prior(age ~ Sp()) %>%
  set_prior(time ~ RW2()) %>%
  fit()

# Replicate data - works
replicate_data_births <- replicate_data(mod_births, n = 9)

# Simulation report - fails
sim_report <- report_sim(mod_est = mod_births, n_sim = 9)
sim_report
johnrbryant commented 1 month ago

Thanks Ollie. You have discovered a bug in the code: report_sim() is not working properly when mod_est or mod_sim are fitted models. I will fix the bug (by 'unfitting' the models at the start of report_sim(), and also add a vignette clarifying the aims of report_sim(), and spelling out the difference from replicate_data().

johnrbryant commented 1 month ago

I have fixed the bug where report_sim() failed if mod_est or mod_sim had been fitted.

I have also added a vignette (briefly) explaining how to use report_sim().