bodkan / slendr

Population genetic simulations in R 🌍
https://bodkan.net/slendr

Correctly handle "truncated" models #113

Closed bodkan closed 4 months ago

bodkan commented 2 years ago

In some cases, defining a slendr population that is only created after the time period set by the simulation_length = argument of compile_model() leads to issues with the msprime back end. These show up as obscure errors such as "negative times not valid" and "Input error in initialise: Attempt to sample a lineage from an inactive population". This is not a problem with slim(), because it is fine for a forward simulation to finish early. Handling this at the level of compile_model() is not ideal, but msprime() could be a bit smarter about preventing these issues by dropping the offending populations before the simulation is run.

To give an idea, here is a draft of unit tests, which turned out to be too strict in the end:

test_that("all populations exist within the time of the simulation run (forward)", {
  p1 <- population("p1", time = 1, N = 1000)
  p2 <- population("p2", time = 2000, N = 100, parent = p1)
  p3 <- population("p3", time = 6000, N = 3000, parent = p2)

  expect_error(model <- compile_model(populations = list(p1, p2, p3), generation_time = 1,
                                      simulation_length = 1000),
               "All populations must exist")
})

test_that("all populations exist within the time of the simulation run (backward)", {
  p1 <- population("p1", time = 1000, N = 1000)
  p2 <- population("p2", time = 200, N = 100, parent = p1)
  p3 <- population("p3", time = 100, N = 3000, parent = p2)

  expect_error(model <- compile_model(populations = list(p1, p2, p3), generation_time = 1,
                                      simulation_length = 500),
               "All populations must exist")
})
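
The idea of dropping the offending populations before the run, as suggested above, could look roughly like the following sketch. It is not slendr code: populations_within_run() is a hypothetical helper that works on a plain data frame with name and time columns instead of real slendr population objects, and it assumes that a forward run starts at the earliest population time while a backward run starts at the oldest one. How a population created exactly at the end of the run should be treated is also an assumption here (it is kept).

```r
# Hypothetical helper, NOT part of slendr: it filters a plain data frame
# (columns `name` and `time`) rather than slendr population objects.
populations_within_run <- function(pops, simulation_length,
                                   direction = c("forward", "backward")) {
  direction <- match.arg(direction)
  if (direction == "forward") {
    # Forward time: assume the run starts at the earliest population
    # and time counts up until start + simulation_length.
    end_time <- min(pops$time) + simulation_length
    keep <- pops$time <= end_time
  } else {
    # Backward time: assume the run starts at the oldest population
    # and time counts down until start - simulation_length.
    end_time <- max(pops$time) - simulation_length
    keep <- pops$time >= end_time
  }
  # Keep only populations created before the run ends
  pops[keep, , drop = FALSE]
}

# Creation times taken from the two draft tests above
forward <- data.frame(name = c("p1", "p2", "p3"), time = c(1, 2000, 6000))
backward <- data.frame(name = c("p1", "p2", "p3"), time = c(1000, 200, 100))

kept_forward <- populations_within_run(forward, 1000, "forward")
kept_backward <- populations_within_run(backward, 500, "backward")
```

With the times from the draft tests, only p1 falls inside the run in both cases, so p2 and p3 are the populations that would be dropped rather than triggering an error.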