Closed jamesmbaazam closed 5 months ago
In likelihood
, it's part of a censoring limit, and a way to include the possibility of R>1 in the inference. See Blumberg & Lloyd-Smith, 2013 section 2.4:
To create a general model that incorporates the possibility of non-extinction when R0 > 1, a more formal approach would be to include an arbitrarily large censoring limit in the likelihood calculation for the maximum observable chain size.". Setting the chains with
>=stat_max
tostat_max
is done internally convenience to later allow identification of those chains, but the user never sees that this happens.
In simulate
, its meaning is the same (so I think it makes sense to have the same argument) but now the user is exposed to the results - they're set to Inf
to mean "this has been censored" but of course the same could perhaps be achieved by setting them to stat_max
, which the user has to set after all.
One way or the other, the Issue highlights that the documentation is not clear, which we should try to address. Perhaps it would make sense to leave the code as is but say in the doc that any outbreaks of size greater than stat_max
(or a renamed argument) will be treated as if it were of infinite size, and that this represents a censoring limit when doing a likelihood calculation?
The definition of
stat_max
insimulate_chains()
is not entirely correct and in particular, the part that says "Results abovestat_max
are set tostat_max
." I think this is a remnant from the original definition inbpmodels::chain_sim()
.stat_max
insimulate_chains()
is now the upper bound of the tracked statistic above which chain simulations are halted. The default ofInf
means that no upper bound is used. We could add further explanation and guidance to say that it is used internally to ensure that outbreaks with R0 > 1 do not continue forever, so when a user sets an offspring mean above 1 coupled with an infinite population, they need to choose a finite value. Furthermore,stat_max
insimulate_chains()
should probably be renamed to something likechain_sim_cutoff
to remove confusion between it andstat_max
insimulate_summary()
andlikelihood()
, which is the value above which stat is set toInf
in the former and set tostat_max
in the latter. Renaming it insimulate_chains()
will resolve #163.On the above point, we need to consider using a single definition & operation that can be inherited between
simulate_summary()
andlikelihood()
as these varying definitions might be a pointer to design considerations that need to be improved.@sbfnk Methodologically, what would be wrong with doing
stat_track[stat_track >= stat_max] <- stat_max
? insimulate_summry()
below?https://github.com/epiverse-trace/epichains/blob/0941451525900125a3dade95b1f8c317d71a4043/R/simulate.r#L414
to align with what's done in
likelihood()
?https://github.com/epiverse-trace/epichains/blob/0941451525900125a3dade95b1f8c317d71a4043/R/likelihood.R#L97