epiverse-trace / epichains

[Under active development] Methods for simulating and analysing the sizes and lengths of infectious disease transmission chains from branching process models
https://epiverse-trace.github.io/epichains/

Interpret offspring distribution as degree distribution #204

Open sbfnk opened 6 months ago

sbfnk commented 6 months ago

The branching process model assumes that an individual's probability of becoming infected is independent of the number of offspring they subsequently produce. This is unrealistic if the branching process is driven by heterogeneity in contacts. An offspring distribution that assumes a random infinite network can be derived from the uncorrected distribution by re-weighting the probabilities by the number of contacts (accounting for the initial contact), i.e. the probability of having $n$ offspring is proportional to $(n+1)\,p(n+1)$, where $p(n)$ is the probability of having $n$ contacts.
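
Written out with its normalising constant, the re-weighting would read as follows. The symbols $q(n)$ for the re-weighted (excess degree) distribution and $\langle k \rangle$ for the mean number of contacts are notation introduced here for illustration:

$$
q(n) = \frac{(n+1)\,p(n+1)}{\sum_{k \ge 1} k\,p(k)} = \frac{(n+1)\,p(n+1)}{\langle k \rangle}, \qquad n = 0, 1, 2, \dots
$$

As a quick check, if the number of contacts is Poisson with mean $\lambda$, then

$$
q(n) = \frac{(n+1)\,e^{-\lambda}\lambda^{n+1}/(n+1)!}{\lambda} = \frac{e^{-\lambda}\lambda^{n}}{n!},
$$

so the re-weighted distribution is again Poisson with mean $\lambda$. For more dispersed contact distributions the re-weighting shifts mass towards larger $n$.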

We could add a function that translates probability distributions interpreted as offspring distributions into probability distributions interpreted as contact distributions in this way.

If implementing this, we could then potentially also use the analytical likelihood given in https://doi.org/10.1371/journal.ppat.1004452 (1.4 in the supplement).

Related to https://github.com/epiverse-trace/simulist/issues/35

sbfnk commented 6 months ago

If going with this we'd probably also want a parameter for probability of infection (such that each "offspring" doesn't necessarily get infected), in the same way as already implemented for depletion of susceptibles. We'd probably also want to keep track of uninfected contacts in the simulation code.

joshwlambert commented 5 months ago

> We could add a function that translates probability distributions interpreted as offspring distributions into probability distributions interpreted as contact distributions in this way.

Could you outline how you would do this? We’re having some discussion on the {simulist} side about reverting to using the offspring distribution for the simulation function, while keeping the network effects that were discussed in {simulist} issue epiverse-trace/simulist#35.

sbfnk commented 5 months ago

That is a good question - I proposed this with the original workflow in mind, where we'd know the probability masses that correspond to a given degree distribution. Since https://github.com/epiverse-trace/epichains/pull/188 we don't have this any more and only have the random sampler, so I think the only way to do this in a general manner would be via Monte Carlo sampling, i.e. the random sampler for excess degree (with argument n for the number of random excess degrees to generate) given a random sampler for degrees would be a function that does:
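
A minimal sketch of one such function, written in Python for illustration (the package itself is R) and not necessarily what was originally intended. It assumes the degree sampler takes a count and returns that many non-negative integers; the names `make_excess_degree_sampler`, `rdegree`, `rexcess` and `pool_size` are hypothetical. The idea is to draw a large pool of degrees, resample from it with weights proportional to degree, and subtract one for the contact along which the individual was reached:

```python
import random


def make_excess_degree_sampler(rdegree, pool_size=100_000, seed=None):
    """Build a random sampler for excess degrees from a sampler for degrees.

    rdegree(n) is assumed to return a list of n non-negative integer degrees.
    pool_size controls the Monte Carlo accuracy (hypothetical default).
    """
    rng = random.Random(seed)
    # Individuals with zero contacts can never be reached, so drop them.
    pool = [d for d in rdegree(pool_size) if d > 0]
    if not pool:
        raise ValueError("degree sampler returned no positive degrees")

    def rexcess(n):
        # Size-biased resampling: an individual is reached with probability
        # proportional to their degree.
        picks = rng.choices(pool, weights=pool, k=n)
        # One contact is the one the individual was reached through.
        return [d - 1 for d in picks]

    return rexcess


# Usage: half the population has 1 contact, half has 3 contacts.
rexcess = make_excess_degree_sampler(lambda n: [1, 3] * (n // 2), seed=1)
draws = rexcess(10)  # each draw is 0 or 2, with 2 about three times as likely
```

The accuracy of the tail depends on the pool size, since degrees that never appear in the pool can never be drawn.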

The number of offspring could then be generated by binomial sampling from the random excess degrees with probability of infection.
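
A hedged sketch of that thinning step, again in Python for illustration. The function name `thin_contacts` and its arguments are hypothetical; it also returns the number of contacts who were not infected, in case those need to be tracked:

```python
import random


def thin_contacts(excess_degrees, prob_infect, rng=None):
    """Binomially thin contacts into infected offspring.

    Returns one (offspring, uninfected) pair per excess degree, where
    offspring ~ Binomial(excess degree, prob_infect).
    """
    if not 0.0 <= prob_infect <= 1.0:
        raise ValueError("prob_infect must be between 0 and 1")
    rng = rng or random.Random()
    result = []
    for k in excess_degrees:
        # Each of the k contacts is infected independently.
        infected = sum(rng.random() < prob_infect for _ in range(k))
        result.append((infected, k - infected))
    return result


# Usage: with prob_infect = 1 every contact becomes an offspring, so the
# excess degree distribution is itself the offspring distribution.
pairs = thin_contacts([2, 0, 5], prob_infect=1.0)
```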

It might make more sense to explain how this could be achieved given a specific distribution than to try and solve the general problem.

joshwlambert commented 5 months ago

Thanks, if I follow correctly it is very similar to the current implementation in {simulist}, with perhaps the exception that it is more generalised to any arbitrary degree distribution?

In terms of what the user specifies and the function signature, would you always ask the user to give the excess degree distribution and the probability of infection, and then document that if the probability is 1 it is equivalent to the offspring distribution? Or are you thinking of making the function signature more complex to expose both cases to the user (i.e. specify either the contact/excess degree distribution or the offspring distribution)?