tgvaughan / remaster

High-fidelity stochastic tree simulation for BEAST 2.
https://tgvaughan.github.io/remaster
GNU General Public License v3.0

Seeding simulations with multiple infections #12

Closed: CecileTK closed this issue 3 weeks ago

CecileTK commented 1 month ago

Hello Tim!

Thanks again for the great simulation tool!

I've been using remaster and feast to generate sequence alignments from an SIR-like epidemic. I was wondering whether there was a way to seed the epidemics in multiple individuals. It looked like it was possible in MASTER and you mention it in remaster's documentation:

maxRetries (default 10) This is the maximum number of times (default 10) that the simulator should attempt to produce a tree, in the event that the first simulated tree does not have exactly one root lineage. (Simulated trees may have more than one root lineage in the instance that not all sampled lineages find a common ancestor before the start of the simulation; a situation common for deterministic trajectories. It can also occur when the initial number of individuals in the sampled population is greater than 1.)

Would there be a way to seed the epidemic with multiple lineages? Or, if that is not possible, to seed the epidemic with the same lineage in multiple individuals?

Thank you! Cécile

tgvaughan commented 1 month ago

Dear Cécile,

You can for sure seed epidemics with multiple individuals, infected or not. To do this, simply change the initial values assigned to the population parameters. E.g.

<population id='I' spec='RealParameter' value='100'/>

should cause the "I" compartment to be initially seeded with 100 individuals.

What remaster can't do is generate trees with multiple roots. Thus, while you can for sure simulate trajectories with this setup, if the sampled tree lineages fail to find a common ancestor before the start of the process, remaster will produce an error, since the resulting ancestry can't be represented using a BEAST tree. (This is different to MASTER, which allowed simulations to produce networks etc., but that proved to be in many ways much more trouble than it was worth.)
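To illustrate why several seeds tend to give several roots, here is a minimal Python sketch. It is not remaster code and does not use remaster's API. It assumes a simple linear birth-death-sampling process as a stand-in for an SIR-like model; the function names and rate values are hypothetical, chosen only for illustration. Each individual carries the index of the seed it descends from, and the number of distinct seed indices among the sampled individuals equals the number of root lineages the sampled ancestry would have.

```python
import random


def simulate_founder_labels(n_seeds, birth_rate, death_rate, sample_rate,
                            max_time, rng):
    """Simulate a linear birth-death-sampling process from n_seeds individuals.

    Illustrative only: every living individual is tagged with the index of
    the seed it descends from. Returns the founder index of each sampled
    individual, in sampling order.
    """
    alive = list(range(n_seeds))   # founder label of each living individual
    sampled = []                   # founder labels of sampled individuals
    per_capita = birth_rate + death_rate + sample_rate
    t = 0.0

    while alive:
        # Waiting time to the next event (Gillespie algorithm).
        t += rng.expovariate(per_capita * len(alive))
        if t > max_time:
            break

        i = rng.randrange(len(alive))      # individual the event happens to
        u = rng.random() * per_capita      # which kind of event

        if u < birth_rate:
            # Birth: the child inherits its parent's founder label.
            alive.append(alive[i])
        else:
            if u >= birth_rate + death_rate:
                # Sampling: record the founder label before removal.
                sampled.append(alive[i])
            # Death or sampling both remove the individual (swap and pop).
            alive[i] = alive[-1]
            alive.pop()

    return sampled


def count_roots(sampled_founders):
    """Number of root lineages = number of distinct founders among samples."""
    return len(set(sampled_founders))


if __name__ == "__main__":
    rng = random.Random(42)
    for n_seeds in (1, 100):
        labels = simulate_founder_labels(n_seeds, 2.0, 1.0, 0.1, 3.0, rng)
        print(n_seeds, "seed(s):", len(labels), "samples,",
              count_roots(labels), "root lineage(s)")
```

With a single seed the root count can never exceed one. With 100 seeds, samples typically descend from many different founders, which is the situation in which the sampled lineages cannot be joined into a single tree.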

Hope this helps,

Tim

CecileTK commented 1 month ago

Thank you, that's helpful!

And just to confirm, does this mean that I can't generate a sequence alignment from such a simulation, as this would require building the tree? Or is there a way to generate sequences directly from trajectories?

tgvaughan commented 1 month ago

That's right - you need a tree to simulate a sequence alignment, thus the above approach wouldn't be sufficient.

If you did want to produce multiple trees from a single simulation, you could try to use the technique outlined here: define multiple samplePopulation elements, one for each starting individual, then define unique sampling reactions for the descendants of each. Finally, you'd use PrunedTree to log trees specific to each sample type. That would allow you to simulate distinct trees evolving in parallel within a single epidemic.

This approach might be a bit painful though, as the grammar wasn't designed with this use case in mind. The subsequent sequence simulation also requires you to answer difficult questions such as, "How are the individual starting sequences initialized?"