Closed sdwfrost closed 10 years ago
Hi @sdwfrost
In response to your first question: I really like the idea of the structured tree extension, and will definitely add this. I agree it's best to do this in MASTER; string processing in R is computationally expensive. In the mean-time, perhaps it would be better to use Python/perl to do the processing you're doing above? Or is it APE's parsing of the Newick files that's causing the slow-down?
Regarding your second: the closest related feature that currently exists in MASTER is the leafCountEndCondition, which stops the simulation once a particular number of leaves has been reached. It's currently fairly basic, however, and you can't specify a population that the leaves must belong to or a reaction that they must be related to.
Actual post-simulation conditioning on leaf count doesn't currently exist at all, unfortunately. However, there are a number of people asking questions about this (myself included!) so I definitely want to add this in before the next release.
Thanks for your suggestions!
Hi @tgvaughan
Thanks for considering the feature. I coded up the text processing in Python, which speeded things up considerably. I also have some code to plot output, plus some more example models. If you think these might be useful (once I've worked out all the kinks), I'd be happy to submit a pull request.
Hi @sdwfrost. Yes, definitely send me a pull request if you have some nice examples! I'd prefer to keep the plotting code outside of the main repository, though. (I have one little R script in there, but that's used in an online tutorial.) Maybe in time we can set up a "contributed utilities" repository though?
Hi again @sdwfrost, I've just put out a new release with a modified LineageSampler that I think should do what you want. I've also included a new LeafCountPostSimCondition that allows you to condition the simulation based on the leaf count. (This conditioning is performed after any sampling/filtering.)
I've updated the wiki page for LineageSampler to describe the new version, and there's a new LeafCountPostSimCondition page which can be reached from either of the InheritanceTrajectory or InheritanceEnsemble pages describing the new conditioning. Also, the following example input files (found in examples/
) might be helpful:
I think this closes the issue, but feel free to reopen it if this doesn't fully address your feature request.
Hi @tgvaughan , is it possible to sample a transmission tree at multiple timepoints using the inheritance post processor, e.g. if i want to sample 10 leaves at sampling times 10 and 20 each, and produce a tree with the sampled leaves and their ancestors?
<inheritancePostProcessor spec='LineageSampler'
nSamples="10"
**samplingTime="10,20"**
reverseTime="false"
noClean="false"/>
Dear @tgvaughan
This is more of a feature request, and hence can be overlooked! I've been looking at structured populations, and I'd like to sample specific numbers of lineages from different populations. I can do this by reading in the full tree, then sampling from that, but my R code to process MASTER trees is very slow (listed below, FYI)
Is it possible to extend LineageSampler to give conditions for the lineages of different populations?
A related issue would be to give conditions such that the number of lineages always exceeds the requested number of lineages. Is there a way to do this?