Open petrelharp opened 6 years ago
this was from a question by @molpopgen; tagging @jeromekelleher for future thoughts - not sure if this is the right repo to put this in. =)
This looks good @petrelharp, and the code chunk is clear. I don't understand the motivation though --- what do you mean by retaining substitutions, and how does forcing a grand MRCA help you with this?
Some questions require tracking fixations, which occur prior to the MRCA of each marginal tree. Keeping the root as a sample will retain that extra branch from MRCA to the initial root.
On Mon, Jan 8, 2018, 1:21 AM Jerome Kelleher notifications@github.com wrote:
This looks good @petrelharp https://github.com/petrelharp, and the code chunk is clear. I don't understand the motivation though --- what do you mean by retaining substitutions, and how does forcing a grand MRCA help you with this?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ashander/ftprime/issues/59#issuecomment-355915751, or mute the thread https://github.com/notifications/unsubscribe-auth/AGHnH1zHUDKEToFyC5Rxv2yPjtsLKuUCks5tId3PgaJpZM4RVNp2 .
--
Kevin Thornton
Associate Professor
Ecology and Evolutionary Biology
UC Irvine
Some questions require tracking fixations, which occur prior to the MRCA of each marginal tree. Keeping the root as a sample will retain that extra branch from MRCA to the initial root.
I see... So the goal is to retain information about when the fixation happened (plus other information?) as well as keeping the site (even though it's fixed?). If all we want to do is keep the site, then we can do this by just adding the filter_zero_mutation_sites=False
flag to simplify. Perhaps we could then add some metadata to the site recording when fixation occured, and whatever else that is of interest (I'm assuming the forward time simulator will be aware of this)?
I think I'm missing something though...
If all we want to do is keep the site, then we can do this by just adding the filter_zero_mutation_sites=False flag to simplify.
Wouldn't this result in retaining both fixed and extinct variants?
The motivation was to be able to do a MK test, which involves the number of substitutions. There could be a special flag for this, or way of recording it, but I think just keeping the root around as a "sample" would be a fairly efficient way to do it: wouldn't the only downside be an increasing number of rows in the Site and Mutation tables that need to get passed through simplify? Non-optimal, but totally do-able.
I don' tthink this is urgent, btw; it just came up and I wrote down how to do it.
Wouldn't this result in retaining both fixed and extinct variants?
Yes, yes it would...
I don' tthink this is urgent, btw; it just came up and I wrote down how to do it.
OK, cool. I think I'm just being slow. The code looks good to me anyway.
Yes, yes it would...
"The new algorithm efficiently fills all available RAM in linear time."
Sometimes (eg for the MK test) it'd be good to retain substitutions. To do this, we'd need to establish a root, mark it as a sample, and keep it in through simplification. I'm recording how to do this here; perhaps it should be a vignette somewhere.
... now, we append to the tables, and next time we simplify, we just need to add
[current_root]
to the list of samples. And repeat.This doesn't deal with mutations, of course. If we add mutations after adding the root, then all is good. If we have already got mutations on the initial tree sequence, then we'd need to add mutations to the new edges we've just added that go back to that root.