cerebis / meta-sweeper

Parametric sweep of simulated microbial communities and metagenomic sequencing.
GNU General Public License v3.0

phylogenetic tree generation #31

Closed cerebis closed 8 years ago

cerebis commented 8 years ago

It would be nice if you could generate trees within the pipeline. To that end, I have been prototyping a solution using the DendroPy module.

koadman commented 8 years ago

agree, this could be nice. BTW, BEAST is really good at simulating trees from models ranging from simple birth/death or Yule processes to very complicated ones. One way to do it is to specify the model you want in BEAUti and give it an alignment that has just a single gap character for each sequence. Then, when the resulting .xml is run with BEAST, it will sample from the prior distribution, because gaps are treated as unknown data.
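
For illustration, here is a minimal sketch of how such a gap-only alignment could be produced. It assumes the alignment is handed to BEAUti as a NEXUS file; the function name `gap_only_nexus` and the taxon names are hypothetical, and nothing here is taken from the meta-sweeper code.

```python
def gap_only_nexus(taxa):
    """Return NEXUS text in which every taxon has a one-column sequence: a single gap."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("at least one taxon name is required")
    lines = [
        "#NEXUS",
        "begin data;",
        # One character per sequence, so nchar is always 1.
        f"  dimensions ntax={len(taxa)} nchar=1;",
        "  format datatype=dna missing=? gap=-;",
        "  matrix",
    ]
    # Each row is "<taxon name> -": the gap carries no information about the tree.
    lines += [f"    {name} -" for name in taxa]
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"
```

Calling `gap_only_nexus(["taxon_1", "taxon_2", "taxon_3"])` and writing the result to a file should give something that can be imported as the alignment, though the exact import behaviour depends on the BEAST version in use.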

cerebis commented 8 years ago

I implemented something using DendroPy. Lacking a good understanding of the options, I just chose the birth_death algorithm, but there are others.

Do any of them take your fancy? At least the birth-death (B&D) simulation doesn't require a sequence. I haven't looked into the others (I would assume discrete birth-death, dB&D, is similar though).

cerebis commented 8 years ago

http://dendropy.org/library/treesim.html
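
To make the idea concrete, below is a dependency-free sketch of the simplest process in this family: a pure-birth (Yule) tree, i.e. birth-death with a death rate of zero. It is not DendroPy's implementation and not the code used in meta-sweeper; the names `Node`, `yule_tree` and `to_newick` are made up for this example. It only shows that such a tree can be simulated from a rate and a tip count, with no sequence data.

```python
import random


class Node:
    """A tree node; `length` is the branch length leading to this node."""

    def __init__(self):
        self.name = None
        self.length = 0.0
        self.children = []


def yule_tree(num_tips, birth_rate=1.0, rng=None):
    """Simulate a pure-birth tree and stop once `num_tips` lineages exist."""
    if num_tips < 2:
        raise ValueError("need at least two tips")
    rng = rng or random.Random()
    root = Node()
    root.children = [Node(), Node()]
    tips = list(root.children)
    while True:
        # With k lineages, the time to the next split is exponential with rate k * birth_rate.
        wait = rng.expovariate(birth_rate * len(tips))
        for tip in tips:
            tip.length += wait
        if len(tips) == num_tips:
            break  # the last interval gives the newest tips non-zero branch lengths
        # A uniformly chosen lineage splits into two new tips.
        parent = tips.pop(rng.randrange(len(tips)))
        parent.children = [Node(), Node()]
        tips.extend(parent.children)
    for i, tip in enumerate(tips, start=1):
        tip.name = f"t{i}"
    return root


def to_newick(node, is_root=True):
    """Serialise the tree as Newick; the root carries no branch length."""
    if node.children:
        label = "(" + ",".join(to_newick(c, False) for c in node.children) + ")"
    else:
        label = node.name
    return label + ";" if is_root else f"{label}:{node.length:.6f}"
```

For example, `to_newick(yule_tree(10, birth_rate=1.0, rng=random.Random(1)))` returns a reproducible Newick string. In the pipeline itself, a call into DendroPy's treesim module would take the place of `yule_tree`.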

koadman commented 8 years ago

hard to beat the convenience of a python API call! that collection of tree distributions should be plenty until we start getting into some really hard and focused questions

koadman commented 8 years ago

by hard and focused, i mean looking at things like population demographic histories with models like those in BEAST (e.g. skyline, skyride etc), where direct simulation from the prior may not be so easy. gah, 3am here. i should be sleeping.

cerebis commented 8 years ago

That's the sort of detailed requirement I too thought would be better deferred until it is actually needed. So long as the workflow supports the idea of tree generation, it shouldn't be difficult to substitute tools.

Another recent publication:

Mallo, D., De Oliveira Martins, L., & Posada, D. (2016). SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees. Syst Biol, 65(2), 334–344. http://doi.org/10.1093/sysbio/syv082

But I expect BEAST is preferable, being so well established.

cerebis commented 8 years ago

Reopening this until it is integrated into the sweep. Integration overlaps significantly with issue #28.

cerebis commented 8 years ago

I have added star generation in commit 5b653e3d14dc8f7ce0be57dd308d853f0afab6df. This should eliminate the need for explicit trees as files.
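
As a rough sketch of why no tree file is needed, and assuming "star generation" here means a star phylogeny (every taxon attached directly to the root), such a tree can be built from nothing but a list of names and a branch length. The function `star_newick` is hypothetical and is not the code from the commit above.

```python
def star_newick(taxa, branch_length=1.0):
    """Return a Newick star tree: all taxa hang off the root with equal branch lengths."""
    taxa = list(taxa)
    if len(taxa) < 2:
        raise ValueError("a star tree needs at least two taxa")
    # Every tip is written as "<name>:<length>" directly under the root.
    tips = ",".join(f"{name}:{branch_length}" for name in taxa)
    return f"({tips});"
```

For example, `star_newick(["a", "b", "c"], 0.5)` returns `(a:0.5,b:0.5,c:0.5);`.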