naturalis / supersmart

Self-Updating Platform for the Estimation of Rates of Speciation, Migration And Relationships of Taxa
MIT License

Clade trees having fewer than two exemplar species #88

Open hettling opened 9 years ago

hettling commented 9 years ago

In the primates example, we have two clades, Trachypithecus and Lepilemur, that are discarded when grafting onto the backbone, since each of their clade trees has only one exemplar.

The exemplars are most likely excluded during clademerge, when we build a graph connecting species by their respective markers and choose the largest connected subset of species.
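To illustrate how a species can drop out at this step, here is a minimal Python sketch of picking the largest connected subset of species in a species/marker graph. It is not the actual clademerge code: the function name, the species labels, and the marker names are all made up for the example, and it only assumes that two species count as connected when they share at least one marker.

```python
from collections import defaultdict


def largest_connected_species(markers_by_species):
    """Return the largest set of species linked through shared markers.

    markers_by_species is a hypothetical mapping: species name -> set of
    marker names for which that species has sequence data.
    """
    # Invert the mapping: marker -> species that have it.
    species_by_marker = defaultdict(set)
    for species, markers in markers_by_species.items():
        for marker in markers:
            species_by_marker[marker].add(species)

    seen = set()
    best = set()
    for start in markers_by_species:
        if start in seen:
            continue
        # Depth-first traversal collecting one connected component.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            component.add(current)
            for marker in markers_by_species[current]:
                for neighbour in species_by_marker[marker]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Made-up data: species_d shares no marker with the other three.
example = {
    "species_a": {"cytb", "coi"},
    "species_b": {"coi"},
    "species_c": {"cytb"},
    "species_d": {"12s"},
}
kept = largest_connected_species(example)
dropped = set(example) - kept
```

In this toy input, `species_d` is not connected to the others through any marker, so it ends up in `dropped`. An exemplar in that position would be lost in the same way.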

hettling commented 9 years ago

The cause is the parameters CLADE_MIN_DENSITY and CLADE_MIN_COVERAGE. For the species Trachypithecus auratus there are many alignments, but only one made the cut to be included in the clade during decomposition. Since CLADE_MIN_COVERAGE was 2, the species did not end up in the markers table of that clade.

We should emit a warning about that at the right moment, maybe during bbdecompose: keep track of the alignment count for each species within the clade while iterating over the alignments (alns_for_taxa), and warn if there are fewer than CLADE_MIN_COVERAGE.
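A rough Python sketch of the proposed check, for discussion only. The function name and the input shape (a list of alignments, each given as the list of taxa it contains) are assumptions made for the example and do not mirror the real bbdecompose or alns_for_taxa interfaces. Only the parameter name CLADE_MIN_COVERAGE comes from the configuration discussed above.

```python
import logging
from collections import Counter

logger = logging.getLogger("coverage_check")


def low_coverage_taxa(alignments, min_coverage):
    """Return {taxon: alignment count} for taxa below min_coverage.

    alignments is a hypothetical list in which each item lists the taxa
    present in one clade alignment.
    """
    counts = Counter()
    for taxa in alignments:
        # Count each taxon at most once per alignment.
        counts.update(set(taxa))

    low = {taxon: n for taxon, n in counts.items() if n < min_coverage}
    for taxon, n in sorted(low.items()):
        logger.warning(
            "taxon %s is in %d clade alignment(s), "
            "fewer than CLADE_MIN_COVERAGE=%d",
            taxon, n, min_coverage,
        )
    return low


# Made-up input: the exemplar appears in only one alignment.
alignments = [
    ["Trachypithecus auratus", "species_x", "species_y"],
    ["species_x", "species_y"],
]
low = low_coverage_taxa(alignments, min_coverage=2)
```

With a minimum coverage of 2, only the taxon that appears in a single alignment is reported. The other two taxa pass the check.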

hettling commented 9 years ago

Commit d3756aab791fa6f2c132a162276a198e23a78cb8 addresses this partly: we now warn if a taxon is in fewer clade alignments than CLADE_MIN_COVERAGE. However, it is still possible that a species is in enough alignments, but these are then merged together in smrt-clademerge, so that the species still does not make the cut. This is the case for exemplar Trachypithecus auratus.