matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

`rppr reclass` should suggest updated classifications #193

Closed matsen closed 12 years ago

matsen commented 12 years ago

This extends the work done in #171.

This will only be done with taxids that are less specific than a given rank. The default should be species, but this should be specifiable through a --max-rank flag.
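A minimal sketch of the rank filter described above, assuming "less specific" means strictly above the max rank in an ordered rank list. The `RANKS` list and the function names are hypothetical and only illustrate the idea; they are not taken from `rppr`.

```python
# Hypothetical rank ordering, most general first. The real ordering would
# come from the taxonomy in the reference package.
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def ranks_above(max_rank="species", ranks=RANKS):
    """Return the ranks strictly less specific than max_rank."""
    if max_rank not in ranks:
        raise ValueError(f"unknown rank: {max_rank}")
    return ranks[: ranks.index(max_rank)]


def eligible_taxids(taxid_ranks, max_rank="species", ranks=RANKS):
    """Keep only the taxids whose rank is less specific than max_rank.

    taxid_ranks maps a taxid to the name of its rank.
    """
    allowed = set(ranks_above(max_rank, ranks))
    return [taxid for taxid, rank in taxid_ranks.items() if rank in allowed]
```

With the default of species, a taxid at genus rank would be kept and a taxid at species rank would be skipped.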

For every rank that is less specific than the max-rank flag, for every leaf that is not convex at that rank,

habnabit commented 12 years ago

This has been implemented in the branch for #192.

matsen commented 12 years ago

This is totally cool, and seems to work in the examples I checked out. Nice.

It would be wonderful to have a -t flag that would write out the discordance tree at the level convexified. This "suggestion" tree would indicate the new name. I think that a useful and easy-to-implement way to go would be to append the suggested name onto the sequence id. Thus, if S001576771 gets reclassified to Pseudomonas monteilii, then the new sequence id would be

S001576771 -> Pseudomonas monteilii

This would make it easy to review the reclassifications.
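The relabeling could look roughly like the sketch below. The function names and the dictionary of suggestions are hypothetical; only the `id -> name` format comes from the example above.

```python
def suggestion_label(seq_id, suggested_name=None):
    """Append the suggested name to a sequence id, e.g. 'id -> name'.

    Leaves without a suggestion keep their original id.
    """
    if suggested_name is None:
        return seq_id
    return f"{seq_id} -> {suggested_name}"


def relabel_leaves(leaf_ids, suggestions):
    """Map each leaf id to its label in the suggestion tree.

    suggestions maps a sequence id to its suggested taxon name.
    """
    return {
        seq_id: suggestion_label(seq_id, suggestions.get(seq_id))
        for seq_id in leaf_ids
    }
```

The resulting labels contain spaces, so they would probably need quoting when written out in Newick format.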

matsen commented 12 years ago

And one more thing. If we could have `-` replace any non-normal float, that would be super. I.e., we shouldn't have any `-nan` in the avg distances.

Then the docs should explain why we don't do the calculation: comparing distances between ranks is not so meaningful.
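For the `-` substitution, a minimal sketch, assuming "non-normal" here means NaN or infinite. The function name and the number format are hypothetical.

```python
import math


def format_distance(value, placeholder="-"):
    """Format an average distance, using a placeholder when it is undefined.

    Assumption: a missing value (None), NaN, or infinity counts as undefined.
    """
    if value is None or not math.isfinite(value):
        return placeholder
    return f"{value:.6g}"
```

A NaN of either sign gets the placeholder, so `-nan` would never reach the output.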

matsen commented 12 years ago

This is going to be an ongoing project, but this one and its friend infer should go ahead and get merged into dev.