magikker / TreeHouse-Private

TreeHouse development.
GNU General Public License v3.0

Build a k-tet consensus tree. #32

Open magikker opened 11 years ago

magikker commented 11 years ago

If we think of a bipartition as defining the relationship between all taxa in a tree over an edge, and a quartet as defining the relationship of 4 taxa over an edge, then we can think of a k-tet as defining the relationship between k taxa over an edge.
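One way to make this concrete, as a rough sketch only: treat a k-tet as what is left of a bipartition once you keep just k of the taxa. The function name `restrict_split` and the choice to represent a split as a pair of taxon sets are assumptions made for illustration, not anything that exists in TreeHouse.

```python
def restrict_split(split, taxa_subset):
    """Restrict a bipartition to a subset of taxa (hypothetical helper).

    `split` is a pair of taxon collections, one per side of the edge.
    Returns the induced split as a frozenset of two frozensets, or None
    when fewer than two of the kept taxa land on either side (such a
    split says nothing about the kept taxa).
    """
    side_a, side_b = split
    subset = frozenset(taxa_subset)
    a = frozenset(side_a) & subset
    b = frozenset(side_b) & subset
    if len(a) < 2 or len(b) < 2:
        return None
    return frozenset({a, b})


# A bipartition on six taxa, then the quartet and 5-tet it induces.
bipartition = ({"A", "B", "C"}, {"D", "E", "F"})
quartet = restrict_split(bipartition, {"A", "B", "D", "E"})
five_tet = restrict_split(bipartition, {"A", "B", "C", "D", "E"})
uninformative = restrict_split(bipartition, {"A", "B", "C", "D"})
```

Here `quartet` is the familiar AB|DE, `five_tet` is ABC|DE, and `uninformative` is `None` because only one kept taxon sits on the far side of the edge.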

If we can expose the k-tets (quartet, 5-tet, 6-tet, ..., (n-1)-tet) in a set of trees, find the ones that are in common among the set, and build a tree out of those common k-tets, we would have a consensus algorithm that is robust to a few misplaced taxa.

Since bipartitions define the placement of all taxa relative to an edge in a tree, they aren't very robust to misplaced taxa. For instance, there's a good example in the literature that takes a tree, makes a copy, swaps two of the 10 or so taxa in the copy, and ends up with two trees that have no bipartitions in common, even though most of their internal structure is similar. This is often seen as an issue when computing the RF distance between trees, as bipartitions aren't always robust to small changes, but it can be an issue in consensus as well.
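The effect is easy to reproduce. The sketch below is not the example from the literature; it assumes a 10-taxon caterpillar (ladder) tree and swaps the two end taxa, which is one case where the swap wipes out every shared bipartition. The helper name and the taxon labels `t0`..`t9` are made up for illustration.

```python
def caterpillar_bipartitions(order):
    """Nontrivial bipartitions of the caterpillar tree whose leaves
    appear along the backbone in `order` (hypothetical helper)."""
    taxa = frozenset(order)
    splits = set()
    # Each internal edge separates a prefix of size 2..n-2 from the rest.
    for i in range(2, len(order) - 1):
        side = frozenset(order[:i])
        splits.add(frozenset({side, taxa - side}))
    return splits


original = [f"t{i}" for i in range(10)]
swapped = original[:]
swapped[0], swapped[-1] = swapped[-1], swapped[0]  # swap t0 and t9

a = caterpillar_bipartitions(original)
b = caterpillar_bipartitions(swapped)
shared = a & b              # bipartitions present in both trees
rf_distance = len(a ^ b)    # symmetric difference, i.e. the RF distance
```

Each tree has 7 nontrivial bipartitions, `shared` comes out empty, and `rf_distance` is 14, the maximum possible for two 10-taxon trees, even though the arrangement of `t1`..`t8` is identical in both.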

Quartets are really robust to small changes. If you just move a few taxa around, it won't disrupt many quartets. This is one of the reasons people like quartets. The problem is that quartets are slow... really slow... there are a lot of them to deal with (n choose 4 for a tree on n taxa).
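Both halves of that trade-off can be checked on a toy case. The sketch below again assumes a 10-taxon caterpillar tree with its two end taxa swapped (an illustrative setup, with hypothetical helper names), counts the quartets the two trees share, and shows how fast the quartet count grows.

```python
from itertools import combinations
from math import comb


def caterpillar_splits(order):
    """One side of each nontrivial split of the caterpillar tree on `order`."""
    return [frozenset(order[:i]) for i in range(2, len(order) - 1)]


def quartets(order):
    """Quartet topologies displayed by the caterpillar tree on `order`."""
    splits = caterpillar_splits(order)
    found = set()
    for four in combinations(order, 4):
        four = frozenset(four)
        for side in splits:
            inside = four & side
            if len(inside) == 2:  # this edge separates the four taxa 2|2
                found.add(frozenset({inside, four - inside}))
                break
    return found


original = [f"t{i}" for i in range(10)]
swapped = original[:]
swapped[0], swapped[-1] = swapped[-1], swapped[0]  # swap t0 and t9

shared = quartets(original) & quartets(swapped)
total = comb(len(original), 4)   # quartets per 10-taxon tree
big_tree_quartets = comb(100, 4)  # quartets in a 100-taxon tree
big_tree_edges = 100 - 3          # nontrivial bipartitions in a binary 100-taxon tree
```

In this deliberately extreme swap, 70 of the 210 quartets survive, where no bipartition would. Every quartet that avoids both moved taxa is untouched, so the surviving share grows with tree size. The cost shows up in the counts: a binary tree on 100 taxa has 97 nontrivial bipartitions but 3,921,225 quartets.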

But what if we could look at relationships that define the placement of almost all the taxa? Things that are almost bipartitions, but not quite. If we could find the common ones and build a consensus tree out of them, we'd have a consensus method that's probably a lot faster than a quartet consensus but more robust than a bipartition consensus.
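A brute-force sketch of the "find the common ones" step, under heavy assumptions: trees are 10-taxon caterpillars described by their leaf order, a k-tet is taken to be a bipartition restricted to k taxa, and every k-subset is enumerated, which would not scale and is only meant to show the idea. The helper names `k_tets` and `common_k_tets` are hypothetical, and assembling a tree from the shared k-tets is not shown.

```python
from itertools import combinations


def caterpillar_splits(order):
    """One side of each nontrivial split of the caterpillar tree on `order`."""
    return [frozenset(order[:i]) for i in range(2, len(order) - 1)]


def k_tets(order, k):
    """Every informative split induced on every k-taxon subset."""
    splits = caterpillar_splits(order)
    found = set()
    for subset in combinations(sorted(order), k):
        subset = frozenset(subset)
        for side in splits:
            a, b = subset & side, subset - side
            if len(a) >= 2 and len(b) >= 2:  # skip uninformative splits
                found.add(frozenset({a, b}))
    return found


def common_k_tets(orders, k):
    """k-tets displayed by every tree in the collection."""
    return set.intersection(*(k_tets(order, k) for order in orders))


original = [f"t{i}" for i in range(10)]
swapped = original[:]
swapped[0], swapped[-1] = swapped[-1], swapped[0]  # swap t0 and t9

shared_9 = common_k_tets([original, swapped], 9)  # (n-1)-tets
shared_8 = common_k_tets([original, swapped], 8)  # (n-2)-tets
```

With two misplaced taxa, the two trees still share no 9-tets, but once k drops to 8 the splits that ignore `t0` and `t9` (for example `t1 t2 | t3 ... t8`) appear in both, which is the kind of almost-bipartition a k-tet consensus could be built from.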