microbiome / mia

Microbiome analysis
https://microbiome.github.io/mia/

Merging trees #569

Closed: TuomasBorman closed this 2 days ago

TuomasBorman commented 4 weeks ago

https://github.com/microbiome/mia/blob/5ed79efaf889f9927841d3d2d846dc4ccafa5ae8/R/splitOn.R#L388

Instead of combining trees, this function calculates a new tree or adds the tree from a subset. This is not how it should work: it should combine the trees into a single tree.

Also, mergeSEs is not working optimally: https://github.com/microbiome/mia/blob/5ed79efaf889f9927841d3d2d846dc4ccafa5ae8/R/mergeSEs.R#L472

It finds the minimum subset of trees that can represent the whole data. For instance, if there are 10 trees but the rows can be represented with 2 of them, it adds those two trees to the merged object. However, I think it should merge these trees into a single large tree.

Sometimes a TreeSE object has unique taxa that are found only in that object. This means that if we merge 10 TreeSEs, the output has 10 trees. This is not optimal.

Suggestion:

  1. Investigate the capabilities of ape::bind.tree.
    • Does it bind whole trees, i.e., if two trees both have 10 tips, is the number of tips in the output always 20?
    • The trees being merged can also have overlapping taxa, which means that such taxa should appear in the output only once.
  2. Create a general internal function that takes multiple trees (and links/rownames) as input.
  3. The function combines the set of trees so that each row can be found in the resulting tree.
  4. The function outputs a single tree.
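
A minimal sketch of what steps 1 to 4 could look like, assuming the trees are plain `phylo` objects without branch lengths. The helper name `.merge_trees_sketch` is hypothetical and not part of mia. It relies on `ape::bind.tree` grafting the second tree onto the root of the first without deduplicating tip labels, so repeated labels are dropped afterwards with `ape::drop.tip`. Which copy of an overlapping taxon is kept is arbitrary here, and conflicting topologies between trees are not reconciled.

```r
library(ape)

# Hypothetical helper (not part of mia): bind a list of phylo objects at
# their roots and keep a single copy of each tip label.
.merge_trees_sketch <- function(trees) {
    # Graft each subsequent tree onto the root of the accumulated tree
    merged <- Reduce(function(x, y) bind.tree(x, y, where = "root"), trees)
    # bind.tree() keeps repeated labels, so drop the repeats by tip index
    dup <- which(duplicated(merged$tip.label))
    if (length(dup) > 0) {
        merged <- drop.tip(merged, dup)
    }
    merged
}

# Two small example trees that share the taxon "C"
tree1 <- read.tree(text = "((A,B),C);")
tree2 <- read.tree(text = "((C,D),E);")

# Plain binding keeps every tip from both trees
bound <- bind.tree(tree1, tree2, where = "root")
Ntip(bound)              # 6: "C" appears twice

# The sketch returns one tree with each taxon present once
merged <- .merge_trees_sketch(list(tree1, tree2))
sort(merged$tip.label)   # "A" "B" "C" "D" "E"
```

In a real implementation the links/rownames would also need to be remapped to the node labels of the merged tree, which this sketch leaves out.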

TuomasBorman commented 4 weeks ago

This is partly a duplicate. I forgot about this issue, which I opened a couple of weeks ago: https://github.com/microbiome/mia/issues/558#issuecomment-2151943425