OpenTreeOfLife / otcetera

C++20 lib for manipulations of phylogenetic trees and supertree operations

Parent-offspring taxon reversal in synth #117

Closed: snacktavish closed this issue 1 year ago

snacktavish commented 1 year ago

From Chris Meacham on Gitter:

Here's a problem I noticed with the synthetic tree version 13.4. There are several cases where a taxon is shown on the tree as the immediate descendant of a taxon that it should be the immediate ancestor of.

I first noticed this with Equisetaceae, a plant family (horsetails), which is shown on the tree as the immediate ancestor of Equisetales, an order.

The odd thing is that the OTOL taxonomy shows this relationship correctly. See https://tree.opentreeoflife.org/taxonomy/browse?id=719032 where the ancestor of Equisetaceae is shown correctly as Equisetales. Now go to the current tree https://tree.opentreeoflife.org/opentree/argus/ottol@719032 where Equisetaceae is shown as the immediate ancestor of the order Equisetales. This is not correct.

Here is a list of the 14 places in synthetic tree 13.4 where a family is shown as the immediate ancestor of the order that it should be the immediate descendant of. The first column is the uid of the order. The second column is the name of the order, then the name of the family.

719029 Equisetales Equisetaceae
663734 Calobryales Haplomitriaceae
620612 Timmiales Timmiaceae
620617 Diphysciales Diphysciaceae
813592 Ahnfeltiales Ahnfeltiaceae
617750 Polymixiiformes Polymixiidae
819004 Heterodontiformes Heterodontidae
570650 Beroida Beroidae
19507 Tritirachiales Tritirachiaceae
423329 Mariprofundales Mariprofundaceae
622718 Elusimicrobiales Elusimicrobiaceae
974234 Anaerolineales Anaerolineaceae
818912 Ktedonobacterales Ktedonobacteraceae
1039775 Methanopyrales Methanopyraceae

There are also 9 places where a class is the descendant of the order that it should be the ancestor of.

235122 Andreaeopsida Andreaeales
758427 Takakiopsida Takakiales
235114 Sphagnopsida Sphagnales
622391 Gymnolaemata Cheilostomatida
1051895 Wallemiomycetes Wallemiales
283885 Chlorobia Chlorobiales
220320 Fibrobacteria Fibrobacterales
437903 Synergistia Synergistales
1005610 Dictyoglomia Dictyoglomales

And there is one place where a phylum is the descendant of the class that it should be the ancestor of.

5268016 Euglenida Euglenophyceae

Anyway, this should probably be looked into.

I had assumed that the taxonomy/taxon merging was somehow to blame, but these relationships are correct in the taxonomy and are getting reversed somewhere in synth.
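A check like the one described above (taxonomy correct, synth reversed) could be sketched roughly as follows. This is only an illustration, not how otcetera or the synthesis pipeline does it: `find_reversals` and the two child-to-parent dictionaries are hypothetical, and it assumes both trees have already been reduced to maps between named taxa keyed by OTT id.

```python
def find_reversals(taxonomy_parent, synth_parent):
    """Return (taxonomy_parent_id, taxonomy_child_id) pairs whose
    direction is flipped in the synth tree.

    Both arguments are hypothetical child -> parent maps keyed by OTT id.
    """
    reversed_pairs = []
    for child, parent in taxonomy_parent.items():
        # Taxonomy says parent > child. A reversal means that in synth
        # the taxonomy parent sits directly under the taxonomy child.
        if synth_parent.get(parent) == child:
            reversed_pairs.append((parent, child))
    return sorted(reversed_pairs)


# Toy data modelled on the Equisetales example from this issue.
taxonomy_parent = {
    "ott719032": "ott719029",   # Equisetaceae under Equisetales
    "ott719029": "ott7577693",  # Equisetales under Equisetidae
}
synth_parent = {
    "ott719029": "ott719032",   # Equisetales under Equisetaceae (reversed)
    "ott719032": "ott7577693",  # Equisetaceae under Equisetidae
}

print(find_reversals(taxonomy_parent, synth_parent))
```

On the toy data this reports the single pair (Equisetales, Equisetaceae). A real check would also have to decide how to treat unnamed nodes sitting between two named taxa in synth, which this sketch ignores.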

bredelings commented 1 year ago

OK, so I started looking into Equisetaceae (https://tree.opentreeoflife.org/taxonomy/browse?id=719032).

The local taxonomy looks like: Equisetidae (ott7577693) > Equisetales (ott719029) > Equisetaceae (ott719032), but the synth tree switches the order of the last two: https://tree.opentreeoflife.org/opentree/opentree13.4@ott719032/Equisetaceae

It turns out that Equisetidae (ott7577693) occurs in grafted_solution.tre, but the others do not. So presumably they need to be unpruned, and something goes wrong in otc-unprune-solution.
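For anyone wanting to repeat that check, here is a minimal sketch of testing which OTT ids occur in a Newick string. It assumes node labels contain tokens of the form `ott<digits>` (as in the ids quoted above); the helper names and the toy tree are made up for illustration, and the real grafted_solution.tre may need a proper Newick parser.

```python
import re

# Match ott<digits> only when it is not glued to a preceding letter or
# digit, so a composite label such as "mrcaott5ott6" is not counted as
# an occurrence of ott5 or ott6.
OTT_TOKEN = re.compile(r"(?<![A-Za-z0-9])ott\d+")


def ott_ids_in_newick(newick):
    """Collect every standalone ott<digits> token in a Newick string."""
    return set(OTT_TOKEN.findall(newick))


def missing_from_tree(newick, wanted):
    """Return the ids in `wanted` that do not appear in the tree, sorted."""
    present = ott_ids_in_newick(newick)
    return sorted(w for w in wanted if w not in present)


# Toy tree, not real data: only Equisetidae (ott7577693) is present.
toy_tree = "((ott1,ott2)ott7577693,ott3);"
print(missing_from_tree(toy_tree, ["ott7577693", "ott719029", "ott719032"]))
```

In practice the string would be read from the tree file instead of being written inline. Ids reported as missing are the ones that would have to be added back by the unpruning step.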

mtholder commented 1 year ago

Well, the good news is that the problem can be reproduced on a small case (rooting the synth at https://tree.opentreeoflife.org/taxonomy/browse?id=166292). It is a problem with otc-unprune-solution-and-name-unnamed-nodes, as otc-unprune-solution does not show the behavior.

bredelings commented 1 year ago

This should be fixed by @mtholder's patch.