Open cmungall opened 9 years ago
Sorry for forgetting about this issue.
All annotations with HOM:0000007 are synapomorphies for the sub-taxa of the annotated taxon.
It should now be easy to extract homoplasies. From the "ancestral_taxa" file, any Uberon term appearing more than once in this file is a putative homoplasy: https://github.com/BgeeDB/anatomical-similarity-annotations/blob/master/release/ancestral_taxa_homology_annotations.tsv
E.g., we have:
UBERON:0000988 pons - 8782 Aves
UBERON:0000988 pons - 40674 Mammalia
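A minimal sketch of that extraction, in case it helps. The column names `entity` and `taxon ID` are assumptions, so check them against the real header of the TSV. The sample rows are the pons example plus one made-up row, not a copy of the release file. It counts distinct taxa per term rather than raw row counts, so an exact duplicate row would not be flagged.

```python
import csv
import io
from collections import defaultdict


def putative_homoplasies(tsv_text, entity_col="entity", taxon_col="taxon ID"):
    """Map each entity annotated under more than one taxon to its sorted taxon IDs.

    entity_col and taxon_col are assumed column names, not confirmed
    against the actual ancestral_taxa file.
    """
    taxa_by_entity = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        taxa_by_entity[row[entity_col]].add(row[taxon_col])
    # Keep only entities that appear with at least two distinct taxa.
    return {
        entity: sorted(taxa)
        for entity, taxa in taxa_by_entity.items()
        if len(taxa) > 1
    }


# Illustrative input: the two pons rows quoted above, plus a made-up row
# (EX:0000001) that appears only once and so should not be reported.
SAMPLE = (
    "entity\tentity name\ttaxon ID\ttaxon name\n"
    "UBERON:0000988\tpons\t8782\tAves\n"
    "UBERON:0000988\tpons\t40674\tMammalia\n"
    "EX:0000001\texample structure\t1\texample taxon\n"
)

result = putative_homoplasies(SAMPLE)
for entity, taxa in result.items():
    print(entity, taxa)
```

To run it on the real file, read the file contents into a string and pass the actual column names as `entity_col` and `taxon_col`.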
What is trickier is when Uberon has (correctly) created several terms in a case of homoplasy (i.e., if you created an "Aves pons" term and a "Mammalia pons" term). You will not be able to recover them from this file. It is in our long-term plans to create such homoplasy mappings.
Can we use the table to extract these?
Many HOM:0000007 annotations will be synapomorphies for the clade in the taxon field. But not all?
It's not totally clear from the guidelines if it's valid to make a conservative HOM:0000007 assertion, where we use a more specific taxon ID. E.g. we may say that all Bilaterian nervous systems are homologous, but we don't want to explicitly rule out wider homology across metazoa. Such a statement would be formally weaker than a synapomorphy assertion.
Also, for the pons case, we may have two valid HOM:0000007 assertions, one for mammals and one for Aves. I don't think we'd want to say that the pons is a synapomorphy for either. I think we would need to introduce new class IDs for the different types of pons.