phyloref / phylo2owl

Tool to convert phylogenies to OWL ontologies
MIT License

Node-based definition for more than two nodes #28

Open gaurav opened 7 years ago

gaurav commented 7 years ago

For node-based definitions with two internal specifiers A and B, we identify clade A excluding B and clade B excluding A, then use the parent of those two clades. How can we scale this when there are more than two internal specifiers?
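Outside of OWL, one way to frame the scaling question is that the most recent common ancestor can be folded pairwise: the common ancestor of A, B and C is the common ancestor of (the common ancestor of A and B) and C. The following is a minimal procedural sketch in plain Python, not an OWL solution. The toy tree, the `PARENT` mapping and the function names are all hypothetical and are not taken from phylo2owl.

```python
from functools import reduce

# Hypothetical toy tree stored as child -> parent; "root" has no parent.
PARENT = {
    "A": "n1", "B": "n1",
    "C": "n2", "n1": "n2",
    "D": "root", "n2": "root",
}

def ancestors_inclusive(node):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def mrca_pair(a, b):
    """Most recent common ancestor of two nodes."""
    seen = set(ancestors_inclusive(a))
    # Walk up from b; the first node that is also on a's path is the MRCA.
    for node in ancestors_inclusive(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def mrca_many(specifiers):
    """Fold the pairwise MRCA over any number of specifiers."""
    return reduce(mrca_pair, specifiers)
```

Calling `mrca_many(["A", "B", "C"])` on this toy tree walks up from each specifier in turn and never needs the "A excluding B" clades, which is why it extends to seven specifiers as easily as to two. Whether the same fold can be expressed as OWL class expressions is a separate question.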

We could potentially use has_Descendant relationships to find common ancestors for all internal specifiers, but I can't figure out an OWLish way to determine which one is the least inclusive common ancestor. If we constrain each node so that it can only have a single parent, we might be able to find the node that matches the phyloreference but whose parent does not match the phyloreference. My attempts at getting that to work have crashed the Protege reasoner so far.
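To make the "common ancestors, then least inclusive" idea concrete, here is a small procedural sketch in plain Python; it does not address the reasoner problem. The `descendants` function stands in for has_Descendant. The toy tree and all names are hypothetical. Note that it picks the least inclusive node with a child-based test (no child is itself a common ancestor), which is an assumption of this sketch and differs from the parent-based constraint described above.

```python
# Hypothetical toy tree stored as parent -> list of children.
CHILDREN = {
    "root": ["n2", "D"],
    "n2": ["n1", "C"],
    "n1": ["A", "B"],
}

def descendants(node):
    """All nodes strictly below node (a stand-in for has_Descendant)."""
    found = set()
    stack = list(CHILDREN.get(node, []))
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(CHILDREN.get(current, []))
    return found

def common_ancestors(specifiers):
    """Every internal node that has all specifiers among its descendants."""
    wanted = set(specifiers)
    return {node for node in CHILDREN if wanted <= descendants(node)}

def least_inclusive(specifiers):
    """The common ancestor none of whose children is also a common ancestor."""
    candidates = common_ancestors(specifiers)
    lowest = [
        node for node in candidates
        if not any(child in candidates for child in CHILDREN[node])
    ]
    if len(lowest) != 1:
        raise ValueError("expected exactly one least inclusive common ancestor")
    return lowest[0]
```

In a tree where every node has a single parent, the common ancestors of a set of specifiers form one chain toward the root, so exactly one of them has no child that is also a common ancestor. That is the property the sketch relies on.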

A real-world example is the clade Aquarana in Hillis and Wilcox, 2005, which has seven internal specifiers. Its definition reads:

Aquarana Dubois 1992 (converted clade name). Definition: The clade stemming from the most recent common ancestor of Rana catesbeiana Shaw 1802, Rana clamitans Latreille 1802, Rana grylio Stejneger 1901, Rana heckscheri Wright 1924, Rana okaloosae Moler 1985, Rana septentrionalis Baird 1854, and Rana virgatipes Cope 1891. The species used in the definition are those species that were included by Dubois within this group. Content: Includes the species listed as specifiers in the definition. This clade has been informally termed the R. catesbeiana group by previous authors.