phyloref / phylo2owl

Tool to convert phylogenies to OWL ontologies
MIT License

Writing portable node-based definitions #26

Open gaurav opened 7 years ago

gaurav commented 7 years ago

The one node-based definition we have is:

(has_Child some (has_Descendant value Campanula_laciniata and excludes_lineage_to value Campanula_pelviformis)) and (has_Child some (has_Descendant value Campanula_pelviformis and excludes_lineage_to value Campanula_laciniata))

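This works fine on tree 3 (http://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0094199.g003):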

[Image: screenshot taken 2017-03-21]

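Unfortunately, this doesn't work on tree 1 (http://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0094199.g001):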

[Image: Clade H on tree 1]

There is no node that has_Descendant value Campanula_laciniata and excludes_lineage_to value Campanula_pelviformis. However, the node that the definition is meant to pick out (the last common ancestor of Campanula laciniata and Campanula pelviformis) does exist. So how do we write a node-based phyloreference that works on both of these trees?
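To make the failure concrete, here is a minimal Python sketch that evaluates the definition on two toy trees. Everything in it is an assumption made for illustration: the topologies are hypothetical (chosen only to be consistent with what is described in this thread, not copied from the paper's figures), the node names A, B, M and X are invented, has_Descendant is treated as strict (a node is not its own descendant), and excludes_lineage_to is read as "the target is neither this node nor one of its descendants". It is not how phylo2owl or an OWL reasoner evaluates the expression.

```python
# Toy trees as child -> parent maps. Hypothetical topologies and node names,
# chosen only to match the situation described in this thread.
TREE_3 = {
    "laciniata": "A", "tubulosa": "A",
    "pelviformis": "B", "carpatha": "B",
    "A": "H", "B": "H",
}
TREE_1 = {
    "carpatha": "H", "M": "H",
    "laciniata": "M", "X": "M",
    "pelviformis": "X", "tubulosa": "X",
}


def children(tree, node):
    """Direct children of `node`."""
    return [child for child, parent in tree.items() if parent == node]


def descendants(tree, node):
    """Strict descendants: a node is not its own descendant (assumption)."""
    found = set()
    stack = children(tree, node)
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(children(tree, current))
    return found


def includes_and_excludes(tree, node, included, excluded):
    """Assumed reading of:
    has_Descendant value included and excludes_lineage_to value excluded
    """
    desc = descendants(tree, node)
    return included in desc and node != excluded and excluded not in desc


def matching_nodes(tree, a, b):
    """Nodes with one child leading only to `a` and another leading only to `b`."""
    nodes = set(tree) | set(tree.values())
    return sorted(
        n for n in nodes
        if any(includes_and_excludes(tree, c, a, b) for c in children(tree, n))
        and any(includes_and_excludes(tree, c, b, a) for c in children(tree, n))
    )


print(matching_nodes(TREE_3, "laciniata", "pelviformis"))  # ['H']
print(matching_nodes(TREE_1, "laciniata", "pelviformis"))  # []
```

In the toy tree 1, Campanula laciniata hangs directly off the common ancestor M, so the only nodes that have it as a descendant (M and H) also lead to Campanula pelviformis, and no child of M can satisfy the first half of the definition.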

hlapp commented 7 years ago

That's a good case that shows some of the limitations of phyloreferencing when taxon (specifier) sampling is very low, and phylogenetic relationships aren't stable.

Note that for Tree 1, the MRCA of Campanula laciniata and Campanula pelviformis is not the node labeled 'H' but a child of it, and hence excludes Campanula carpatha, whereas for Tree 3 it is the node labeled 'H' and does include Campanula carpatha. Remind me, how does the paper define 'H'?

ncellinese commented 7 years ago

Well, there is a reason why the support value is very low on Tree 2. :-) Yes, I can see that is a problem, but am I catching up here? Am I missing something? I am so sorry I have been distracted by personal stuff and it is not quite over for me. Maybe we, or at least I, would benefit from a call today to talk about this.

Nico

(Sent by email in reply to Gaurav Vaidya's comment of Mar 21, 2017, 6:32 PM.)

gaurav commented 7 years ago

@hlapp In this paper, clade H is described as:

This Cretan clade is recovered with high support. Campanula pelviformis, C. carpatha, C. laciniata, and C. tubulosa are all endemic to Crete and Karpathos islands except C. laciniata, which is also found in the Cyclades islands. This clade is likely the result of a single introduction into the Cretan area and one of the few examples of in situ diversification in the Cretan Campanuloideae [5].

It also says that:

Dating analyses support the hypothesis that Clade H may have been the result of an in situ radiation in the Cretan area [5] with the stem of this clade dating to approximately 9 million years old and diversification of the crown clade estimated at approximately 6 million years ago (Fig. 5).

That [5] refers to Cellinese et al., 2009, where clade H is identified as clade 3. Clade 3 from that paper looks like this:

[Image: Clade 3 from Cellinese et al., 2009]

And is described thusly:

The split between C. carpatha and its sister clade dates to 4.5 (± 2) Ma, but the stem lineage of the carpatha–laciniata clade dates to 9 (± 3) Ma, suggesting that the ancestors of this clade may have been present on the islands before the Pliocene. Lack of support within the laciniata–carpatha clade prevents us from stating whether diversification occurred first in the Karpathos groups with dispersal into Crete, or vice versa. In any case, this clade seems to represent the only in situ radiation event in the Cretan area. It is possible that this radiation resulted from the presence of a distinct ecological niche. However, this hypothesis is hard to test because of the significant human and animal pressure exerted in these islands for more than 6000 years (Evans, 1971; Broodbank & Strasser, 1991), which has dramatically shaped their biodiversity and species distribution.

So: it looks to me like clade H is interesting because it consists of endemics on Crete, but I can't find a specific definition.

@ncellinese No worries! This was my first attempt at a node-based definition using excludes_lineage_to, but it doesn't work in this case: on tree 1, the lineage being excluded incorporates all the ancestors of Campanula laciniata, so no node that leads to Campanula laciniata can also exclude Campanula pelviformis. So I think we're going to have to come up with a cleverer way of expressing node-based definitions in OWL that can find the MRCA of two leaf nodes under any topology.
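For reference, the target behavior can be sketched procedurally in Python. This is not an OWL class expression and does not solve the problem above; it only shows which node a portable definition would need to resolve to. The two trees are hypothetical child -> parent maps with invented internal node names (A, B, M, X), chosen to be consistent with the descriptions in this thread rather than taken from the paper's figures.

```python
# Hypothetical toy trees as child -> parent maps (not the paper's topologies).
TREE_3 = {
    "laciniata": "A", "tubulosa": "A",
    "pelviformis": "B", "carpatha": "B",
    "A": "H", "B": "H",
}
TREE_1 = {
    "carpatha": "H", "M": "H",
    "laciniata": "M", "X": "M",
    "pelviformis": "X", "tubulosa": "X",
}


def path_to_root(tree, node):
    """The node itself followed by each of its ancestors, up to the root."""
    path = [node]
    while path[-1] in tree:
        path.append(tree[path[-1]])
    return path


def mrca(tree, a, b):
    """Most recent common ancestor of `a` and `b`, or None if they share none."""
    on_path_from_a = set(path_to_root(tree, a))
    for node in path_to_root(tree, b):
        # The first node on b's path that is also on a's path is the MRCA.
        if node in on_path_from_a:
            return node
    return None


print(mrca(TREE_3, "laciniata", "pelviformis"))  # H
print(mrca(TREE_1, "laciniata", "pelviformis"))  # M
```

Because each path starts at the leaf itself, the lookup does not care whether a specifier attaches directly to the common ancestor or sits deeper inside a subclade, which is the case the current definition cannot handle.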