Closed (gaurav closed this issue 7 years ago)
@hlapp @ncellinese This is the problem with `excludes_lineage_to` I mentioned on our last call. Have a look and tell me what you think!
I'm not seeing the problem. In the tree you give as an example, there obviously is no node for which Campanula erinus is a descendant but Campanula drabifolia is not, because the two share the same immediate parent, which is thus their common ancestor. So the phyloreference ought not to resolve, and the fact that it doesn't seems the correct result to me. I'm in fact not sure how your variation can change this, because there is really no node that can satisfy the semantics.
What am I missing?
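To make the argument concrete, here is a minimal Python sketch, not the actual OWL reasoning. It assumes `has_Descendant` means proper descendant and takes `excludes_lineage_to` as described in the issue text (a sibling of some ancestor of the target, or a descendant of such a sibling). The `root` label and all function names are made up for illustration.

```python
# The two sister taxa from tree S21 as a child -> parent map.
# "root" is a hypothetical label for the unnamed internal node.
PARENT = {
    "Campanula_erinus": "root",
    "Campanula_drabifolia": "root",
    "root": None,
}

def ancestors(node, parent):
    """Proper ancestors of node, nearest first."""
    result = []
    current = parent[node]
    while current is not None:
        result.append(current)
        current = parent[current]
    return result

def siblings(node, parent):
    """Other nodes sharing node's parent (the root has none)."""
    p = parent[node]
    if p is None:
        return set()
    return {n for n, q in parent.items() if q == p and n != node}

def has_descendant(node, target, parent):
    """True if target is a proper descendant of node."""
    return node in ancestors(target, parent)

def excludes_lineage_to(node, target, parent):
    """Assumed semantics: node is a sibling of some proper ancestor of
    target, or descends from such a sibling."""
    excluded = set()
    for anc in ancestors(target, parent):
        excluded |= siblings(anc, parent)
    return node in excluded or any(a in excluded for a in ancestors(node, parent))

def resolve(included, excluded, parent):
    """Nodes matching: has_Descendant included AND excludes_lineage_to excluded."""
    return sorted(
        n for n in parent
        if has_descendant(n, included, parent)
        and excludes_lineage_to(n, excluded, parent)
    )

matches = resolve("Campanula_erinus", "Campanula_drabifolia", PARENT)
print(matches)  # empty list: no node satisfies both conditions
```

The only node with Campanula_erinus as a descendant is the root, and the root is also an ancestor of Campanula_drabifolia, so under these assumptions nothing matches.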
It's true that there's no semantic problem here, but I think the expression `has_Descendant value Campanula_erinus and excludes_lineage_to value Campanula_drabifolia` ought to reference Campanula_erinus rather than Nothing: there might be only one explicit individual at the leaf, but there are implicit ancestors between it and the node where the lineages diverged, and I think those should be matched with a branch-based phyloreference.
This could be pretty easy to implement, too: we could add an extra node above every leaf node, which would then match this expression.
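As a rough sketch of what this proposal would amount to, the Python below inserts a hypothetical "stem" node above every leaf and then evaluates the expression. It is an illustration under assumed semantics (proper descendants for `has_Descendant`; `excludes_lineage_to` as a sibling of some ancestor of the target, or a descendant of such a sibling), not the actual reasoner. The `root` label, the `_stem` suffix, and all function names are invented here.

```python
# Two sister taxa as a child -> parent map; "root" is a hypothetical label.
PARENT = {
    "Campanula_erinus": "root",
    "Campanula_drabifolia": "root",
    "root": None,
}

def ancestors(node, parent):
    """Proper ancestors of node, nearest first."""
    result = []
    current = parent[node]
    while current is not None:
        result.append(current)
        current = parent[current]
    return result

def siblings(node, parent):
    """Other nodes sharing node's parent (the root has none)."""
    p = parent[node]
    if p is None:
        return set()
    return {n for n, q in parent.items() if q == p and n != node}

def excludes_lineage_to(node, target, parent):
    """Assumed semantics: node is a sibling of some proper ancestor of
    target, or descends from such a sibling."""
    excluded = set()
    for anc in ancestors(target, parent):
        excluded |= siblings(anc, parent)
    return node in excluded or any(a in excluded for a in ancestors(node, parent))

def resolve(included, excluded, parent):
    """Nodes that have `included` as a proper descendant and exclude `excluded`."""
    return sorted(
        n for n in parent
        if n in ancestors(included, parent)
        and excludes_lineage_to(n, excluded, parent)
    )

def add_stem_nodes(parent):
    """Return a copy of the tree with one extra node above every leaf."""
    leaves = set(parent) - {p for p in parent.values() if p is not None}
    new_parent = dict(parent)
    for leaf in leaves:
        stem = leaf + "_stem"            # hypothetical naming scheme
        new_parent[stem] = parent[leaf]  # stem takes the leaf's old parent
        new_parent[leaf] = stem          # leaf now hangs below its stem
    return new_parent

before = resolve("Campanula_erinus", "Campanula_drabifolia", PARENT)
after = resolve("Campanula_erinus", "Campanula_drabifolia", add_stem_nodes(PARENT))
print(before)
print(after)
```

Note that in this sketch the expression resolves to the inserted stem node above Campanula_erinus, not to the Campanula_erinus leaf itself.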
Actually no. A clade is not the descendants of its common ancestor or, failing that, some descendant. Those are two different things. (You could, if you wanted to, combine them with a `UNION`. But that'd be pretty ugly.)
What you are trying to express is something that isn't in the tree, so the result should be nothing. The tree is a real instance; you can't imagine something into it that you postulate ought to be there. Either a common ancestor is there, or it's not there.
If there is no node in the tree that `has_Descendant value Campanula_erinus and excludes_lineage_to value Campanula_drabifolia`, then there is not some implicit "fallback" semantics that says it's then the object of `has_Descendant`. Neither in OWL, nor in phylogenetic taxonomy.
I tried to think of a concrete example of this (do I `has_Descendant value me and excludes_lineage_to value my_sister`), and I guess you're right, there isn't anyone who matches that definition. This does cause problems with portable node-based definitions as we've currently coded those, but I'll open a new issue for that. Closing this one now.
Our phyloreferencing machinery uses `excludes_lineage_to`, which identifies a sibling to some ancestor of the target and that sibling's descendants, but not ancestors of the target themselves. This allows us to construct phyloreferences such as:

However, consider tree S21, in which C. erinus and C. drabifolia are found to be sister taxa: this can be represented in Newick as `(Campanula_erinus, Campanula_drabifolia)`, or visually as:

In this case, there is no sibling node for our phyloreference to match, since every possible node is an ancestor of the target itself. I got around this by going one node up on the `excludes_lineage_to`:

This appears to work correctly for all three phylogenies. So should we recommend that `excludes_lineage_to` always be used with a `has_Child`, or is there a cleverer solution to this problem?