rdmenezes / ontowiz

Automatically exported from code.google.com/p/ontowiz

Hi, seems like a db-xref is not unique for a term #7

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Import ontology named "fungal_anatomy.obo"
2. For term-id "FAO:0001014", which has an xref with db="FAO" and acc="mcc", call 
the function "onto->get_term_by_xref("FAO", "mcc");".
3. The term returned has id "FAO:0001010", which in this context is wrong.

What is the expected output? What do you see instead?
We expected term-id "FAO:FAO:0001014", but i.e. the result was "FAO:0001010", 
which is wrong.

Investigating this problem, I had a peek inside the ontology 
"fungal_anatomy.obo" (an extract is attached) and discovered that a dbxref is 
not unique to a single term, which contradicts an assumption made in ONTO-PERL. 
The function in question is included below:
-----the function of interest in ONTO-PERL....Ontology.pm--------------
sub get_term_by_xref {
    my ($self, $db, $acc) = @_;
    my $result;
    if ($db && $acc) {
        foreach my $term (@{$self->get_terms()}) { # return the exact occurrence
            $result = $term;
            foreach my $xref ($term->xref_set_as_string()) {
                return $result if (($xref->db() eq $db) && ($xref->acc() eq $acc));
            }
        }
    }
    return undef;
}
---------------------------------------------------------------

= Suggestions = 
I therefore suggest two alternatives to handle this situation:
(a) Return all terms which match our query, or
(b) Assert that our input-ontology is wrong, i.e. do nothing.
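
Alternative (a) could look roughly like the sketch below. It is not the ONTO-PERL implementation: the `Xref` and `Term` packages are minimal stand-ins written only for this example, `xrefs` is an assumed accessor, and `get_terms_by_xref` is a hypothetical name. The two terms are the ones named in this report.

```perl
use strict;
use warnings;

# Minimal stand-ins for the ONTO-PERL term and dbxref objects (assumptions,
# not the real classes): only the accessors used below are modelled.
package Xref {
    sub new { my ($class, $db, $acc) = @_; return bless { db => $db, acc => $acc }, $class }
    sub db  { $_[0]{db} }
    sub acc { $_[0]{acc} }
}

package Term {
    sub new   { my ($class, $id, @xrefs) = @_; return bless { id => $id, xrefs => [@xrefs] }, $class }
    sub id    { $_[0]{id} }
    sub xrefs { @{ $_[0]{xrefs} } }
}

package main;

# Hypothetical replacement for get_term_by_xref: collect every term that
# carries the (db, acc) xref instead of returning on the first hit.
sub get_terms_by_xref {
    my ($terms, $db, $acc) = @_;
    return () unless defined $db && defined $acc;
    my @matches;
    for my $term (@$terms) {
        push @matches, $term
            if grep { $_->db eq $db && $_->acc eq $acc } $term->xrefs;
    }
    return @matches;
}

my @terms = (
    Term->new('FAO:0001010', Xref->new('FAO', 'mcc')),
    Term->new('FAO:0001014', Xref->new('FAO', 'mcc')),
);
my @ids = map { $_->id } get_terms_by_xref(\@terms, 'FAO', 'mcc');
print join(', ', @ids), "\n";   # prints: FAO:0001010, FAO:0001014
```

Returning a list leaves it to the caller to decide what to do when more than one term matches, rather than silently picking whichever term happens to be visited first.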

If you could have a look at these two alternatives and then give me 
feedback, I'd be thankful! ;)

Original issue reported on code.google.com by oeks...@gmail.com on 19 Jul 2013 at 8:41

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Ole,

let's discuss this on Monday.

vlmir

Original comment by vladimir.n.mironov@gmail.com on 20 Jul 2013 at 6:04

GoogleCodeExporter commented 9 years ago
Hmm, I do not see why we need to discuss it; what we need is a concrete reference 
to a standardisation page which explicitly states whether uniqueness is required. 
From my first brief search, uniqueness is not required, i.e. it's a bug in 
ONTO-PERL. My challenge to both of you (Erick and Vladimir) is to provide a 
reference* (to OBO/OWL/RDF/etc. documentation) which contradicts this 
conclusion, i.e. to prove that a given (db, accession-number) pair exists for 
only a single term. 
-------------------
* Without a formal reference supporting our claim it is (in my view) 
completely wrong to say that ontology developers have bugs (which is the 
implication of ONTO-PERL's approach).

Original comment by oeks...@gmail.com on 20 Jul 2013 at 6:46

GoogleCodeExporter commented 9 years ago
Hi Ole,

indeed, the db-accession combination should be unique. Could you give an example
where that is not true?
vlmir

Original comment by vladimir.n.mironov@gmail.com on 21 Jul 2013 at 9:48

GoogleCodeExporter commented 9 years ago
An example is provided in the attached file (extracts_from_fungal_anatomy.obo), 
which is described in the initial error report at the top of this page, so 
you have probably looked at it already. 
-- Since you are asking, though, I conclude that the example does not satisfy 
your needs. I therefore attach an error message which ontoWiz has generated for 
"spatial.obo": it is less intuitive than "extracts_from_fungal_anatomy.obo", 
but with your deep knowledge of ontologies I hope that the 27 examples of 
non-unique (db, acc) pairs will be enough to end this discussion.
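
A check for such shared pairs could be sketched as follows. This is an illustrative stand-alone scan of OBO text, not how ontoWiz produced its error message. The sample stanzas are cut down to `id` and `xref` lines, the third term and its xref are invented for the example, and `shared_xrefs` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Map each xref "db:acc" to the ids of the [Term] stanzas that carry it,
# then keep only the xrefs referenced by more than one term.
sub shared_xrefs {
    my ($obo_text) = @_;
    my (%terms_for, $in_term, $id);
    for my $line (split /\n/, $obo_text) {
        if ($line =~ /^\[(\w+)\]/) {            # stanza header, e.g. [Term]
            $in_term = $1 eq 'Term';
            undef $id;
        }
        elsif ($in_term && $line =~ /^id:\s*(\S+)/) {
            $id = $1;
        }
        elsif ($in_term && defined $id && $line =~ /^xref:\s*(\S+)/) {
            $terms_for{$1}{$id} = 1;            # hash of hashes de-duplicates
        }
    }
    my %shared;
    for my $xref (keys %terms_for) {
        my @ids = sort keys %{ $terms_for{$xref} };
        $shared{$xref} = \@ids if @ids > 1;
    }
    return \%shared;
}

# Illustrative input only: the first two ids come from this report,
# FAO:9999999 and its xref are invented for the example.
my $sample = <<'OBO';
[Term]
id: FAO:0001010
xref: FAO:mcc

[Term]
id: FAO:0001014
xref: FAO:mcc

[Term]
id: FAO:9999999
xref: FAO:other
OBO

my $shared = shared_xrefs($sample);
for my $xref (sort keys %$shared) {
    print "$xref is shared by: @{ $shared->{$xref} }\n";
}
```

The scan only reads the first whitespace-delimited token after `xref:`, so quoted descriptions and escaped characters in real OBO files would need extra handling.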

Original comment by oeks...@gmail.com on 21 Jul 2013 at 2:10

Attachments:

GoogleCodeExporter commented 9 years ago
Ole,
I'm afraid we may have a misunderstanding. The db-accession combination
should be unique within any given term, yet any number of terms may have a
reference to the same source (db-accession).
Vladimir

Original comment by vladimir.n.mironov@gmail.com on 21 Jul 2013 at 6:14

GoogleCodeExporter commented 9 years ago
Hmm, from your arguments it seems like you are reading neither my attachments 
nor my writing: when "get_term_by_xref()" is called, it returns a single item, 
though a set of elements should be returned. Your arguments do not address 
this case. I therefore end this issue, and conclude that ONTO-PERL misbehaves. 

Original comment by oeks...@gmail.com on 21 Jul 2013 at 7:09

GoogleCodeExporter commented 9 years ago

Original comment by oeks...@gmail.com on 21 Jul 2013 at 7:11

GoogleCodeExporter commented 9 years ago
I do not understand why you expect: "FAO:FAO:0001014" as a result ?

That Perl sub expects arguments such as:

db=FAO
acc=0001014

dbxrefs are present in an OBO file at several levels: definitions, the term 
itself, synonyms, etc. Please review the spec.

Original comment by erick.an...@gmail.com on 22 Jul 2013 at 2:37

GoogleCodeExporter commented 9 years ago
----------------
FYI: I had originally given up on this post, as the answers I got were 
meaningless. But as it seems that you are trying to understand the issue, I will 
now give it a new try:
----------------
"I do not understand why you expect: "FAO:FAO:0001014" as a result ?"
-- Negative: for 
---------
db=FAO
acc=mcc
---------
I expect "onto->get_term_by_xref("FAO", "mcc");" to return "FAO:0001014". The 
problem is that I get a different term instead, as several terms have "xref: 
FAO:mcc".
--> Given my effort to outline the issue (and your honest attempt to 
understand it), do you now get the issue?

Original comment by oeks...@gmail.com on 26 Jul 2013 at 8:27

GoogleCodeExporter commented 9 years ago
ok, I see the point now, the misleading part was  "We expected term-id 
"FAO:FAO:0001014", but i.e. the result was "FAO:0001010", which is wrong."

I thought you were referring to the duplication of FAO -> "FAO:FAO" ... Anyway, 
that namespace (i.e. FAO) should appear only ONCE.

By "giving up", do you mean you won't fix it? forget it?

Original comment by erick.an...@gmail.com on 26 Jul 2013 at 12:52

GoogleCodeExporter commented 9 years ago
>> By "giving up", do you mean you won't fix it? forget it?
-- I meant that I'd return an array of all matching terms, and not just the 
first one found. 

Now to the point: do you see the bug in the return statement in the middle of 
the for-each loop (in the ONTO-PERL code included at the top of this issue page)?
-- If you can't see the bug, then look at the attachment I provided: what 
term does (i.e. should) the function return for "db=FAO", "acc=mcc"?

Original comment by oeks...@gmail.com on 26 Jul 2013 at 1:03

GoogleCodeExporter commented 9 years ago
I saw it!

That sub should indeed return an array of all terms that have such an xref.

Original comment by erick.an...@gmail.com on 26 Jul 2013 at 1:09

GoogleCodeExporter commented 9 years ago
Thanks, both for your attitude and your willingness to resolve this issue :)
-- For the future, in order to improve our efficiency (as it seems like 
you are a man worth discussing with!), if you have suggestions for how 
to resolve such issues faster, I'd be thankful for feedback! ;) 

Wish you a happy day, and thanks for your support/help! ;)

Original comment by oeks...@gmail.com on 26 Jul 2013 at 1:30

GoogleCodeExporter commented 9 years ago
Thanks to you!

as a matter of fact, that issue should also be fixed in the following sub:

get_instance_by_xref

cheers

Original comment by erick.an...@gmail.com on 26 Jul 2013 at 1:36

GoogleCodeExporter commented 9 years ago
>> get_instance_by_xref
-- Thanks; I've now updated ontoWiz and committed the change to our 
google-code-repo; the smallest ontologies in the CCO pipeline have passed my 
tests, which indicates that our bug is resolved :)

>> Thanks to you!
-- Then it seems like we are colleagues, i.e. we share the joy of a correct 
and working ontoWiz/ONTO-PERL and the interest in achieving this 
goal; thanks! ;)

Talk soon ;)

Original comment by oeks...@gmail.com on 26 Jul 2013 at 2:02