INCATools / ontology-access-kit

Ontology Access Kit: A python library and command line application for working with ontologies
https://incatools.github.io/ontology-access-kit/
Apache License 2.0

lexmatch failing for OBO file #52

Closed realmarcin closed 1 year ago

realmarcin commented 2 years ago

This starts from an OBO file for the Deep Learning Ontology (DLO), generated with robot convert from the original rdf/xml. Links to the input files are at the bottom of the ticket.

Robot command to generate OBO input: robot convert --input DLO.xrdf --format owl --output DLO.obo

OAK command:

runoak -i DLO.obo lexmatch -o DLO.ssom.tsv

This gives the following error:

/Users/marcin/Documents/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/parsers/rdfxml.py:286: SyntaxWarning: <Element '{http://purl.org/dc/elements/1.1/}description' at 0x7fca5a2b2cc0> contains text but no `xsd:datatype`
  meta.annotations.add(self._extract_literal_pv(child))
/Users/marcin/Documents/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/parsers/rdfxml.py:286: SyntaxWarning: <Element '{http://purl.org/dc/elements/1.1/}title' at 0x7fca5a2b2d60> contains text but no `xsd:datatype`
  meta.annotations.add(self._extract_literal_pv(child))
/Users/marcin/Documents/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/ontology.py:283: NotImplementedWarning: cannot process plain `owl:AnnotationProperty`
  cls(self).parse_from(_handle)  # type: ignore

To reproduce, the DLO input files are here: DLO.xrdf DLO.obo

cmungall commented 2 years ago

DLO.obo is not actually an obo file! It's rdf/xml.

It looks like pronto will sniff the file and attempt to use the rdf/xml parser anyway, but pronto is too strict: see @caufieldjh's issue #49.
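A quick way to check what a file really contains, regardless of its extension, is to look at its first meaningful line. The sketch below is only an illustration of that idea: `guess_ontology_format` is a hypothetical helper, not part of OAK or pronto, and it does not reproduce pronto's own detection logic. It assumes that an XML-based serialization starts with `<` and that an OBO file starts with a `format-version:` header or a stanza such as `[Term]`.

```python
def guess_ontology_format(text: str) -> str:
    """Guess 'rdfxml', 'obo', or 'unknown' from the beginning of a file's text.

    Hypothetical helper for illustration only; the heuristics are assumptions.
    """
    for line in text.lstrip("\ufeff").splitlines():
        stripped = line.strip()
        # Skip blank lines and OBO-style comment lines.
        if not stripped or stripped.startswith("!"):
            continue
        if stripped.startswith("<"):
            # XML-based content (RDF/XML; could also be OWL/XML).
            return "rdfxml"
        if stripped.startswith("format-version:") or stripped.startswith("["):
            # OBO header tag or a stanza header such as [Term].
            return "obo"
        return "unknown"
    return "unknown"


def guess_file_format(path: str) -> str:
    """Read only the start of the file and apply the heuristic above."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return guess_ontology_format(handle.read(4096))
```

Under these assumptions, calling `guess_file_format("DLO.obo")` on a file like the one attached here would report `rdfxml`, which is the mismatch described above.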

we should update our docs - things are moving fast with this lib

See the docs here:

https://incatools.github.io/ontology-access-kit/selectors.html

I renamed the file and it parsed (using latest lib)

runoak -i ~/tmp/dlo.owl lexmatch

or to be more explicit about format:

runoak -I rdfxml -i ~/tmp/dlo.owl lexmatch

This brings success... of sorts. It parses, but it finds no matches. That is not unexpected, since you have a well-behaved ontology with unique labels.
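To see why unique labels lead to zero matches, here is a minimal sketch of label-based matching. It is not OAK's lexmatch implementation: `lexical_matches`, the normalization rule, and the `EX1:`/`EX2:` identifiers are all made up for illustration. The idea is simply that a match needs two different entities whose labels normalize to the same string.

```python
from collections import defaultdict
from itertools import combinations


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace (an assumed, simplistic rule)."""
    return " ".join(label.lower().split())


def lexical_matches(labels: dict) -> list:
    """Return (id_a, id_b, normalized_label) for entities sharing a label.

    `labels` maps entity ID -> label. Hypothetical helper, not OAK's API.
    """
    index = defaultdict(set)
    for entity_id, label in labels.items():
        index[normalize(label)].add(entity_id)
    matches = []
    for norm, ids in sorted(index.items()):
        # Every pair of distinct entities with the same normalized label.
        for a, b in combinations(sorted(ids), 2):
            matches.append((a, b, norm))
    return matches


# One ontology with unique labels: nothing to match against.
one = {"EX1:001": "Neural network", "EX1:002": "Loss function"}
print(lexical_matches(one))

# Add a second ontology that shares a label: one match appears.
two = {"EX2:900": "loss  function", "EX2:901": "p-value"}
print(lexical_matches({**one, **two}))
```

With the invented data above, the first call returns an empty list and the second returns a single pair linking `EX1:002` and `EX2:900`.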

Your next step is to feed in another ontology. You can use robot to merge ontologies, or specify an extra ontology with -a.

E.g., I downloaded stato and tried

runoak -i ~/tmp/dlo.owl -a obolibrary:stato.owl lexmatch 

gives 2 results... progress!

realmarcin commented 2 years ago

Ah, looks like my robot command failed and the obo conversion didn't work.

For the record, here is the DLO obo file from the ontology repo: DLO.obo

cmungall commented 2 years ago

Were you able to get this working?

realmarcin commented 2 years ago

I was able to replicate your steps. However, when I lexmatch against MLO I get this -- possibly something off with the MLO owl file?

(venv) Marcins-MacBook-Pro:ontology-access-kit marcin$ runoak -I rdfxml -i ../../DLO/DLO.xrdf -a obolibrary:ml-ontology-202010021305.owl lexmatch
/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/parsers/rdfxml.py:286: SyntaxWarning: <Element '{http://purl.org/dc/elements/1.1/}description' at 0x7fcdd96de2c0> contains text but no `xsd:datatype`
  meta.annotations.add(self._extract_literal_pv(child))
/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/parsers/rdfxml.py:286: SyntaxWarning: <Element '{http://purl.org/dc/elements/1.1/}title' at 0x7fcdd96de360> contains text but no `xsd:datatype`
  meta.annotations.add(self._extract_literal_pv(child))
/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/ontology.py:283: NotImplementedWarning: cannot process plain `owl:AnnotationProperty`
  cls(self).parse_from(_handle)  # type: ignore
Traceback (most recent call last):
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/bin/runoak", line 8, in <module>
    sys.exit(main())
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/click/core.py", line 1654, in invoke
    super().invoke(ctx)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/oaklib/cli.py", line 144, in main
    impls = [get_implementation_from_shorthand(d) for d in add]
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/oaklib/cli.py", line 144, in <listcomp>
    impls = [get_implementation_from_shorthand(d) for d in add]
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/oaklib/selector.py", line 42, in get_implementation_from_shorthand
    return res.implementation_class(res)
  File "<string>", line 6, in __init__
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/oaklib/implementations/pronto/pronto_implementation.py", line 75, in __post_init__
    ontology = Ontology.from_obo_library(resource.slug)
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/ontology.py", line 206, in from_obo_library
    return cls(
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/ontology.py", line 283, in __init__
    cls(self).parse_from(_handle)  # type: ignore
  File "/Users/marcin/Documents/VIMSS/ontology/OAK/ontology-access-kit/venv/lib/python3.9/site-packages/pronto/parsers/rdfxml.py", line 89, in parse_from
    raise ValueError("could not find owl:Ontology element")
ValueError: could not find owl:Ontology element