monarch-initiative / owlsim-v3

Ontology Based Profile Matching

Implement CURIE handling #24

Closed cmungall closed 7 years ago

cmungall commented 8 years ago

Clients should be able to use CURIEs

Currently v3 expects full IRIs

Handling could live in the services layer or in core; core would be better.

kshefchek commented 7 years ago

+1

cmungall commented 7 years ago

Maybe this should be a separate library? CCing @balhoff, who is facing something similar for Minerva. @jamesaoverton wrote IOUtils in ROBOT, which could be a good starting point for a separate module. Or is that overengineering?

jamesaoverton commented 7 years ago

ROBOT's IOHelper uses OWLAPI's PrefixManager and some JSON-LD stuff. You're welcome to use it, of course, or we can try and factor it out somehow.

I've built a dozen simple systems with prefix maps and match-and-replace over the years. That's a lot lighter if you don't need more features.

cmungall commented 7 years ago

For this, we need something super-lightweight, single level of expansion/contraction iri<->curie, with defined behavior and rules for ambiguity in the contraction.

It would also be useful to have a general library that uses the JSON-LD standard to allow contraction/expansion from arbitrary short forms, using a context file. I think this should be straightforward too.
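A minimal sketch of what the lightweight version could look like, assuming a plain prefix map and a "longest matching base wins" rule for ambiguous contraction. The class name `SimpleCurieMap` and its methods are hypothetical, made up for illustration; this is not the API of any existing library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical single-level CURIE <-> IRI mapper (illustration only). */
final class SimpleCurieMap {
    private final Map<String, String> prefixToBase = new LinkedHashMap<>();

    SimpleCurieMap(Map<String, String> prefixToBase) {
        this.prefixToBase.putAll(prefixToBase);
    }

    /** Expand e.g. "HP:0001347" to a full IRI; empty if the prefix is unknown. */
    Optional<String> expand(String curie) {
        int colon = curie.indexOf(':');
        if (colon < 0) {
            return Optional.empty();
        }
        String base = prefixToBase.get(curie.substring(0, colon));
        return base == null
                ? Optional.empty()
                : Optional.of(base + curie.substring(colon + 1));
    }

    /** Contract an IRI to a CURIE; if several bases match, the longest one wins. */
    Optional<String> contract(String iri) {
        String bestPrefix = null;
        String bestBase = "";
        for (Map.Entry<String, String> e : prefixToBase.entrySet()) {
            String base = e.getValue();
            if (iri.startsWith(base) && base.length() > bestBase.length()) {
                bestPrefix = e.getKey();
                bestBase = base;
            }
        }
        return bestPrefix == null
                ? Optional.empty()
                : Optional.of(bestPrefix + ":" + iri.substring(bestBase.length()));
    }
}
```

With both `HP` and a more general `OBO` prefix registered, the longest-match rule makes `http://purl.obolibrary.org/obo/HP_0001347` contract to `HP:0001347` rather than `OBO:HP_0001347`. Other tie-breaking rules (e.g. explicit prefix priority) would work too; the point is only that the rule is defined.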

balhoff commented 7 years ago

Unfortunately, a JSON-LD context is a lot less straightforward than one might think. Contraction and expansion depend on how the term is used in the JSON (e.g. as a JSON property, as a value for the @type property, or as a value for other properties). There is also interplay with @base and @vocab (which likewise depend on the usage position in the JSON). So you kind of need a real JSON-LD library to parse the context, which is what both ROBOT and pxftools do. This can still be wrapped in a simple API. In pxftools I started with ROBOT's approach and added a parameter to the expansion methods so you can say whether expansion should happen as a property or class vs. as an instance.

jnguyenx commented 7 years ago

@cmungall can you provide an example JSON-LD file? Or do you prefer to go the YAML way?

jamesaoverton commented 7 years ago

This is the default JSON-LD context used in ROBOT: https://github.com/ontodev/robot/blob/master/robot-core/src/main/resources/obo_context.jsonld

This is the awkward way that I ended up using JsonLdApi: https://github.com/ontodev/robot/blob/master/robot-core/src/main/java/org/obolibrary/robot/IOHelper.java#L483

cmungall commented 7 years ago

Thanks. And the setting of the context is fairly straightforward: https://github.com/ontodev/robot/blob/master/robot-core/src/main/java/org/obolibrary/robot/IOHelper.java#L598-L613

@jnguyenx

jnguyenx commented 7 years ago

Pushed to Maven Central: http://search.maven.org/#artifactdetails%7Corg.prefixcommons%7Ccurie-util%7C0.0.1%7Cjar

jnguyenx commented 7 years ago

Some matchers throw CURIE exceptions: http://owlsim3.monarchinitiative.org/api/match/bayesian-network?id=HP%3A0001347&id=HP%3A0000718&id=HP%3A0001332&id=HP%3A0001268&id=HP%3A0001257&id=HP%3A0006892

To investigate

ERROR [2017-01-27 00:37:45,248] io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 213e7af645d9a3a6
! java.lang.IllegalStateException: Optional.get() cannot be called on an absent value
! at com.google.common.base.Absent.get(Absent.java:47) ~[owlsim-services-3.0-SNAPSHOT.jar:na]
! at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getOWLClass(BMKnowledgeBaseOWLAPIImpl.java:1010) ~[owlsim-services-3.0-SNAPSHOT.jar:na]
! at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getClassIndex(BMKnowledgeBaseOWLAPIImpl.java:628) ~[owlsim-services-3.0-SNAPSHOT.jar:na]
! at org.monarchinitiative.owlsim.compute.cpt.impl.ThreeStateConditionalProbabilityIndex.calculateConditionalProbabilities(ThreeStateConditionalProbabilityIndex.java:107) ~[owlsim-services-3.0-SNAPSHOT.jar:na]
! at org.monarchinitiative.owlsim.compute.matcher.impl.ThreeStateBayesianNetworkProfileMatcher.calculateConditionalProbabilities(ThreeStateBayesianNetworkProfileMatcher.java:137) ~[owlsim-services-3.0-SNAPSHOT.jar:na]

jnguyenx commented 7 years ago

This error was thrown because no prefix was defined for http://uri.neuinfo.org/nif/nifstd/nlx_28443. Now, when no CURIE can be found for an IRI, the full IRI is used as output instead.

Note that the above query still fails, but it seems to be due to the matcher itself.
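For illustration, the fallback described above can be sketched roughly as follows. This is not the actual owlsim code: `CurieFallback`, `curieOrIri`, and the `lookup` parameter are hypothetical names, and the sketch uses `java.util.Optional` rather than the Guava `Optional` that appears in the stack trace.

```java
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper showing the "fall back to the IRI" behavior. */
final class CurieFallback {
    /**
     * Returns the CURIE when the lookup finds one; otherwise returns the IRI
     * unchanged instead of calling get() on an absent value.
     */
    static String curieOrIri(String iri, Function<String, Optional<String>> lookup) {
        return lookup.apply(iri).orElse(iri);
    }
}
```

The key difference from the failing code path is that an absent value is handled with a default instead of an unconditional `get()`.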