monarch-initiative / owlsim-v3

Ontology Based Profile Matching
17 stars 5 forks source link

Leakage of owlsim-server guice configuration into owlsim-core #62

Closed julesjacobsen closed 7 years ago

julesjacobsen commented 7 years ago

I can't quite see the requirement for Guice to be used inside of owlsim-core. By following best practices hopefully anything Guice specific can be removed from the core. I think this will be pretty simple, with one possible major headache... The OWL API has a bunch of classes which are called in the OwlSimServiceApplication:

Injector i = Guice.createInjector(new OWLAPIImplModule(concurrency),
                new OWLAPIParsersModule(),
                new OWLAPIServiceLoaderModule(),
                new KnowledgeBaseModule(configuration.getOntologyUris(), configuration.getOntologyDataUris(), configuration.getDataTsvs(), configuration.getCuries()),
                new EnrichmentMapModule(), 
                new MatcherMapModule());

These OWLAPI* classes are all Guice Modules, but it's not apparent where these are used in the server or core or for what reason. If they are critical then I guess I'll just have to copy these into a Spring configuration and pray they limited the amount of Guice specific stuff to just these classes.

Apart from the OWL API stuff, there are four Guice configuration classes (those extending AbstractModule) classes in the core:

EnrichmentMapModule
MatcherModule (unused, duplicated by MatcherMapModule)
MatcherMapModule
KnowledgeBaseModule

Of these EnrichmentMapModule,MatcherMapModule and MatcherModule (unused, duplicated by MatcherMapModule) are actually only used by the owlsim-services in order to find out which classes are available from the package using reflection and the creating a map of short name to actual instance which is taken from the Guice Injector by calling injector.getInstance(clazz).

The ProfileMatcher and EnrichmentEngine implementation constructors are annotated with @Inject which is as it should be and is part of the JSR-330 spec so can be used by any decent standards compliant Java DI framework. All concrete instances of these interfaces require a concrete instance of BMKnowledgeBase to be passed in by constructor injection.

This is where it gets a bit more complicated. KnowledgeBaseModule is used to wire-up an instance of BMKnowledgeBaseOWLAPIImpl, it's more or less a copy of OWLLoader which also takes a bunch of input terms to produce a BMKnowledgeBase when createKnowledgeBaseInterface() is called. If the OWLLoader provided a Builder it would make constructing a BMKnowledgeBase a lot simpler as it would guide people along - this could the be used by any application to provide a BMKnowledgeBase and wrapped into a DI framework more easily.

So to summarise it looks like EnrichmentMapModule and MatcherMapModule should really move to the owlsim-services sub-project module and MatcherMapModule should be removed. The KnowledgeBaseModule should utilise the OWLLoader to create a singleton instance of BMKnowledgeBase and use that for injection into Guice.

I'm happy to do a PR for this.

cmungall commented 7 years ago

Before you PR consider if #63 will make this simpler.

As for the OWLAPI, in all other applications I use the OWLAPI v4 successfully without DI. However, in owlsim my recollection is that the overall use of DI leaked and we ended up using it.

I think I'm good with moving DI out of the core. I'll chat with @jnguyenx about this today. I would also like to refactor OWLLoader (which is in core), see #57. I'm not sure if this is related. Given that OWLLoader essentially handles configuration (loading a collection of files given a YAML spec) I would have thought this a candidate for DI-ification, but happy to keep DI out here.

jnguyenx commented 7 years ago

I went ahead and extracted the modules from the core package, so you can use any DI framework in your service layer.

@julesjacobsen can you try to pull from the branch and use your spring stack? If it works for you then we can merge. I still lack of some sleep hours, so I want to be careful :-)

julesjacobsen commented 7 years ago

Yep, that works better. Although there are still some issues I've not got to the bottom of yet. I think these are more likely data-related.

Using OWLLoader and the test data from your library, the application starts-up fine

    @Bean
    public BMKnowledgeBase bmKnowledgeBase() throws Exception {
        OWLLoader owlLoader = new OWLLoader();
        owlLoader.load("src/main/resources/ontologies/hp.obo");
        owlLoader.loadDataFromTsvGzip("src/main/resources/data/human-pheno.assocs.gz");
        return owlLoader.createKnowledgeBaseInterface();
    }

but when I try to run a query which calls this code:

    ProfileMatcher phenodigmProfileMatcher = PhenodigmICProfileMatcher.create(bmKnowledgeBase);
    ProfileQuery query = ProfileQueryFactory.createQuery(ids);
    logger.info("Querying with Class ids: {}", query.getQueryClassIds());
    return phenodigmProfileMatcher.findMatchProfile(query);

it throws a NullPointerException:

2017-03-06 17:21:32.979  INFO 8744 --- [nio-8080-exec-1] o.m.controllers.MatchController          : Querying with Class ids: [HP:0001156, HP:0001363, HP:0011304, HP:0010055]
2017-03-06 17:21:32.990 ERROR 8744 --- [nio-8080-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause

java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:210) ~[guava-18.0.jar:na]
    at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getIndexForClassNode(BMKnowledgeBaseOWLAPIImpl.java:740) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getIndex(BMKnowledgeBaseOWLAPIImpl.java:625) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getClassIndex(BMKnowledgeBaseOWLAPIImpl.java:634) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.getProfileSetBM(AbstractProfileMatcher.java:94) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.compute.matcher.impl.PhenodigmICProfileMatcher.findMatchProfileImpl(PhenodigmICProfileMatcher.java:66) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.findMatchProfileAll(AbstractProfileMatcher.java:223) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.findMatchProfile(AbstractProfileMatcher.java:194) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
    at org.monarchinitiative.controllers.MatchController.matchPhenodigm(MatchController.java:43) ~[classes/:na]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_101]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_101]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_101]
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) ~[spring-web-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) ~[spring-web-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:116) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861) ~[spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]

So I swapped the test data for the full-on all.owl:

    @Bean
    public BMKnowledgeBase bmKnowledgeBase() throws Exception {
        OWLLoader owlLoader = new OWLLoader();
        owlLoader.loadOWL("src/main/resources/data/all.owl");
        return owlLoader.createKnowledgeBaseInterface();
    }

    @Bean
    public ProfileMatcher phenodigmProfileMatcher(BMKnowledgeBase bmKnowledgeBase) {
        return PhenodigmICProfileMatcher.create(bmKnowledgeBase);
    }

This gave couple of warnings and some errors:

2017-03-06 17:11:05.103  WARN 14356 --- [           main] org.semanticweb.owlapi.util.SAXParsers   : http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit not supported by parser type org.apache.xerces.jaxp.SAXParserImpl
2017-03-06 17:11:05.104  WARN 14356 --- [           main] org.semanticweb.owlapi.util.SAXParsers   : entityExpansionLimit not supported by parser type org.apache.xerces.jaxp.SAXParserImpl
2017-03-06 17:11:54.631 ERROR 14356 --- [           main] u.a.m.c.o.owlapi.OWLOntologyManagerImpl  : Illegal redeclarations of entities: reuse of entity http://purl.obolibrary.org/obo/BFO_0000063 in punning not allowed [Declaration(ObjectProperty(<http://purl.obolibrary.org/obo/BFO_0000063>)), Declaration(AnnotationProperty(<http://purl.obolibrary.org/obo/BFO_0000063>))]
2017-03-06 17:11:54.631 ERROR 14356 --- [           main] u.a.m.c.o.owlapi.OWLOntologyManagerImpl  : Illegal redeclarations of entities: reuse of entity http://purl.obolibrary.org/obo/RO_0002222 in punning not allowed [Declaration(AnnotationProperty(<http://purl.obolibrary.org/obo/RO_0002222>)), Declaration(ObjectProperty(<http://purl.obolibrary.org/obo/RO_0002222>))]

Then it all falls over as it can't create the BMKnowledgeBase. So I'm not sure what I'm doing wrong. Missing out the Curies? I think #57 would help in this regard if it held people's hand in getting the right information form the right places.

jnguyenx commented 7 years ago

@julesjacobsen I'm not 100% sure how this factory method should be used to be honest. Here's how I've bound the BMKonwledgeBase:

@Provides
BMKnowledgeBaseOWLAPIImpl provideBMKnowledgeBaseOWLAPIImpl(@IndicatesOwlOntologies OWLOntology owlOntology,
    @IndicatesOwlDataOntologies OWLOntology owlDataOntology, OWLReasonerFactory rf, CurieUtil curieUtil) {
  BMKnowledgeBaseOWLAPIImpl bMKnowledgeBaseOWLAPIImpl = new BMKnowledgeBaseOWLAPIImpl(owlOntology,
      owlDataOntology, rf, curieUtil);
  return bMKnowledgeBaseOWLAPIImpl;
}

And to get the ontology objects:

OWLOntology mergeOntologies(OWLOntologyManager manager, Collection<String> uris)
    throws OWLOntologyCreationException, FileNotFoundException, IOException {
  OWLOntology ontology = manager.createOntology();
  for (String uri : uris) {
    OWLOntology loadedOntology;
    if (uri.endsWith(".gz")) {
      GZIPInputStream gis = new GZIPInputStream(new FileInputStream(uri));
      BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
      loadedOntology = manager.loadOntologyFromOntologyDocument(gis);
    } else {
      loadedOntology = loadOntology(manager, uri);
    }
    manager.addAxioms(ontology, loadedOntology.getAxioms());
  }
  return ontology;
}

@Provides
@IndicatesOwlOntologies
@Singleton
OWLOntology getOwlOntologies(OWLOntologyManager manager)
    throws OWLOntologyCreationException, FileNotFoundException, IOException {
  return mergeOntologies(manager, ontologyUris);
}

@Provides
@IndicatesOwlDataOntologies
@Singleton
OWLOntology getOwlDataOntologies(OWLOntologyManager manager)
    throws OWLOntologyCreationException, FileNotFoundException, IOException {
  return mergeOntologies(manager, ontologyDataUris);
}

@Provides
@IndicatesDataTsvs
@Singleton
OWLOntology getDataTsvs(OWLOntologyManager manager)
    throws OWLOntologyCreationException, FileNotFoundException, IOException {
  return mergeOntologies(manager, dataTsvs);
}

Hope it makes sense. For more insight @cmungall should be able to shed some lights.

julesjacobsen commented 7 years ago

I found the issue - it was to do with a lack of curies. Unfortunately the OWLLoader had no means of loading these in which was why it was failing.

I've implemented a new OwlKnowlegeBase class which is a bit of a mash-up of the OWLLoader and KnowledgeBaseModule which allows for fluent loading of a BMKnowledgeBase. For example

return OwlKnowledgeBase.loader()
                .loadCuries(curies())
                .loadOntology("src/main/resources/ontologies/hp.obo")
                .loadData("src/main/resources/data/human-pheno.assocs.gz")
                .createKnowledgeBase();

Where curies() defines a map {'HP' : 'http://purl.obolibrary.org/obo/HP_'} This partly addresses issue #57. It's not fully ready yet, but it works - the issue being that things need to be called in the correct order, which is clearly a bad thing.

jnguyenx commented 7 years ago

I agree that it's still kinda messy, the OWLLoader has a lot of duplicated function with the code I showed you in the previous comment.

So is it good for merging @julesjacobsen ? We can review the loader in a separate issue.

julesjacobsen commented 7 years ago

@jnguyenx from my perspective, yes, this is good to merge. I've got a fully working Spring-powered REST wrapper running the core code. I'll migrate my previous comment to where it really ought to reside...