emetsger opened this issue 6 years ago
There seems to be more to it than this one problem. I attempted to update to both 2.2.4 and 2.4.2. For 2.2.4, I circumvented the issue described above by adding a SPARQL query to check for a context (see my branch here). The tests in that version all pass, but when deployed, the first DiSCO posted through the API after deploying a local instance often is not created properly in the triplestore, though it will sometimes get indexed. While navigating, you will run into a mixture of errors at different points.
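The branch describes the context-existence check only in outline; a check of that sort might look something like the following SPARQL `ASK` query (a sketch, not taken from the branch; the graph URI is a placeholder):

```sparql
# Ask whether any statements exist in the given context (named graph).
# <http://example.org/context> is a placeholder for the DiSCO's graph URI.
ASK { GRAPH <http://example.org/context> { ?s ?p ?o } }
```

An `ASK` over the graph avoids depending on the statement count reported by the connection, which is where the behavior changed between rdf4j versions.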
Most common:
```
java.lang.NullPointerException
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.ORMapQueriesLineage.getLineageMembersWithDates(ORMapQueriesLineage.java:139) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.ORMapQueriesLineage.getLineageMembers(ORMapQueriesLineage.java:158) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.ORMapService.getDiSCODVersionsAndDerivatives(ORMapService.java:543) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.webapp.service.DataDisplayServiceImpl.getDiSCODTO(DataDisplayServiceImpl.java:173) ~[classes/:?]
[INFO] [talledLocalContainer] at info.rmapproject.webapp.controllers.DiSCODataController.discoTableData(DiSCODataController.java:202) ~[classes/:?]
[INFO] [talledLocalContainer] ... 42 more
```
Further inspection finds `progenitor` is null.
Here is a rarer one:
```
[INFO] [talledLocalContainer] Caused by: org.apache.http.impl.conn.ConnectionShutdownException
[INFO] [talledLocalContainer] at org.apache.http.impl.conn.CPoolProxy.getValidConnection(CPoolProxy.java:79) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.conn.CPoolProxy.getSSLSession(CPoolProxy.java:147) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.client.DefaultUserTokenHandler.getUserToken(DefaultUserTokenHandler.java:81) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:327) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55) ~[httpclient-4.5.2.jar:4.5.2]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.execute(SPARQLProtocolSession.java:1113) ~[rdf4j-http-client-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.executeOK(SPARQLProtocolSession.java:1066) ~[rdf4j-http-client-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.sendGraphQueryViaHttp(SPARQLProtocolSession.java:972) ~[rdf4j-http-client-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.getRDF(SPARQLProtocolSession.java:938) ~[rdf4j-http-client-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.http.client.RDF4JProtocolSession.getStatements(RDF4JProtocolSession.java:492) ~[rdf4j-http-client-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.repository.http.HTTPRepositoryConnection.exportStatements(HTTPRepositoryConnection.java:284) ~[rdf4j-repository-http-2.2.2.jar:?]
[INFO] [talledLocalContainer] at org.eclipse.rdf4j.repository.http.HTTPRepositoryConnection.getStatements(HTTPRepositoryConnection.java:269) ~[rdf4j-repository-http-2.2.2.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.triplestore.Rdf4jTriplestore.getStatements(Rdf4jTriplestore.java:231) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.triplestore.Rdf4jTriplestore.getStatements(Rdf4jTriplestore.java:255) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.ORMapResourceMgr.getResourceRdfTypes(ORMapResourceMgr.java:443) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.core.rmapservice.impl.rdf4j.ORMapService.getResourceRdfTypesInContext(ORMapService.java:279) ~[rmap-core-2.0.1-beta-SNAPSHOT.jar:?]
[INFO] [talledLocalContainer] at info.rmapproject.webapp.service.DataDisplayServiceImpl.getDiSCOGraph(DataDisplayServiceImpl.java:202) ~[classes/:?]
[INFO] [talledLocalContainer] at info.rmapproject.webapp.controllers.DiSCODataController.discoGraphData(DiSCODataController.java:164) ~[classes/:?]
[INFO] [talledLocalContainer] ... 42 more
```
Using 2.4.2, the same problems occur, plus the webapp hangs without error after a few minutes of browsing, so that you can no longer navigate DiSCOs. Often the last activity before the hang is an attempt to retrieve the DiSCO versions. The way the UI is designed, it makes two similar requests in parallel, which may be causing a problem (sometimes you will see the two version requests fire off in tandem before it hangs).
The tandem firing of requests suggests the hang might be thread related, though it's not clear why the new rdf4j would cause, or perhaps merely expose, an existing issue. I note that the lineage methods are the only static methods in RMapService; I am not sure whether that could be related as well.
PR #255 also has a little more information about this issue.
A few thoughts related to this that I wanted to capture:

1. I think it might be helpful to look into a refactor of the rdf4j connection slice of RMap. The way it was done originally is less than ideal, with RMap adding an unnecessary layer to the connection object in places. Since RMap was created, rdf4j has changed quite a lot; if we take advantage of the changes to the connection object, some of these issues might go away.
2. On the back burner, when I have time between things, I've been working to refactor out unnecessary rdf4j-specific logic, in anticipation that we may want to move away from rdf4j as the sole option in the future. For example, there is no reason to use rdf4j in model objects or vocabularies.
3. GraphDB has had many more releases since the version we are running. Perhaps the connection issues that we are seeing in the live environment, but not the local environment, could be resolved by updating GraphDB.
Upgrading to RDF4J 2.2.4 (from 2.2.2) produces a number of test failures. At least one of the IT failures has to do with calculating the number of statements present in an RDF4J context.
An example failure:
The source of this issue is in `ORMapObjectMgr`, specifically here: with 2.2.4, `ts.getConnection().size(id)` returns `0` (where `id` is considered to be the context, i.e. the graph URI). With 2.2.2, this method behaves as expected, returning some positive integer for the number of statements present in the triplestore for the graph. When `ts.getConnection().size(id)` returns `0`, the method `isRMapType(Rdf4jTriplestore ts, IRI id, IRI typeIRI)` returns `false`, ultimately resulting in the `info.rmapproject.core.exception.RMapDiSCONotFoundException: No DiSCO with id rmap:rmd18m7mr7` exception.
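One possible workaround (a sketch only, not tested against the failing setup) would be to avoid relying on `size(id)` altogether and instead count the statements in the context directly with SPARQL; `<http://example.org/context>` below is a placeholder for the graph URI:

```sparql
# Count statements in the named graph directly, rather than trusting the
# statement count reported by the connection for that context.
SELECT (COUNT(*) AS ?count)
WHERE { GRAPH <http://example.org/context> { ?s ?p ?o } }
```

If the check only needs to know whether the graph is non-empty, an `ASK` over the same graph pattern would be cheaper than counting.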