chenejac / VIVOTestMigrationJIRAClosed

VIVO-1016: A Cornell instance runs out of memory on startup. #496

Closed chenejac closed 6 years ago

chenejac commented 9 years ago

tlw72 (Migrated from VIVO-1016) said:

Jim,

Here's the trace from the log for that three-tier, VIVO Cornell issue.

Tim

2015-04-08 12:02:38,183 INFO [RDFFilesLoader] Loading rdf/auth/everytime/permission_config.n3
2015-04-08 12:03:45,669 ERROR [StartupManager] edu.cornell.mannlib.vitro.webapp.servlet.setup.ContentModelSetup@63302640 Threw unexpected error
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
    at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
    at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
    at java.io.BufferedWriter.write(BufferedWriter.java:143)
    at org.apache.jena.atlas.io.IndentedWriter.write$(IndentedWriter.java:165)
    at org.apache.jena.atlas.io.IndentedWriter.printOneChar(IndentedWriter.java:160)
    at org.apache.jena.atlas.io.IndentedWriter.print(IndentedWriter.java:100)
    at com.hp.hpl.jena.sparql.resultset.JSONOutputResultSet.printResource(JSONOutputResultSet.java:241)
    at com.hp.hpl.jena.sparql.resultset.JSONOutputResultSet.binding(JSONOutputResultSet.java:170)
    at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:49)
    at com.hp.hpl.jena.sparql.resultset.JSONOutput.format(JSONOutput.java:36)
    at com.hp.hpl.jena.query.ResultSetFormatter.outputAsJSON(ResultSetFormatter.java:597)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.impl.jena.RDFServiceJena.sparqlSelectQuery(RDFServiceJena.java:461)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.impl.logging.LoggingRDFService.sparqlSelectQuery(LoggingRDFService.java:59)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.execSelect(RDFServiceGraph.java:400)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.find(RDFServiceGraph.java:257)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.find(RDFServiceGraph.java:189)
    at com.hp.hpl.jena.graph.compose.MultiUnion.singleGraphFind(MultiUnion.java:151)
    at com.hp.hpl.jena.graph.compose.MultiUnion.graphBaseFind(MultiUnion.java:142)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:268)
    at com.hp.hpl.jena.graph.GraphUtil.findAll(GraphUtil.java:128)
    at com.hp.hpl.jena.graph.impl.GraphBase.graphBaseSize(GraphBase.java:354)
    at com.hp.hpl.jena.graph.impl.GraphBase.size(GraphBase.java:344)
    at com.hp.hpl.jena.rdf.model.impl.ModelCom.size(ModelCom.java:918)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.adapters.AbstractOntModelDecorator.size(AbstractOntModelDecorator.java:662)
    at edu.cornell.mannlib.vitro.webapp.servlet.setup.ContentModelSetup.setUpJenaDataSource(ContentModelSetup.java:62)

chenejac commented 9 years ago

Jim Blake said:

The problem arises from asking whether the ABox is empty. SDB translates this to a call to size(), which in turn calls find(null, null, null) and counts the results. This means reading the entire model into memory.

Take three approaches on this:

1) Don't ask whether the ABox is empty if you can avoid it.
2) If you can't avoid it, ask the graph, not the model, so you are going straight to RDFServiceGraph.
3) In RDFServiceGraph, handle a call to isEmpty() by a call to contains(null, null, null). Instead of using ASK WHERE {?s ?p ?o}, which again becomes find(null, null, null), use SELECT * WHERE {?s ?p ?o} LIMIT 1, and test for a result.
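To make the difference concrete, here is a minimal, self-contained sketch of the two ways to answer "is it empty?". It does not use the Jena or VIVO APIs: `CountingStore`, `findAll()`, `findWithLimit()` and the two `isEmptyBy...` methods are hypothetical names invented for illustration, standing in for a triple store that hands back rows one at a time. The counting variant mimics size() driving find(null, null, null); the other mimics a SELECT ... LIMIT 1 that only tests for a first result.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class EmptinessCheckSketch {

    /** Hypothetical stand-in for a triple store; NOT a Jena or VIVO class. */
    static final class CountingStore {
        private final long tripleCount;
        private long rowsFetched = 0;

        CountingStore(long tripleCount) {
            this.tripleCount = tripleCount;
        }

        /** Mimics find(null, null, null): one row per stored triple. */
        Iterator<String> findAll() {
            return rows(tripleCount);
        }

        /** Mimics SELECT * WHERE {?s ?p ?o} LIMIT n: at most n rows. */
        Iterator<String> findWithLimit(long limit) {
            return rows(Math.min(limit, tripleCount));
        }

        /** How many rows callers have pulled from this store so far. */
        long rowsFetched() {
            return rowsFetched;
        }

        private Iterator<String> rows(final long max) {
            return new Iterator<String>() {
                private long position = 0;

                @Override
                public boolean hasNext() {
                    return position < max;
                }

                @Override
                public String next() {
                    if (position >= max) {
                        throw new NoSuchElementException();
                    }
                    rowsFetched++; // record the cost of producing this row
                    return "triple-" + (position++);
                }
            };
        }
    }

    /** size()-style check: pulls every row just to compare the count with zero. */
    static boolean isEmptyByCounting(CountingStore store) {
        long count = 0;
        Iterator<String> it = store.findAll();
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count == 0;
    }

    /** LIMIT 1-style check: pulls at most one row and tests for a result. */
    static boolean isEmptyByLimitOne(CountingStore store) {
        Iterator<String> it = store.findWithLimit(1);
        if (it.hasNext()) {
            it.next();
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        CountingStore a = new CountingStore(1_000_000);
        System.out.println("counting: empty=" + isEmptyByCounting(a)
                + ", rows fetched=" + a.rowsFetched());

        CountingStore b = new CountingStore(1_000_000);
        System.out.println("limit 1:  empty=" + isEmptyByLimitOne(b)
                + ", rows fetched=" + b.rowsFetched());
    }
}
```

Both checks give the same answer, but the cost of the counting version grows with the size of the store. In the trace above the rows are also serialized into an in-memory buffer before being counted, which is where the heap runs out.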