vivo-project / VIVO

VIVO is an extensible semantic web application for research discovery and showcasing scholarly work
http://vivoweb.org
BSD 3-Clause "New" or "Revised" License

VIVO-1016: A Cornell instance runs out of memory on startup. #2700

Closed chenejac closed 6 years ago

chenejac commented 9 years ago

tlw72 (Migrated from VIVO-1016) said:

Jim,

Here's the trace from the log for that three-tier, VIVO Cornell issue.

Tim

2015-04-08 12:02:38,183 INFO [RDFFilesLoader] Loading rdf/auth/everytime/permission_config.n3
2015-04-08 12:03:45,669 ERROR [StartupManager] edu.cornell.mannlib.vitro.webapp.servlet.setup.ContentModelSetup@63302640 Threw unexpected error
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
    at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
    at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
    at java.io.BufferedWriter.write(BufferedWriter.java:143)
    at org.apache.jena.atlas.io.IndentedWriter.write$(IndentedWriter.java:165)
    at org.apache.jena.atlas.io.IndentedWriter.printOneChar(IndentedWriter.java:160)
    at org.apache.jena.atlas.io.IndentedWriter.print(IndentedWriter.java:100)
    at com.hp.hpl.jena.sparql.resultset.JSONOutputResultSet.printResource(JSONOutputResultSet.java:241)
    at com.hp.hpl.jena.sparql.resultset.JSONOutputResultSet.binding(JSONOutputResultSet.java:170)
    at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:49)
    at com.hp.hpl.jena.sparql.resultset.JSONOutput.format(JSONOutput.java:36)
    at com.hp.hpl.jena.query.ResultSetFormatter.outputAsJSON(ResultSetFormatter.java:597)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.impl.jena.RDFServiceJena.sparqlSelectQuery(RDFServiceJena.java:461)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.impl.logging.LoggingRDFService.sparqlSelectQuery(LoggingRDFService.java:59)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.execSelect(RDFServiceGraph.java:400)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.find(RDFServiceGraph.java:257)
    at edu.cornell.mannlib.vitro.webapp.dao.jena.RDFServiceGraph.find(RDFServiceGraph.java:189)
    at com.hp.hpl.jena.graph.compose.MultiUnion.singleGraphFind(MultiUnion.java:151)
    at com.hp.hpl.jena.graph.compose.MultiUnion.graphBaseFind(MultiUnion.java:142)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:268)
    at com.hp.hpl.jena.graph.GraphUtil.findAll(GraphUtil.java:128)
    at com.hp.hpl.jena.graph.impl.GraphBase.graphBaseSize(GraphBase.java:354)
    at com.hp.hpl.jena.graph.impl.GraphBase.size(GraphBase.java:344)
    at com.hp.hpl.jena.rdf.model.impl.ModelCom.size(ModelCom.java:918)
    at edu.cornell.mannlib.vitro.webapp.rdfservice.adapters.AbstractOntModelDecorator.size(AbstractOntModelDecorator.java:662)
    at edu.cornell.mannlib.vitro.webapp.servlet.setup.ContentModelSetup.setUpJenaDataSource(ContentModelSetup.java:62)

chenejac commented 9 years ago

Jim Blake said:

The problem arises from asking whether the ABox is empty. SDB translates this to a call to size(), which then becomes a call to find(null, null, null) and counting the results. This means reading the entire model into memory.
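
To make the cost difference concrete, here is a minimal, self-contained Java sketch. It does not use Jena: `EmptinessCheck`, `CountingIterator`, `sizeByCounting`, `isEmptyByPeek`, and `fakeFind` are hypothetical names, and a plain `Iterator` stands in for the result of find(null, null, null). It only shows how many results each strategy has to pull. In the trace above the whole result is also serialized into a ByteArrayOutputStream before anything is counted, so in the real case the cost shows up as heap usage rather than just time.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class EmptinessCheck {

    /** Wraps an iterator and records how many elements have been pulled from it. */
    static final class CountingIterator<T> implements Iterator<T> {
        private final Iterator<T> inner;
        private long pulled = 0;

        CountingIterator(Iterator<T> inner) {
            this.inner = inner;
        }

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public T next() {
            pulled++;
            return inner.next();
        }

        long pulled() {
            return pulled;
        }
    }

    /** size()-style check: walks every result just to count them. */
    static long sizeByCounting(Iterator<?> triples) {
        long n = 0;
        while (triples.hasNext()) {
            triples.next();
            n++;
        }
        return n;
    }

    /** Emptiness check that stops as soon as the answer is known. */
    static boolean isEmptyByPeek(Iterator<?> triples) {
        return !triples.hasNext();
    }

    /** Hypothetical stand-in for a lazily produced find(null, null, null) result. */
    static CountingIterator<Integer> fakeFind(int tripleCount) {
        return new CountingIterator<>(IntStream.range(0, tripleCount).iterator());
    }

    public static void main(String[] args) {
        CountingIterator<Integer> forSize = fakeFind(1_000_000);
        boolean emptyViaSize = sizeByCounting(forSize) == 0;
        System.out.println("via size(): empty=" + emptyViaSize + ", pulled=" + forSize.pulled());

        CountingIterator<Integer> forPeek = fakeFind(1_000_000);
        boolean emptyViaPeek = isEmptyByPeek(forPeek);
        System.out.println("via peek:   empty=" + emptyViaPeek + ", pulled=" + forPeek.pulled());
    }
}
```

Both strategies give the same answer, but the size()-based one touches every triple in the model to get it.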

Take three approaches on this:
1) Don't ask whether the ABox is empty if you can avoid it.
2) If you can't avoid it, ask the graph, not the model, so you are going straight to RDFServiceGraph.
3) In RDFServiceGraph, handle a call to isEmpty() by a call to contains(null, null, null). Instead of using ASK WHERE {?s ?p ?o}, which again becomes find(null, null, null), use SELECT * WHERE {?s ?p ?o} LIMIT 1, and test for a result.
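
The third approach could look roughly like the sketch below. This is not the actual RDFServiceGraph code: `ContainsViaLimit`, `term`, `containsQuery`, and the `runSelect` function are hypothetical, and `runSelect` stands in for whatever executes a SPARQL SELECT and returns its rows. Non-null arguments are assumed to be already written in SPARQL syntax (for example `<http://example.org/x>`).

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ContainsViaLimit {

    /** Renders one position of a triple pattern: a variable when null, else the given term. */
    static String term(String value, String variable) {
        return value == null ? "?" + variable : value;
    }

    /** Builds a SELECT that returns at most one row matching the pattern. */
    static String containsQuery(String s, String p, String o) {
        return "SELECT * WHERE { "
                + term(s, "s") + " " + term(p, "p") + " " + term(o, "o")
                + " } LIMIT 1";
    }

    /** True if at least one triple matches; the store never has to return more than one row. */
    static boolean contains(Function<String, List<Map<String, String>>> runSelect,
                            String s, String p, String o) {
        return !runSelect.apply(containsQuery(s, p, o)).isEmpty();
    }

    /** isEmpty() expressed as "contains no triple at all". */
    static boolean isEmpty(Function<String, List<Map<String, String>>> runSelect) {
        return !contains(runSelect, null, null, null);
    }

    public static void main(String[] args) {
        // A fake executor that prints the query and pretends the store has no triples.
        Function<String, List<Map<String, String>>> emptyStore = query -> {
            System.out.println("executing: " + query);
            return Collections.emptyList();
        };
        System.out.println("isEmpty = " + isEmpty(emptyStore));
    }
}
```

Because of the LIMIT 1, the result that has to be serialized and held in memory is at most a single row, regardless of how large the ABox is.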