I'm generating a large RDF file with OpenRefine and the RDF extension, and I'm getting an OutOfMemoryError. Looking at the full stacktrace (below), it seems to me that RdfExporter.buildModel is loading the whole graph in memory. I'm not familiar with OpenRDF, so I'm asking: is it possible to change the exporter to work in a streaming fashion? We don't really need to process the data twice, once to build the model and once to generate the triples, do we?
java.lang.OutOfMemoryError: Java heap space
at org.openrdf.sail.memory.model.MemStatementList.growArray(MemStatementList.java:143)
at org.openrdf.sail.memory.model.MemStatementList.add(MemStatementList.java:67)
at org.openrdf.sail.memory.MemoryStore.addStatement(MemoryStore.java:595)
at org.openrdf.sail.memory.MemoryStoreConnection.addStatementInternal(MemoryStoreConnection.java:418)
at org.openrdf.sail.memory.MemoryStoreConnection.addStatementInternal(MemoryStoreConnection.java:379)
at org.openrdf.sail.helpers.SailConnectionBase.addStatement(SailConnectionBase.java:331)
at org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:236)
at org.openrdf.repository.base.RepositoryConnectionBase.addWithoutCommit(RepositoryConnectionBase.java:591)
at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:486)
at org.deri.grefine.rdf.ResourceNode.addLinks(ResourceNode.java:100)
at org.deri.grefine.rdf.ResourceNode.createNode(ResourceNode.java:119)
at org.deri.grefine.rdf.exporters.RdfExporter$1.visit(RdfExporter.java:110)
at com.google.refine.browsing.util.ConjunctiveFilteredRows.visitRow(ConjunctiveFilteredRows.java:76)
at com.google.refine.browsing.util.ConjunctiveFilteredRows.accept(ConjunctiveFilteredRows.java:65)
at org.deri.grefine.rdf.exporters.RdfExporter.buildModel(RdfExporter.java:123)
at org.deri.grefine.rdf.exporters.RdfExporter.buildModel(RdfExporter.java:115)
at org.deri.grefine.rdf.exporters.RdfExporter.export(RdfExporter.java:85)
at com.google.refine.commands.project.ExportRowsCommand.doPost(ExportRowsCommand.java:101)
at com.google.refine.RefineServlet.service(RefineServlet.java:177)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:155)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
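To make concrete what I mean by "streaming": instead of adding every statement to the MemoryStore (where the trace shows the heap filling up) and serializing it afterwards, each row visit could write its triples straight to the output as it goes. This is just a minimal, stdlib-only sketch of that pattern, not the extension's actual API; the subject/predicate URIs and the row list are hypothetical stand-ins for OpenRefine's row iteration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

public class StreamingTriples {

    // Emit one triple immediately as an N-Triples line; nothing is retained
    // in memory, so heap use stays constant regardless of project size.
    static void writeTriple(Writer out, String subj, String pred, String objLiteral)
            throws IOException {
        out.write("<" + subj + "> <" + pred + "> \"" + objLiteral + "\" .\n");
    }

    public static void main(String[] args) throws IOException {
        Writer out = new StringWriter();
        // Hypothetical stand-in for the row visitor (RdfExporter$1.visit
        // in the stacktrace): each row is turned into triples on the spot.
        List<String> rows = List.of("alice", "bob");
        for (String row : rows) {
            writeTriple(out,
                    "http://example.org/" + row,      // hypothetical subject URI
                    "http://xmlns.com/foaf/0.1/name", // example predicate
                    row);
        }
        System.out.print(out);
    }
}
```

In the real extension this would presumably mean handing each statement to an OpenRDF Rio RDFWriter as rows are visited, rather than collecting them in a Repository first.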