geneontology / obographs

Basic and Advanced OBO Graphs: specification and reference implementation

serializing to JSON leads to OutOfMemory exception #43

saramsey closed this issue 1 year ago

saramsey commented 5 years ago

Hi, I have a 5.5 GB OWL file "umls.owl". When I run

robot convert --input umls.owl --output umls.json

on a machine with 392 GB of RAM (and with '-Xmx390G' passed to java), I get the following exception:

Exception in thread "main" java.lang.OutOfMemoryError
        at java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188)
        at java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180)
        at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147)
        at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660)
        at java.base/java.lang.StringBuilder.append(StringBuilder.java:195)
        at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:399)
        at com.fasterxml.jackson.core.io.SegmentedStringWriter.getAndClear(SegmentedStringWriter.java:83)
        at com.fasterxml.jackson.databind.ObjectWriter.writeValueAsString(ObjectWriter.java:1037)
        at org.geneontology.obographs.io.OgJsonGenerator.prettyJsonString(OgJsonGenerator.java:18)
        at org.geneontology.obographs.io.OgJsonGenerator.render(OgJsonGenerator.java:11)
        at org.obolibrary.robot.IOHelper.saveOntologyFile(IOHelper.java:1122)
        at org.obolibrary.robot.IOHelper.saveOntology(IOHelper.java:583)
        at org.obolibrary.robot.IOHelper.saveOntology(IOHelper.java:524)
        at org.obolibrary.robot.ConvertCommand.execute(ConvertCommand.java:167)
        at org.obolibrary.robot.CommandManager.executeCommand(CommandManager.java:248)
        at org.obolibrary.robot.CommandManager.execute(CommandManager.java:192)
        at org.obolibrary.robot.CommandManager.main(CommandManager.java:139)
        at org.obolibrary.robot.CommandLineInterface.main(CommandLineInterface.java:55)

I wonder if the problem is that OgJsonGenerator.prettyJsonString calls mapper.writerWithDefaultPrettyPrinter and then writeValueAsString, which tries to build the whole document as a single String that is bigger than Java can handle (a maximum length of 2147483647 characters, I believe). Is there a way around this so that robot can write larger JSON files using the obographs package?
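To make the suspected limit concrete, here is a small sketch of the arithmetic. It assumes the serialized JSON would be roughly as large as the 5.5 GB input, which was not measured in this thread; the class and method names are hypothetical. Because String.length() returns an int, no heap setting can make a single String hold that many characters.

```java
class StringLimitSketch {

    // A String's length() is an int, so its length can never exceed Integer.MAX_VALUE.
    static boolean fitsInOneString(long charCount) {
        return charCount <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Assumption: about 5.5 billion characters of output (not measured in this thread).
        long assumedChars = 5_500_000_000L;
        System.out.println("String length limit: " + Integer.MAX_VALUE);
        System.out.println("Fits in one String: " + fitsInOneString(assumedChars));
    }
}
```

This would explain why raising -Xmx does not help: the failure comes from the length limit of one String, not from running out of heap.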

saramsey commented 5 years ago

Would using ObjectWriter.writeValue(output_stream, object_value) make sense?

https://stackoverflow.com/questions/34657372/java-lang-outofmemoryerror-java-heap-space-with-json-conversion

Forgive me if this doesn't make sense; my Java is pretty rusty.
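The idea of writing to an output stream can be illustrated without any library. The sketch below is not the obographs or Jackson code; it uses a hypothetical hand-rolled writer (StreamingJsonSketch, writeNodes, and the "nodes"/"id" layout are made up for illustration) to show the pattern: each element is written to a Writer as soon as it is produced, so the full document never exists as one String. Jackson's ObjectWriter.writeValue(OutputStream, Object) follows the same principle.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;

class StreamingJsonSketch {

    // Minimal escaping for illustration only: handles backslash and double quote,
    // not control characters.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Writes {"nodes":[{"id":"..."},...]} piece by piece to the Writer.
    // Only one id is held in memory at a time.
    static void writeNodes(Iterator<String> ids, Writer out) {
        try {
            out.write("{\"nodes\":[");
            boolean first = true;
            while (ids.hasNext()) {
                if (!first) {
                    out.write(',');
                }
                first = false;
                out.write("{\"id\":\"");
                out.write(escape(ids.next()));
                out.write("\"}");
            }
            out.write("]}");
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stream to standard output; a file or network stream would work the same way.
        Writer out = new OutputStreamWriter(System.out, StandardCharsets.UTF_8);
        writeNodes(List.of("GO:0000001", "GO:0000002").iterator(), out);
    }
}
```

With this shape, output size is bounded by the destination (disk, socket) rather than by the maximum length of a String.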

julesjacobsen commented 1 year ago

I think this has been fixed with the addition of these methods:

https://github.com/geneontology/obographs/blob/f10fa4324650fdcd05f65a10a1993938557903a5/obographs-core/src/main/java/org/geneontology/obographs/core/io/OgJsonGenerator.java#L25-L38

Closing this now. Please re-open if it's not fixed.