chenejac closed this issue 6 years ago
Christopher Barnes said:
I got a notification of activity. Does the note mean that this fix will come in v 1.7? Thx - Chris
Christopher Barnes said:
Attached a screenshot of the settings in EXPORT RDF for which the export fails to complete.
Jim Blake said:
It means that it's going into the wish list for 1.7. From there, only time will tell...
Jim Blake said:
Able to reproduce this error on my laptop with Weill Cornell data.
==> catalina.out <==
Exception in thread "ajp-bio-4009-AsyncTimeout" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.ConcurrentLinkedQueue.iterator(ConcurrentLinkedQueue.java:667)
at org.apache.tomcat.util.net.JIoEndpoint$AsyncTimeout.run(JIoEndpoint.java:156)
at java.lang.Thread.run(Thread.java:744)
Exception in thread "http-bio-4080-exec-10" java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.mysql.jdbc.Buffer.
Jim Blake said:
From the preceding stack trace:
It's dying here: RDFServiceJena.java:310, and with good reason. It's trying to read the entire ABOX into memory.
Jim Blake said:
The plan: to avoid holding the entire data model in memory, use a SELECT query instead of a CONSTRUCT query, and reformat the results as a stream as they arrive. Offer exports only as N-Triples: each triple is a self-contained line there, whereas other formats group multiple triples together, and that defeats the streaming approach.
Offer the choice of named graphs, with clarifying labels for those graphs that we recognize: "Inferred ABOX", "Declared TBOX", etc. Can we recognize the ontologies by their graph names?
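A minimal sketch of the streaming idea in the plan above, not the actual VIVO code. It assumes the rows of a "SELECT ?s ?p ?o" query arrive through an iterator, and it writes each one out as an N-Triples line as soon as it is read, so nothing accumulates in memory. The `Row` class and the method names are hypothetical stand-ins; in real code the rows would come from the SPARQL result set. For brevity it handles only URIs and plain literals (no blank nodes, language tags, or datatypes) and escapes only the most common special characters.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

class NTriplesStreamSketch {

    /** One row of a hypothetical "SELECT ?s ?p ?o" result (illustrative type, not a Jena class). */
    static final class Row {
        final String subject;      // subject URI
        final String predicate;    // predicate URI
        final String object;       // object URI, or the lexical form of a plain literal
        final boolean objectIsUri; // true if 'object' is a URI rather than a literal

        Row(String subject, String predicate, String object, boolean objectIsUri) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsUri = objectIsUri;
        }
    }

    /** Escapes backslash, double quote, newline, carriage return and tab in a literal. */
    static String escapeLiteral(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Formats a single row as one complete N-Triples line. */
    static String toNTriplesLine(Row r) {
        String obj = r.objectIsUri
                ? "<" + r.object + ">"
                : "\"" + escapeLiteral(r.object) + "\"";
        return "<" + r.subject + "> <" + r.predicate + "> " + obj + " .\n";
    }

    /** Writes rows one at a time; only the current row is held in memory. Returns the triple count. */
    static long export(Iterator<Row> rows, Writer out) {
        long count = 0;
        try {
            while (rows.hasNext()) {
                out.write(toNTriplesLine(rows.next()));
                count++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Example data only; the URIs under example.org are made up.
        Iterator<Row> rows = Arrays.asList(
                new Row("http://example.org/n1",
                        "http://www.w3.org/2000/01/rdf-schema#label",
                        "Say \"hi\"", false),
                new Row("http://example.org/n1",
                        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                        "http://example.org/Person", true)
        ).iterator();

        StringWriter out = new StringWriter();
        long n = export(rows, out);
        System.out.print(out);
        System.err.println(n + " triples written");
    }
}
```

In a servlet, the `Writer` would presumably be the response writer, so the output goes to the client as each row is formatted. This works for N-Triples because every line is independent of the ones before it.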
Jim Blake said:
Questions from existing code: Why is the extension .owl used for RDF/XML from the TBox, but .rdf is used for RDF/XML otherwise?
Jim Blake (Migrated from VIVO-719) said:
Chris Barnes is trying to produce a large sample data set, starting by exporting the UF data. When he tries to export the RDF, it aborts.