OutOfMemoryError when attempting to sort EDN logs

The import process I was running on the AWS dev machine died with OutOfMemoryError.

Cause

The change made to sort files in Clojure rather than shelling out
Naive usage of slurp and clojure.string/split-lines

Interestingly, this only happens on the AWS machine using the OpenJDK. The same code running on an EBI server with Oracle Java does not error.

Traceback

Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

Suggested resolution

I modified the code to use the built-in clojure.core/line-seq function, which reads lines from a stream incrementally rather than loading the whole file at once as slurp + clojure.string/split-lines does. It still errors out, this time with: java.lang.OutOfMemoryError: Java heap space
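For reference, here is a minimal sketch of the two approaches side by side. The namespace and function names are hypothetical and are not taken from the pseudoace code; the real sort may also use a custom comparator. Note that sort is eager and has to hold every line in memory before it can return, so line-seq only avoids building one giant string. That may be why the failure moves from "Requested array size exceeds VM limit" to "Java heap space" rather than going away.

```clojure
(ns example.sort-lines
  ;; Hypothetical namespace, for illustration only.
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

;; Naive approach: slurp reads the entire file into a single String,
;; then split-lines builds a second, full copy as a vector of lines.
(defn sorted-lines-slurp
  [path]
  (sort (str/split-lines (slurp path))))

;; Incremental approach: line-seq reads lines lazily from the reader.
;; sort still realizes every line (it copies the seq into an array),
;; so all lines end up on the heap; only the one huge String is avoided.
(defn sorted-lines-line-seq
  [path]
  (with-open [rdr (io/reader path)]
    ;; sort is eager, so the result is fully realized before the
    ;; reader is closed by with-open.
    (sort (line-seq rdr))))
```

Both functions return the same sorted sequence of lines for a file that fits in memory; neither bounds memory use for a file larger than the heap.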

I suggest we keep the line-seq change described above and run the import on the Oracle JVM.

WormBase / pseudoace