Closed jlboat closed 4 years ago
Hi!
Thank you for the detailed feedback - it's nice to see the kinds of issues that come up when running Jasmine on such large datasets! I just pushed a change which converts the KD-tree building to be entirely non-recursive (maintaining a version of the "call stack" on the heap instead), so that should resolve the issues you were encountering there.
Thanks! Melanie
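For readers curious what "maintaining a version of the call stack on the heap" can look like, here is a minimal sketch. It is not Jasmine's actual code: the class `IterativeKDTree`, its `Node` and `Frame` types, the 2-D point layout, and the sort-based median split are all assumptions made for illustration. Each pending subtree build is stored as a `Frame` in an `ArrayDeque`, so build depth is limited by heap size rather than by the thread stack.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;

// Hypothetical illustration only; not taken from the Jasmine source.
final class IterativeKDTree {
    static final class Node {
        final double[] point;
        Node left, right;
        Node(double[] point) { this.point = point; }
    }

    // One pending unit of work: build the subtree over pts[lo, hi)
    // and attach it to `parent` as its left or right child.
    private static final class Frame {
        final int lo, hi, depth;
        final Node parent;
        final boolean isLeft;
        Frame(int lo, int hi, int depth, Node parent, boolean isLeft) {
            this.lo = lo; this.hi = hi; this.depth = depth;
            this.parent = parent; this.isLeft = isLeft;
        }
    }

    static Node build(double[][] points) {
        double[][] pts = points.clone();          // do not reorder the caller's array
        Node root = null;
        Deque<Frame> stack = new ArrayDeque<>();  // explicit "call stack" on the heap
        stack.push(new Frame(0, pts.length, 0, null, false));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.lo >= f.hi) continue;           // empty range: nothing to build
            final int axis = f.depth % 2;         // alternate between x and y
            Arrays.sort(pts, f.lo, f.hi,
                    Comparator.comparingDouble((double[] p) -> p[axis]));
            int mid = (f.lo + f.hi) >>> 1;        // median becomes this subtree's root
            Node node = new Node(pts[mid]);
            if (f.parent == null) root = node;
            else if (f.isLeft) f.parent.left = node;
            else f.parent.right = node;
            // Instead of two recursive calls, queue the two halves as new frames.
            stack.push(new Frame(f.lo, mid, f.depth + 1, node, true));
            stack.push(new Frame(mid + 1, f.hi, f.depth + 1, node, false));
        }
        return root;
    }

    // Counts nodes, also without recursion.
    static int size(Node root) {
        int n = 0;
        Deque<Node> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        while (!stack.isEmpty()) {
            Node cur = stack.pop();
            n++;
            if (cur.left != null) stack.push(cur.left);
            if (cur.right != null) stack.push(cur.right);
        }
        return n;
    }

    public static void main(String[] args) {
        double[][] pts = { {2, 3}, {5, 4}, {9, 6}, {4, 7}, {8, 1}, {7, 2}, {1, 9} };
        System.out.println("nodes built: " + size(build(pts)));
    }
}
```

Re-sorting each range is simple but not the fastest way to find a median; a selection algorithm would do less work. The point of the sketch is only the control flow: every place a recursive build would call itself, this version pushes a frame instead.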
Hello,
When merging 23 VCFs with about 10,000 CNVs each, I get:
Merging graph ID: Chr01DEL
Exception in thread "main" java.lang.StackOverflowError
    at KDTree.build(KDTree.java:54)
    at KDTree.build(KDTree.java:55)   # this one repeats a bunch
The recursive K-D tree build doesn't seem to work with 23 samples and default values if there are a lot of CNVs. Running fewer samples resolves the issue, and increasing the JVM thread stack size (-Xss1G) also resolves the overflow. But you'll probably want to either document this issue or recode the recursion to handle the data differently. Note: I also ran a test with nearly 300 samples at about 10k CNVs each (-Xmx60G -Xss1G threads=16), and that appeared to work properly, so the problem can be circumvented. The SUPP_VEC_EXT and SUPP_VEC values are crazy high, though.
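As a side note on the stack-size workaround: besides passing -Xss on the command line, Java code can request a larger stack for a single worker thread through the four-argument `Thread` constructor. The sketch below is a generic illustration and not something Jasmine is known to do; `BigStackDemo`, `depth`, and `runWithStack` are hypothetical names, and `depth` is just a stand-in for any deeply recursive routine. The Javadoc describes the stack size argument as a hint that some platforms may ignore.

```java
// Hypothetical illustration only; not taken from the Jasmine source.
final class BigStackDemo {
    // Deliberately deep, non-tail recursion standing in for a recursive tree build.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Runs depth(n) on a worker thread that requests `stackBytes` of stack.
    static int runWithStack(int n, long stackBytes) {
        int[] result = new int[1];
        Thread worker = new Thread(null, () -> result[0] = depth(n),
                "big-stack-worker", stackBytes);
        worker.start();
        try {
            worker.join();  // join() makes the worker's write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // 256 MiB of requested stack; the same recursion depth would be likely
        // to overflow a default-sized thread stack.
        System.out.println(runWithStack(200_000, 256L * 1024 * 1024));
    }
}
```

Either approach only raises the limit. Removing the recursion, as the maintainer's fix does, removes the dependence on stack size altogether.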