Tuks-ICMM / Pharmacogenetic-Analysis-Pipeline

A Snakemake-powered pipeline developed to perform variant-effect prediction and frequency analysis on multiple Variant Call Format datasets. It was developed in partial fulfilment of an MSc in Bioinformatics at the University of Pretoria by Graeme Ford.
https://tuks-icmm.github.io/Pharmacogenetic-Analysis-Pipeline/
Creative Commons Attribution 4.0 International

[BUG] | Java `java.lang.OutOfMemoryError: Java heap space` #16

Closed G-kodes closed 2 years ago

G-kodes commented 2 years ago

Describe the bug: The current implementation runs Picard and GATK through Java on our cluster without an explicit maximum heap size. Java therefore tries to detect the available memory automatically and sizes its heap from that. This is unreliable because Java often misjudges how much memory is actually available to the job, especially in the UP Bioinformatics PBS/Torque cluster environment.

Possible cause: No `-Xmx` (maximum heap size) declaration is passed to Java.
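
To illustrate what an explicit declaration could look like, here is a minimal Python sketch that assembles Picard and GATK command lines with a heap cap. It is not the pipeline's actual code: the function names, the jar path, and the example tool arguments are hypothetical, and the GATK variant assumes the GATK4 `gatk` wrapper script, which forwards `--java-options` to the JVM.

```python
import shlex


def picard_command(jar_path, tool, heap, *tool_args):
    """Build a Picard invocation with an explicit maximum heap size.

    `heap` is a JVM size string such as "4g" or "4096m".
    """
    return ["java", f"-Xmx{heap}", "-jar", jar_path, tool, *tool_args]


def gatk_command(tool, heap, *tool_args):
    """Build a GATK4 wrapper invocation with an explicit maximum heap size.

    Assumes the GATK4 `gatk` launcher, which passes --java-options to Java.
    """
    return ["gatk", "--java-options", f"-Xmx{heap}", tool, *tool_args]


if __name__ == "__main__":
    # Hypothetical file names, shown only to make the resulting command visible.
    cmd = picard_command("picard.jar", "SortSam", "4g", "I=in.bam", "O=out.bam")
    print(shlex.join(cmd))
```

Returning the command as a list keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting concerns, or joined into a string for a Snakemake `shell:` directive.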

Additional context: One complication is the dynamic nature of the pipeline. The eventual goal is for users to control PBS/Torque resource allocation from the config file (not currently implemented). As such, the `-Xmx` declaration should dynamically take the memory available to the selected queue as the maximum heap size. This is a 'nice-to-have' variation on this deliverable.

Credits: Credit for raising this issue goes to @Megs47 and @sarahsaraht. Thanks Guys!

G-kodes commented 2 years ago

The Java-based CLI calls now use the `-Xmx` flag to set the maximum heap size explicitly for each execution. The value is pulled dynamically from each rule's queue setup in the config file, so end-users whose queue setups and restrictions differ from those of our UP Bioinformatics cluster can customise it.
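
As a rough illustration of the config-driven lookup described above, the following Python sketch derives an `-Xmx` value from a per-queue memory setting. The config layout, queue names, rule name, and memory strings are all hypothetical and do not reflect the pipeline's real config schema. The `headroom` factor is also an assumption added here: the JVM uses some memory outside the heap, so the sketch keeps the heap slightly below the queue's limit.

```python
import re

# Hypothetical config layout; the real pipeline's keys may differ.
CONFIG = {
    "queues": {
        "normal": {"memory": "16gb"},
        "long": {"memory": "64gb"},
    },
    "rules": {
        "sort_bam": {"queue": "normal"},
    },
}

_UNITS_TO_MB = {"mb": 1, "gb": 1024}


def memory_to_mb(text):
    """Convert a memory string such as '16gb' or '512mb' to whole megabytes."""
    match = re.fullmatch(r"(\d+)\s*(mb|gb)", text.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognised memory value: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS_TO_MB[unit]


def xmx_for_rule(config, rule, headroom=0.9):
    """Return an -Xmx flag sized from the memory of the rule's queue.

    `headroom` is the fraction of the queue's memory given to the heap.
    """
    queue = config["rules"][rule]["queue"]
    total_mb = memory_to_mb(config["queues"][queue]["memory"])
    heap_mb = max(1, int(total_mb * headroom))
    return f"-Xmx{heap_mb}m"
```

In a Snakemake rule, a value like this could be computed in a `params:` entry and interpolated into the `shell:` command, so that changing the queue definition in the config file changes the heap size without touching the rule itself.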