Open lfearnley opened 1 year ago
Disabling the JVM HotSpot perf data works to patch these out for GATK, but other Java applications (such as picard commands run in nf-raredisease) are also causing this behaviour. -XX:-UsePerfData is stable in my experience across ~200 runs of Sarek.
OK, so picard should be patched as well. I'll do that in a separate PR then...
It may also be an issue for fastqc. It's happening to others, so the patches are incredibly useful (https://github.com/nf-core/sarek/issues/1030), but I'm wondering if this is worth raising with the Nextflow devs, as it seems to be a common issue.
Changed the name of the issue and kept it open, so that we can track other Java tools. All gatk4 modules have been patched (cf. #3844), and we have a PR in sarek: https://github.com/nf-core/sarek/pull/1240
Great, thanks!
I'm trying out setting the _JAVA_OPTS environment variable for fastqc, which seems promising so far.
I attempted to set JAVA_TOOLS_OPTIONS and JAVA_OPTS in fgbio processes, but it did not resolve the issue. Fortunately, fgbio accepts -XX:-UsePerfData passed to it directly.
For completeness, you may need to set '_JAVA_OPTIONS' as well as 'JAVA_TOOL_OPTIONS' and 'JAVA_OPTS'. https://stackoverflow.com/questions/28327620/difference-between-java-options-java-tool-options-and-java-opts has some more details on the differences between them.
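A minimal sketch of setting all three variables at once when launching a Java tool from Python. The helper name `with_perf_data_disabled` is hypothetical, and which variable a given tool actually honours depends on its launcher script, so this simply sets all of them and leaves any existing options in place.

```python
import os

PERF_FLAG = "-XX:-UsePerfData"
# _JAVA_OPTIONS and JAVA_TOOL_OPTIONS are read by the HotSpot JVM itself;
# JAVA_OPTS is only a convention that some launcher scripts honour.
JAVA_ENV_VARS = ("_JAVA_OPTIONS", "JAVA_TOOL_OPTIONS", "JAVA_OPTS")


def with_perf_data_disabled(env=None):
    """Return a copy of env with PERF_FLAG appended to each Java options variable."""
    new_env = dict(os.environ if env is None else env)
    for name in JAVA_ENV_VARS:
        options = new_env.get(name, "").split()
        if PERF_FLAG not in options:  # avoid adding the flag twice
            options.append(PERF_FLAG)
        new_env[name] = " ".join(options)
    return new_env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting the tool.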
Is your feature request related to a problem? Please describe
I've encountered a problem with the JVM Hotspot for GATK processes when multiple GATK processes are run on the same node in singularity containers (details in nf-sarek issue #1030). There's also a recent Sarek issue with SIGBUS errors related to Hotspot (nf-sarek issue #1024).
Describe the solution you'd like
I'd like to propose turning the HotSpot performance data (hsperfdata) feature off using -XX:-UsePerfData in the --java-options passed to GATK. This has two effects: it should eliminate a class of bugs related to the JVM and hsperfdata, and it should stabilise nf-core Singularity modules in rare and hard-to-debug situations.
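As an illustration of the proposal, here is a minimal sketch that builds a GATK command line with the flag included in --java-options. The helper `gatk_command` is hypothetical, and the tool name, heap size and arguments in the usage line are placeholders, not the actual nf-core module code.

```python
import shlex

PERF_FLAG = "-XX:-UsePerfData"


def gatk_command(tool, tool_args, java_options=()):
    """Build a gatk argv list whose --java-options include PERF_FLAG."""
    options = list(java_options)
    if PERF_FLAG not in options:  # do not duplicate the flag
        options.append(PERF_FLAG)
    # gatk takes all JVM options as one space-separated string argument
    return ["gatk", "--java-options", " ".join(options), tool, *tool_args]


# Placeholder tool and arguments, for illustration only
cmd = gatk_command("HaplotypeCaller", ["--tmp-dir", "."], ["-Xmx4g"])
print(shlex.join(cmd))
```

Returning an argv list rather than a single string keeps the JVM options together as one argument when the command is run without a shell.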
Describe alternatives you've considered
Hotspot is hard-coded in the JVM to write files to /tmp. It ignores the --tmp-dir flag passed to GATK.
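To check whether this is happening on a node, a small sketch like the following lists the files in question. It assumes the usual layout of one hsperfdata_<user> directory per user under /tmp, containing one file per running JVM named after its PID; the function name `find_hsperfdata` is hypothetical.

```python
from pathlib import Path


def find_hsperfdata(tmp_root="/tmp"):
    """Map each hsperfdata_<user> directory under tmp_root to the file names inside it."""
    found = {}
    for directory in sorted(Path(tmp_root).glob("hsperfdata_*")):
        if not directory.is_dir():
            continue
        try:
            found[directory.name] = sorted(p.name for p in directory.iterdir())
        except PermissionError:
            # Another user's directory may not be readable
            found[directory.name] = []
    return found
```

Running this on the host while several containers are active shows whether they all write into the same shared directory.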
As far as I can tell, turning this off has no negative side effects beyond preventing the use of jstat and certain Java debuggers, which don't seem to be used in nf-core. This detailed blog post from Evan Jones describes an improvement to Java GC efficiency from turning this system off.
Alternatives would include preventing singularity from mounting host /tmp into the container (I'm not certain how this might be achieved within nf-core), or using -XX:+PerfDisableSharedMem.
Additional context
I'm currently trialling nf-sarek with the -XX:-UsePerfData java option on ~100 human WGS and will update on stability.