Open jupollet opened 4 years ago
@jupollet I'm unable to reproduce this bug. Could you provide more details?
What kind of other information? I launch this command on the GenOuest cluster inside a bioconda environment containing GATK version 4.1.4.0 and other tools. The BAMs were generated by bwa mem and sorted with samtools; read groups were added with gatk AddOrReplaceReadGroups and duplicates removed with gatk MarkDuplicates, and the new BAMs were then indexed.
It might be something that has been discussed in this thread https://gatkforums.broadinstitute.org/gatk/discussion/24595/mutect2-in-gatk-4-1-4-not-producing-stats-file
They suggest adding more RAM. I tried with 8 CPUs and 500 GB of RAM, but it still does not work.
Error for one bam file:
15:47:36.554 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Nov 28, 2019 3:47:37 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
15:47:37.239 INFO Mutect2 - ------------------------------------------------------------
15:47:37.240 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.4.0
15:47:37.240 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
15:47:37.240 INFO Mutect2 - Executing as jpollet@cl1n031.genouest.org on Linux v3.10.0-693.21.1.el7.x86_64 amd64
15:47:37.240 INFO Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
15:47:37.246 INFO Mutect2 - Start Date/Time: 28 novembre 2019 15:47:36 CET
15:47:37.246 INFO Mutect2 - ------------------------------------------------------------
15:47:37.246 INFO Mutect2 - ------------------------------------------------------------
15:47:37.246 INFO Mutect2 - HTSJDK Version: 2.20.3
15:47:37.246 INFO Mutect2 - Picard Version: 2.21.1
15:47:37.247 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:47:37.247 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:47:37.247 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:47:37.247 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:47:37.247 INFO Mutect2 - Deflater: IntelDeflater
15:47:37.247 INFO Mutect2 - Inflater: IntelInflater
15:47:37.247 INFO Mutect2 - GCS max retries/reopens: 20
15:47:37.247 INFO Mutect2 - Requester pays: disabled
15:47:37.247 INFO Mutect2 - Initializing engine
15:47:41.204 INFO Mutect2 - Done initializing engine
15:47:42.352 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
15:47:42.423 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
15:47:42.482 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
15:47:42.483 INFO IntelPairHmm - Available threads: 8
15:47:42.483 INFO IntelPairHmm - Requested threads: 4
15:47:42.483 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
15:47:42.936 INFO ProgressMeter - Starting traversal
15:47:42.936 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
15:47:53.565 INFO ProgressMeter - ENA|LVXK01000001|LVXK01000001.1:19555 0.2 90 508.0
15:48:05.962 INFO ProgressMeter - ENA|LVXK01000001|LVXK01000001.1:136820 0.4 600 1563.5
15:48:16.023 INFO ProgressMeter - ENA|LVXK01000001|LVXK01000001.1:360783 0.6 1560 2828.9
15:48:19.342 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.010346494000000001
15:48:19.342 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 6.453042841
15:48:19.347 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 10.39 sec
15:48:19.348 INFO Mutect2 - Shutting down engine
[28 novembre 2019 15:48:19 CET] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.72 minutes.
Runtime.totalMemory()=3822583808
java.lang.IllegalArgumentException: Cannot construct fragment from more than two reads
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:725)
at org.broadinstitute.hellbender.utils.read.Fragment.create(Fragment.java:36)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.utils.genotyper.AlleleLikelihoods.groupEvidence(AlleleLikelihoods.java:595)
at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticGenotypingEngine.callMutations(SomaticGenotypingEngine.java:93)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:251)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:320)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:308)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:281)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
if len(args) is 1 and args[0] == "--list":
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:308: SyntaxWarning: "is" with a literal. Did you mean "=="?
if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:312: SyntaxWarning: "is" with a literal. Did you mean "=="?
if call(["gsutil", "cp", jar, gcsjar]) is 0:
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:467: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if not len(properties) is 0:
/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/bin/gatk:471: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if not len(filesToAdd) is 0:
Using GATK jar /home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar Mutect2 -R /omaha-beach/jpollet/MYD88/data/ref/BALBcJ.fasta -I /omaha-beach/jpollet/MYD88/result/valide_3060_R1vsBALBcJ.sorted.md.bam -O /omaha-beach/jpollet/MYD88/result/valide_3060_R1vsBALBcJ.sortedunf.vcf
15:47:36.551 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/genouest/uni_limoges_fr/jpollet/.conda/envs/myd88/share/gatk4-4.1.4.0-1/gatk-package-4.1.4.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
I checked with the SLURM sacct command and the job has no problem with RAM. It runs through to the end but does not produce a .stats file, and outputs only the .vcf and .vcf.idx.
Jupollet,
This is a known issue and should be resolved by the most recent release gatk4-4.1.4.1. This was released last week, so you may need to just update.
If that doesn’t work, you may need to disable supplementary reads.
Thanks,
Mark
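One possible reading of "disable supplementary reads" is to add GATK's NotSupplementaryAlignmentReadFilter through the --read-filter argument. That interpretation is an assumption and is not confirmed in this thread. A minimal sketch that only builds the argument list and runs nothing; the helper name and the file names are hypothetical:

```python
def mutect2_command(reference, bam, output_vcf, extra_read_filters=()):
    """Build a `gatk Mutect2` argument list (nothing is executed here)."""
    cmd = ["gatk", "Mutect2", "-R", reference, "-I", bam, "-O", output_vcf]
    for name in extra_read_filters:
        # Each extra filter is passed as its own --read-filter argument.
        cmd += ["--read-filter", name]
    return cmd


# Hypothetical file names, for illustration only.
cmd = mutect2_command(
    "BALBcJ.fasta",
    "sample.sorted.md.bam",
    "sample.sortedunf.vcf",
    extra_read_filters=["NotSupplementaryAlignmentReadFilter"],
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run once the paths point at real files.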
Hi, I updated GATK today. After 158 minutes of variant calling on the same BAM files, I hit another issue:
[3 décembre 2019 13:57:42 CET] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 158.34 minutes.
Runtime.totalMemory()=28647096320
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.LinkedHashMap$LinkedKeySet.iterator(LinkedHashMap.java:543)
at java.util.HashSet.iterator(HashSet.java:173)
at java.util.AbstractCollection.toArray(AbstractCollection.java:137)
at java.util.LinkedList.addAll(LinkedList.java:408)
at java.util.LinkedList.addAll(LinkedList.java:387)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.graphs.BaseGraph$BaseGraphIterator.next(BaseGraph.java:774)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.graphs.BaseGraph$BaseGraphIterator.next(BaseGraph.java:723)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.graphs.BaseGraph.removePathsNotConnectedToRef(BaseGraph.java:505)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.getAssemblyResult(ReadThreadingAssembler.java:514)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:492)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:401)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.runLocalAssembly(ReadThreadingAssembler.java:148)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.assembleReads(AssemblyBasedCallerUtils.java:290)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.callRegion(Mutect2Engine.java:224)
at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.apply(Mutect2.java:320)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:308)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:281)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
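The Runtime.totalMemory() values printed in the two logs are byte counts. A minimal sketch for reading them in GiB and for building the gatk wrapper's --java-options argument to request a larger heap. The 100 GiB figure is purely illustrative, and whether a larger heap avoids this error is not established in this thread:

```python
GIB = 1024 ** 3


def bytes_to_gib(n_bytes):
    """Convert a byte count (as printed by Runtime.totalMemory()) to GiB."""
    return n_bytes / GIB


def heap_option(max_heap_gib):
    """Arguments asking the gatk wrapper to pass -Xmx to the JVM."""
    return ["--java-options", f"-Xmx{int(max_heap_gib)}g"]


# The two values reported in the logs of this thread.
for total in (3822583808, 28647096320):
    print(f"{total} bytes = {bytes_to_gib(total):.2f} GiB")

# Illustrative value only; pick one that fits the job's allocation.
print(heap_option(100))
```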
@jupollet What is the read depth like for this bam? Is it human tumor-only calling?
Dear @fleharty,
I have GATK version 4.2.2, but I see exactly the same problem: I get the VCF and its index file without the stats file:
A USER ERROR has occurred: Mutect stats table somatic_449_WT_vs_6KO_Pd.vcf.gz.stats not found. When Mutect2 outputs a file calls.vcf it also creates a calls.vcf.stats file. Perhaps this file was not moved along with the vcf, or perhaps it was not delocalized from a virtual machine while running in the cloud.
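Since the error only surfaces at the filtering step, it may help to check for the stats file right after Mutect2 finishes. A minimal sketch, assuming the naming described in the error message (the stats file is the VCF name plus ".stats"); the helper name is hypothetical:

```python
import tempfile
from pathlib import Path


def missing_mutect2_outputs(vcf_path):
    """Return the expected Mutect2 outputs that are not on disk.

    Checks only the VCF itself and its '<vcf>.stats' companion,
    the file named in the error message.
    """
    vcf = Path(vcf_path)
    expected = [vcf, vcf.with_name(vcf.name + ".stats")]
    return [str(p) for p in expected if not p.is_file()]


# Demo in a throwaway directory: a VCF written without its stats file.
with tempfile.TemporaryDirectory() as tmp:
    vcf = Path(tmp) / "calls.vcf"
    vcf.write_text("##fileformat=VCFv4.2\n")
    print(missing_mutect2_outputs(vcf))
```

An empty list means both files are present. Anything else is worth investigating before moving on to filtering.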
Is there any update on that? Got the same issue on 4.4.0.0
@riasc I got the same issue on 4.4.0.0 and found that it was due to running via slurm.
Not sure why, but it works when run from the head node, whereas via sbatch only the .vcf and .vcf.idx are created. The error message reads:
A USER ERROR has occurred: Mutect stats table calls.vcf.stats not found. When Mutect2 outputs a file calls.vcf it also creates a calls.vcf.stats file. Perhaps this file was not moved along with the vcf, or perhaps it was not delocalized from a virtual machine while running in the cloud.
which sounds related.
Resolved this by adding the --tmp-dir argument to the command.
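A minimal sketch of that workaround: create a per-job temporary directory and append --tmp-dir to an existing gatk argument list. The directory layout, the helper names, and the use of the SLURM_JOB_ID environment variable are assumptions for illustration. The thread does not say which directory was used or why the option helps:

```python
import os
import tempfile
from pathlib import Path


def job_tmp_dir(base, env=os.environ):
    """Create and return a per-job temp directory under `base`.

    Falls back to the label 'local' when SLURM_JOB_ID is not set.
    """
    job_id = env.get("SLURM_JOB_ID", "local")
    path = Path(base) / f"gatk_tmp_{job_id}"
    path.mkdir(parents=True, exist_ok=True)
    return path


def with_tmp_dir(cmd, tmp_dir):
    """Return a copy of a gatk argument list with --tmp-dir appended."""
    return list(cmd) + ["--tmp-dir", str(tmp_dir)]


# Demo under a throwaway base directory; file names are hypothetical.
with tempfile.TemporaryDirectory() as base:
    tmp = job_tmp_dir(base)
    base_cmd = ["gatk", "Mutect2", "-R", "ref.fasta", "-I", "in.bam", "-O", "calls.vcf"]
    print(" ".join(with_tmp_dir(base_cmd, tmp)))
```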
Bug Report
Affected tool(s) or class(es)
Affected version(s)
Description
A .vcf is produced, but no .stats file.
Steps to reproduce
On a cluster with SBATCH options (--constraint avx2, 150 GB of RAM, 6 CPUs):
parallel -k --plus 'gatk Mutect2 -R /omaha-beach/jpollet/MYD88/data/ref/BALBcJ.fasta -I {} -O {..}unf.vcf' ::: *.md.bam
-> Same issue with for loops.