Closed wangshun1121 closed 4 years ago
@wangshun1121 That definitely sounds like a bug. Is it possible for you to try running with the newest released version 4.1.3.0 in order to test if it's still present?
@lbergelson I am trying latest release, I use following command:
./gatk-4.1.3.0/gatk --java-options "-Xmx4g" FilterMutectCalls -O Filtered.vcf -V Try.vcf.gz -R ~/human.fa/ucsc.hg19.fasta
and got following Info:
Using GATK jar /mnt/md0/DataProcess/Ranshi/Mutect2/gatk-4.1.3.0/gatk-package-4.1.3.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -jar /mnt/md0/DataProcess/Ranshi/Mutect2/gatk-4.1.3.0/gatk-package-4.1.3.0-local.jar FilterMutectCalls -O Filtered.vcf -V Try.vcf.gz -R /home/imp/human.fa/ucsc.hg19.fasta
09:44:27.763 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/md0/DataProcess/Ranshi/Mutect2/gatk-4.1.3.0/gatk-package-4.1.3.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Aug 21, 2019 9:44:29 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
09:44:29.499 INFO FilterMutectCalls - ------------------------------------------------------------
09:44:29.500 INFO FilterMutectCalls - The Genome Analysis Toolkit (GATK) v4.1.3.0
09:44:29.500 INFO FilterMutectCalls - For support and documentation go to https://software.broadinstitute.org/gatk/
09:44:29.500 INFO FilterMutectCalls - Executing as imp@imp-WorkStation on Linux v4.15.0-55-generic amd64
09:44:29.500 INFO FilterMutectCalls - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_222-8u222-b10-1ubuntu1~16.04.1-b10
09:44:29.501 INFO FilterMutectCalls - Start Date/Time: 2019年8月21日 上午09时44分27秒
09:44:29.501 INFO FilterMutectCalls - ------------------------------------------------------------
09:44:29.501 INFO FilterMutectCalls - ------------------------------------------------------------
09:44:29.502 INFO FilterMutectCalls - HTSJDK Version: 2.20.1
09:44:29.502 INFO FilterMutectCalls - Picard Version: 2.20.5
09:44:29.502 INFO FilterMutectCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:44:29.502 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:44:29.502 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:44:29.503 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:44:29.503 INFO FilterMutectCalls - Deflater: IntelDeflater
09:44:29.503 INFO FilterMutectCalls - Inflater: IntelInflater
09:44:29.503 INFO FilterMutectCalls - GCS max retries/reopens: 20
09:44:29.503 INFO FilterMutectCalls - Requester pays: disabled
09:44:29.503 INFO FilterMutectCalls - Initializing engine
09:44:29.869 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/md0/DataProcess/Ranshi/Mutect2/Try.vcf.gz
09:44:29.942 INFO FilterMutectCalls - Done initializing engine
09:44:30.029 INFO FilterMutectCalls - Shutting down engine
[2019年8月21日 上午09时44分30秒] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=2097152000
***********************************************************************
A USER ERROR has occurred: Mutect stats table Try.vcf.gz.stats not found. When Mutect2 outputs a file calls.vcf it also creates a calls.vcf.stats file. Perhaps this file was not moved along with the vcf, or perhaps it was not delocalized from a virtual machine while running in the cloud.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Try.vcf.stat
was lost, I have to re-run Mutect2
using latest release. It seems that this bug was fixed.
It's OK using gatk-4.1.3
@wangshun1121 Good to hear. I love bugs that have already been fixed :)
@wangshun1121 @lbergelson
Thanks for reporting this. I'm confused -- in the latest release, is the optional stats file now not required?
I think the optional stats are now no longer optional. @takutosato could say for sure though.
Thanks @lbergelson
Indeed on master the command fails, however the argument seems optional
I would imagine a fair number of users may not have this file saved (like myself). Would it be possible to maintain some backwards compatibility here?
@andrewrech it seems that some modules, such as FilterMutectCalls, CreateSomaticPanelOfNormals, etc, are changing significantly in recent gatk releases.
Ah, that looks like a bug. The argument should be required I think. I opened a ticket https://github.com/broadinstitute/gatk/issues/6124.
I think the whole Mutect2 / FilterMutectCalls chain of tools are pretty tightly coupled together and are evolving fairly quickly. So it's not a great idea to try to re-filter calls made with an earlier version of Mutect with a newer one.
In my pipeline, when I called somatic variants using Mutect2, I would mark variants using FilterMutectCalls immediately. I think that FilterMutectCalls should be merged into Mutect2. What's more, I ran Mutect2 by chromosome parallelly by myself, and finally I would merge variants on different chrs into one VCF file-That means that saving .stat file of final result is impossible!
Hello all,
Thank you.
My pipeline is as @wangshun1121 describes as well - I run regions in parallel.
I would prefer if this argument remained optional, as this will permanently break pipelines running Mutect2 on regions in parallel.
Does anyone know offhand what old version I can pin to as a work around?
@lbergelson
@andrewrech I'm not certain. Let's let @davidbenjamin or @takutosato weigh in before we get too far down this rabbit hole 🐇. I'm not super knowledgeable about the details of mutect and don't want to give any bad information.
@andrewrech @wangshun1121 you can merge the stats from the parallel runs of Mutect using MergeStats
as is done in the best practice wdl:
https://github.com/broadinstitute/gatk/blob/master/scripts/mutect2_wdl/mutect2.wdl#L331
@wangshun1121 We decoupled FilterMutectCalls
from Mutect2
a while back in order to enable quick experimentation with different filtering arguments. Since then, a new reason has come up: FilterMutectCalls
looks at data from the entire sample, and therefore merging it into Mutect2
is not compatible with scattering over different genomic regions. By the way, have you tried using the best practices WDL? Our goal is to spare users the nuisance of writing pipelines themselves.
@davidbenjamin I develop my pipelines with perl & Nextflow myself: it's necessary for me. And I would look up best practices WDL scripts when coding my own pipelines.
Thanks @wangshun1121; it's always helpful to understand how our tools are being used in the wild.
It's OK using gatk-4.1.3
@lbergelson I am getting the same error as @wangshun1121 mentions above. I'm using version 4.1.4.0. I am assuming this version has the bug fix too like v4.1.3 that works for @wangshun1121 ?
Thanks.
I using following command to filter my vcf file:
And failed with following info:
This is the vcf files. Try.vcf.gz