gagneurlab / drop

Pipeline to find aberrant events in RNA-Seq data, useful for diagnosis of rare disorders

rnaVariantCalling - All reads filtered by MappingQualityNotZeroReadFilter #451

Open mincej opened 1 year ago

mincej commented 1 year ago

Hello, I'm getting an error that seems to result from all of my reads being filtered out in the BaseRecalibration step. Please see the info log below. I'm not sure where to start with this issue: I'm more familiar with RNA-seq analysis, and GATK is new to me. I previously had an error caused by a missing read group, so I added a general read group manually with AddOrReplaceReadGroups. Since everything is now part of the same read group, I'm guessing that something is failing there and everything is being filtered out. Does anyone have any suggestions?

18:23:37.095 INFO  BaseRecalibrator - ------------------------------------------------------------
18:23:37.097 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.4.0.0
18:23:37.097 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
18:23:37.097 INFO  BaseRecalibrator - Executing as CENSORED on CENSORED
18:23:37.097 INFO  BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v17.0.3-internal+0-adhoc..src
18:23:37.097 INFO  BaseRecalibrator - Start Date/Time: April 7, 2023 at 6:23:37 PM MDT
18:23:37.098 INFO  BaseRecalibrator - ------------------------------------------------------------
18:23:37.098 INFO  BaseRecalibrator - ------------------------------------------------------------
18:23:37.098 INFO  BaseRecalibrator - HTSJDK Version: 3.0.5
18:23:37.098 INFO  BaseRecalibrator - Picard Version: 3.0.0
18:23:37.098 INFO  BaseRecalibrator - Built for Spark Version: 3.3.1
18:23:37.099 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:23:37.099 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:23:37.099 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:23:37.099 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:23:37.099 INFO  BaseRecalibrator - Deflater: IntelDeflater
18:23:37.099 INFO  BaseRecalibrator - Inflater: IntelInflater
18:23:37.100 INFO  BaseRecalibrator - GCS max retries/reopens: 20
18:23:37.100 INFO  BaseRecalibrator - Requester pays: disabled
18:23:37.100 INFO  BaseRecalibrator - Initializing engine
18:23:37.359 INFO  FeatureManager - Using codec VCFCodec to read file CENSORED
18:23:37.468 INFO  FeatureManager - Using codec VCFCodec to read file CENSORED
18:23:37.551 INFO  FeatureManager - Using codec VCFCodec to read file CENSORED
18:23:37.629 INFO  BaseRecalibrator - Done initializing engine
18:23:37.633 INFO  BaseRecalibrationEngine - The covariates being used here: 
18:23:37.633 INFO  BaseRecalibrationEngine -    ReadGroupCovariate
18:23:37.634 INFO  BaseRecalibrationEngine -    QualityScoreCovariate
18:23:37.634 INFO  BaseRecalibrationEngine -    ContextCovariate
18:23:37.634 INFO  BaseRecalibrationEngine -    CycleCovariate
18:23:37.686 INFO  ProgressMeter - Starting traversal
18:23:37.687 INFO  ProgressMeter -        Current Locus  Elapsed Minutes       Reads Processed     Reads/Minute
18:24:04.268 WARN  IntelInflater - Zero Bytes Written : 0
18:24:04.275 INFO  BaseRecalibrator - 34429100 read(s) filtered by: MappingQualityNotZeroReadFilter 
0 read(s) filtered by: MappingQualityAvailableReadFilter 
0 read(s) filtered by: MappedReadFilter 
0 read(s) filtered by: NotSecondaryAlignmentReadFilter 
0 read(s) filtered by: NotDuplicateReadFilter 
0 read(s) filtered by: PassesVendorQualityCheckReadFilter 
0 read(s) filtered by: WellformedReadFilter 
34429100 total reads filtered out of 34429100 reads processed
18:24:04.276 INFO  ProgressMeter -             unmapped              0.4                     0              0.0
18:24:04.276 INFO  ProgressMeter - Traversal complete. Processed 0 total reads in 0.4 minutes.
18:24:04.276 INFO  BaseRecalibrator - Calculating quantized quality scores...
18:24:04.286 INFO  BaseRecalibrator - Writing recalibration report...
18:24:04.321 INFO  BaseRecalibrator - ...done!
18:24:04.322 INFO  BaseRecalibrator - BaseRecalibrator was able to recalibrate 0 reads
18:24:04.322 INFO  BaseRecalibrator - Shutting down engine
[April 7, 2023 at 6:24:04 PM MDT] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.45 minutes.
Runtime.totalMemory()=1224736768
Tool returned:
SUCCESS

nickhsmith commented 1 year ago

Sorry for the long delay. I'll try to help as best I can; maybe @vyepez88 can help too.

The MappingQualityNotZeroReadFilter does the following: it filters out reads with mapping quality equal to zero.

So it shouldn't be associated with the read group (although I believe the current version of DROP should auto-detect a lack of read group and apply a dummy group).

Have you taken a look at your reads, specifically column 5? You can do this with samtools view -h your.bam | less (where your.bam stands for your own file) and scrolling down past the header to the actual reads. My guess is that column 5 (the mapping quality, just before the CIGAR string) is all 0s. But that's just a guess; I think you would have to talk to whoever generated the BAM files to make sure that everything went well at alignment. You could artificially set them to 255, which in the SAM format means the mapping quality is unavailable. What that will do to the quality of the experiment I don't know. It might allow the step to continue, although your log also lists a MappingQualityAvailableReadFilter, so reads set to 255 may simply be caught by that filter instead.
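If scrolling through tens of millions of reads is impractical, the column 5 check can be summarized as a small tally. The following is only a rough sketch, not part of DROP or GATK: it assumes plain SAM text as input, the function names `mapq_histogram` and `usable_reads` are made up for illustration, and the three sample reads are invented. It counts reads per MAPQ value and reports how many are neither 0 nor 255, the two values that the MappingQualityNotZeroReadFilter and MappingQualityAvailableReadFilter in the log appear to reject.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count reads per MAPQ value (column 5 of each SAM alignment line)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM alignment record has 11 mandatory fields
        counts[int(fields[4])] += 1  # fields[4] is MAPQ
    return counts


def usable_reads(counts):
    """Number of reads whose MAPQ is neither 0 nor 255 (unavailable)."""
    return sum(n for mapq, n in counts.items() if mapq not in (0, 255))


# Invented sample records, for illustration only.
SAMPLE = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t0\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\tchr1\t200\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t0\tchr1\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

if __name__ == "__main__":
    hist = mapq_histogram(SAMPLE)
    for mapq in sorted(hist):
        print(f"MAPQ {mapq}: {hist[mapq]} read(s)")
    print(f"reads with MAPQ other than 0 or 255: {usable_reads(hist)}")
```

To run this on real data, one option is to replace `SAMPLE` with `sys.stdin` (after `import sys`) and pipe in the output of `samtools view your.bam`. If nearly every read lands in the MAPQ 0 bin, that would be consistent with the filter counts in the log above and would point back to the alignment step.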

vyepez88 commented 10 months ago

Hi Joshua, is this still an issue?