Open andpet0101 opened 1 year ago
This is a serious bug in the pipeline:
MarkDuplicatesWithMateCigar requires the input BAM files to carry the MC (mate CIGAR) tag. Read pairs without the MC tag are skipped and written to the output unchanged (see https://gatk.broadinstitute.org/hc/en-us/articles/360037055692-MarkDuplicatesWithMateCigar-Picard-#--SKIP_PAIRS_WITH_NO_MATE_CIGAR), so when none of the reads carry the tag, duplicates are not filtered at all.
I suggest adding the following lines:

    # FIX: Add mate cigar information
    picard FixMateInformation I=${tmpNameStem}.unsorted.tmpbam O=${tmpNameStem}.unsorted.mc.tmpbam \
        VALIDATION_STRINGENCY=LENIENT >& ${tmpNameStem}.unsorted.mc.picardFM.out 2>&1

to process 5 after SamFormatConverter and before SortSam.
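
To check whether a given BAM is affected, one option is to count the paired reads that lack an MC tag. The following is only a rough sketch: the helper name `count_missing_mc` is made up for illustration, it reads SAM text on stdin, and it assumes `samtools` is available to convert the BAM to SAM text.

```shell
# Hypothetical helper (not part of the pipeline): counts paired reads
# that have no MC:Z: tag. Reads SAM text on stdin, prints the count.
count_missing_mc() {
  awk -F'\t' '
    !/^@/ {                              # skip SAM header lines
      paired = int($2) % 2               # FLAG bit 0x1: read is paired
      has_mc = 0
      for (i = 12; i <= NF; i++)         # optional fields start at column 12
        if ($i ~ /^MC:Z:/) has_mc = 1
      if (paired && !has_mc) n++
    }
    END { print n + 0 }                  # print 0 when nothing was counted
  '
}

# Example usage (assumes samtools is installed):
#   samtools view "${tmpNameStem}.unsorted.tmpbam" | count_missing_mc
```

A non-zero count on the input to MarkDuplicatesWithMateCigar would suggest that those pairs are being passed through without duplicate marking.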