griffithlab / neoag_protocol

Protocol for end-to-end neoantigen analysis and vaccine design for a single patient
MIT License

Somatic Exome CWL : Investigate difference in input reads on compute0 vs AWS #1

Closed jhundal closed 4 years ago

jhundal commented 4 years ago

samtools flagstat output from the alignment step in somatic_exome.cwl reports a different total number of reads on compute0 than on AWS.

samtools flagstat output on the refAlign.bam from somatic tumor alignment on compute0:

33103050 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
338484 + 0 supplementary
0 + 0 duplicates
32960425 + 0 mapped (99.57% : N/A)
32764566 + 0 paired in sequencing
16382283 + 0 read1
16382283 + 0 read2
32109294 + 0 properly paired (98.00% : N/A)
32483958 + 0 with itself and mate mapped
137983 + 0 singletons (0.42% : N/A)
286826 + 0 with mate mapped to a different chr
175713 + 0 with mate mapped to a different chr (mapQ>=5)

samtools flagstat output from the same step on AWS:

32944440 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
179874 + 0 supplementary
0 + 0 duplicates
32802212 + 0 mapped (99.57% : N/A)
32764566 + 0 paired in sequencing
16382283 + 0 read1
16382283 + 0 read2
32133090 + 0 properly paired (98.07% : N/A)
32484720 + 0 with itself and mate mapped
137618 + 0 singletons (0.42% : N/A)
276810 + 0 with mate mapped to a different chr
168336 + 0 with mate mapped to a different chr (mapQ>=5)
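
A quick way to see where the two reports diverge is to parse both and subtract the secondary and supplementary records from the total. The sketch below does that for an abbreviated copy of the numbers above; the helper names (`parse_flagstat`, `primary_records`) are made up for illustration and are not part of the pipeline. On these numbers, total minus supplementary is 32764566 in both environments, the same as "paired in sequencing", so the gap in the totals comes entirely from the supplementary alignment records.

```python
import re

# Abbreviated copies of the two flagstat reports quoted in this issue.
COMPUTE0 = """\
33103050 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
338484 + 0 supplementary
32764566 + 0 paired in sequencing
32109294 + 0 properly paired (98.00% : N/A)
"""

AWS = """\
32944440 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
179874 + 0 supplementary
32764566 + 0 paired in sequencing
32133090 + 0 properly paired (98.07% : N/A)
"""

# "<passed> + <failed> <label>", with an optional trailing "(xx.xx% : N/A)"
# that is dropped so labels compare equal across reports.
LINE = re.compile(r"^(\d+) \+ (\d+) (.*?)(?: \([\d.]+% : .*\))?$")


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            counts[match.group(3)] = int(match.group(1))
    return counts


def primary_records(counts):
    """Total records minus secondary and supplementary alignments."""
    total = next(v for k, v in counts.items() if k.startswith("in total"))
    return total - counts["secondary"] - counts["supplementary"]


compute0 = parse_flagstat(COMPUTE0)
aws = parse_flagstat(AWS)

# Print only the categories whose counts differ between environments.
for label in compute0:
    if compute0[label] != aws.get(label):
        print(f"{label}: compute0={compute0[label]} aws={aws.get(label)}")

print("primary records:", primary_records(compute0), primary_records(aws))
```

The same comparison can be run on full flagstat files by reading them from disk and passing the text to `parse_flagstat`.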

It seems that some reference files not explicitly referred to in the CWL, such as an alts file for BWA alignment, will be used if they're available.

@bryanfisk Please investigate whether running this step with all optional input files on AWS fixes the issue.

bryanfisk commented 4 years ago

After running the step with the optional reference files present on AWS, the outputs matched those on compute0.
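
One way to catch this kind of mismatch before launching a run is to list which files sit next to the reference FASTA in each environment and compare the results. The sketch below is only an illustration: the function name and the suffix lists are assumptions based on the standard bwa index naming (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`, plus the optional `.alt` file), and the thread does not say which optional files besides the alts file were involved.

```python
from pathlib import Path

# Assumed suffixes: the usual bwa index files plus the optional ALT file.
# Adjust to whatever the workflow actually stages next to the reference.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")
OPTIONAL_SUFFIXES = (".alt",)


def reference_sidecars(reference):
    """Report which sidecar files exist next to a reference FASTA.

    `reference` is the FASTA path; each suffix is appended to the full
    file name (for example reference.fa -> reference.fa.alt).
    """
    ref = Path(reference)
    return {
        suffix: Path(str(ref) + suffix).exists()
        for suffix in BWA_INDEX_SUFFIXES + OPTIONAL_SUFFIXES
    }


def missing_sidecars(reference):
    """Return the suffixes that are absent, in a stable order."""
    return [s for s, present in reference_sidecars(reference).items() if not present]


if __name__ == "__main__":
    # "reference.fa" is a placeholder path, not the one used in this issue.
    print(missing_sidecars("reference.fa"))
```

Running this in both environments and comparing the lists would show, for example, an `.alt` file that exists on one system but was never staged on the other.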