fpbarthel / GLASS

GLASS consortium
MIT License
SNV calls data release 2 #120

Closed fpbarthel closed 5 years ago

fpbarthel commented 5 years ago

To-dos related to SNV calls for the second data release.

fpbarthel commented 5 years ago

@anderkj2 compared variant calls from GATK 4.1.0.0 Mutect2 (labeled "GATK" in the figure) with calls from GATK 4.0.x Mutect2 (labeled "freebayes" in the figure, because those calls were later combined into a multi-sample VCF using freebayes to obtain allele frequencies):

[Image: screenshot taken 2019-03-01 at 3:08:47 PM]
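A comparison of two call sets like the one described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the figure: it assumes variants are matched by exact (CHROM, POS, REF, ALT) with no normalization or left-alignment, and the function names and the two tiny inline call sets are made up for the example.

```python
import io


def variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text.

    Assumption: variants are matched on exact coordinates and alleles;
    multi-allelic records are split into one key per ALT allele.
    """
    keys = set()
    for line in io.StringIO(vcf_text):
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_callsets(a, b):
    """Count variants shared by both sets and private to each one."""
    return {"shared": len(a & b), "a_only": len(a - b), "b_only": len(b - a)}


# Hypothetical toy data standing in for the two Mutect2 call sets.
NEW_CALLS = "#CHROM\tPOS\tID\tREF\tALT\n1\t100\t.\tA\tT\n1\t200\t.\tG\tC\n"
OLD_CALLS = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tT\n"
    "1\t300\t.\tC\tCA\n"
    "2\t50\t.\tT\tG\n"
)

counts = compare_callsets(variant_keys(NEW_CALLS), variant_keys(OLD_CALLS))
print(counts)
```

With real data, differences in indel representation between callers can inflate the "only" counts, so normalizing both VCFs before keying would be a sensible first step.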

While the two call sets overlap substantially (about 20k variants), a large number of variants (about 22k) were called only by GATK 4.0.x Mutect2. This could be explained by several factors:

[Image: screenshot taken 2019-03-01 at 3:05:50 PM]

Comparing the read depths reported by GATK 4.1 Mutect2 and by freebayes across the 20k variants analyzed in both shows lower depths from Mutect2 than from freebayes. This is likely explained by the more stringent read filters that Mutect2 applies. The tail of variants with a higher read depth in Mutect2 consists entirely of indels for which freebayes was not able to count reads accurately.
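The depth comparison on shared variants could be checked along these lines. This is only a sketch under stated assumptions: per-variant depths are taken to be already extracted into dictionaries keyed by (chrom, pos, ref, alt), an indel is defined as REF and ALT differing in length, and all names and numbers below are hypothetical toy values, not GLASS data.

```python
from statistics import median


def is_indel(ref, alt):
    """Assumption: a variant is an indel when REF and ALT lengths differ."""
    return len(ref) != len(alt)


def depth_summary(depth_a, depth_b):
    """Compare depths of caller A against caller B on shared variants."""
    shared = sorted(depth_a.keys() & depth_b.keys())
    diffs = {k: depth_a[k] - depth_b[k] for k in shared}
    higher_in_a = [k for k in shared if diffs[k] > 0]
    return {
        "n_shared": len(shared),
        "median_diff": median(diffs.values()) if diffs else None,
        "higher_in_a": higher_in_a,
        "higher_in_a_indels": [k for k in higher_in_a if is_indel(k[2], k[3])],
    }


# Hypothetical toy depths: A stands in for Mutect2, B for freebayes.
mutect2_depth = {
    ("1", 100, "A", "T"): 40,
    ("1", 200, "G", "C"): 55,
    ("1", 300, "C", "CA"): 60,
    ("2", 50, "T", "G"): 30,
}
freebayes_depth = {
    ("1", 100, "A", "T"): 50,
    ("1", 200, "G", "C"): 70,
    ("1", 300, "C", "CA"): 20,
    ("3", 10, "A", "G"): 15,
}

summary = depth_summary(mutect2_depth, freebayes_depth)
print(summary)
```

If every entry in `higher_in_a` also appears in `higher_in_a_indels`, that matches the observation that the higher-depth tail consists only of indels.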

fpbarthel commented 5 years ago

Closing this issue; see also the Slack discussion between @fpbarthel and @anderkj2.

The two open items have been moved to new, separate issues.