NYU-Molecular-Pathology / NGS580-nf

Target exome sequencing analysis for NYU NGS580 gene panel
GNU General Public License v3.0

masked nucleotides showing up in vcf files, breaking merge script #34

Closed stevekm closed 4 years ago

stevekm commented 4 years ago

The LoFreq variant caller seems to retain some of the lower-case masked nucleotides in the variants it writes to its .vcf files. GATK VariantsToTable converts these lower-case nucleotides to upper case when it processes the .vcf, but ANNOVAR keeps them lower case when annotating the same .vcf. This causes errors when merging the ANNOVAR annotation output with the .vcf .tsv table from GATK, because the nucleotide values in the matching columns no longer agree.

Consider forcing all nucleotides to upper case in the merge script, or resolving this further upstream.
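
For the upstream option, a minimal sketch of what the normalization could look like, assuming it runs on the .vcf before both GATK VariantsToTable and ANNOVAR see it. The function names `upper_allele` and `normalize_vcf_line` are hypothetical and not part of the pipeline; the sketch relies only on the standard VCF column order (REF is the 4th column, ALT the 5th) and leaves symbolic alleles such as `<DEL>` untouched.

```python
import re

# Matches alleles made up only of plain nucleotide letters, in either case.
# Symbolic alleles such as "<DEL>" or "*" do not match and are left as-is.
_PLAIN_ALLELE = re.compile(r"^[ACGTNacgtn]+$")


def upper_allele(allele):
    """Upper-case a single allele if it is a plain nucleotide string."""
    return allele.upper() if _PLAIN_ALLELE.match(allele) else allele


def normalize_vcf_line(line):
    """Return a VCF line with REF and ALT forced to upper case.

    Header lines (starting with '#') and lines with fewer than five
    tab-separated fields are returned unchanged.
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line
    fields[3] = upper_allele(fields[3])  # REF column
    # ALT may hold several comma-separated alleles
    fields[4] = ",".join(upper_allele(a) for a in fields[4].split(","))
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

The same `upper_allele` helper could instead be applied to the Ref and Alt columns of both tables inside the merge script, if changing the .vcf itself is not desirable.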