molgenis / vkgl-vcf-converter

Converts http://molgenis.org/vkgl downloads to VCF (Variant Call Format) files
GNU Lesser General Public License v3.0

What is the genome reference sequence? #6

Closed dennishendriksen closed 3 years ago

dennishendriksen commented 3 years ago

When using human_g1k_v37_phiX.fasta as the reference, we run into the following errors:

Reference allele mismatch at Y:545475 .. REF_SEQ:'N' vs VCF:'C'
Reference allele mismatch at MT:5461 .. REF_SEQ:'C' vs VCF:'G'
Reference allele mismatch at MT:13880 .. REF_SEQ:'C' vs VCF:'T'

What is the genome reference sequence for VKGL data? For the discrepancies between the GRCh37, hg19, b37 and humanG1Kv37 references, see https://gatk.broadinstitute.org/hc/en-us/articles/360035890711-GRCh37-hg19-b37-humanG1Kv37-Human-Reference-Discrepancies#comparison
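
For anyone who wants to reproduce this kind of check on a small scale, here is a minimal Python sketch of a reference allele comparison. It is not the tool that produced the messages above. The function name `find_ref_mismatches`, the in-memory `reference` dictionary and the toy data are assumptions made for illustration; a real check would read the FASTA and the VCF from disk.

```python
from typing import Dict, Iterable, List, Tuple


def find_ref_mismatches(
    reference: Dict[str, str],
    records: Iterable[Tuple[str, int, str]],
) -> List[str]:
    """Return one message per record whose REF allele differs from the reference.

    reference: hypothetical in-memory map of contig name -> sequence.
    records:   (contig, 1-based position, REF allele) tuples taken from a VCF.
    """
    messages = []
    for chrom, pos, ref in records:
        seq = reference.get(chrom)
        if seq is None:
            messages.append(f"Unknown contig {chrom}")
            continue
        # VCF positions are 1-based; Python string indices are 0-based.
        start = pos - 1
        observed = seq[start:start + len(ref)].upper()
        if observed != ref.upper():
            messages.append(
                f"Reference allele mismatch at {chrom}:{pos} .. "
                f"REF_SEQ:'{observed}' vs VCF:'{ref}'"
            )
    return messages


# Toy data only, not real genome sequence.
toy_reference = {"MT": "ACGTC"}
toy_records = [("MT", 2, "C"), ("MT", 5, "G")]

for line in find_ref_mismatches(toy_reference, toy_records):
    print(line)
```

With the toy data, only the second record is reported, because position 5 of the toy sequence holds `C` while the record claims `G`. Running the same comparison against each candidate reference build is one way to see which build the VCF coordinates were produced on.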

dennishendriksen commented 3 years ago

The reference is GRCh37 (GCA_000001405.1); see https://github.com/molgenis/data-transform-vkgl/issues/25.