Illumina / GTCtoVCF

Script to convert GTC/BPM files to VCF
Apache License 2.0

Indel refallele #10

Closed KelleyRyanM closed 6 years ago

KelleyRyanM commented 6 years ago

Fixes determination of the reference allele for certain indels. Previously, an insertion could be miscalled as a deletion when the indel sequence sits inside another occurrence of the same sequence in the source sequence, as in NNNA[ACG]CGNN, where the flanking bases A and CG also spell ACG.
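To illustrate the ambiguity, here is a minimal sketch in Python. It is not the code from this PR: `naive_is_deletion` and `context_is_deletion` are hypothetical helpers, and the `N` bases of the example are replaced with arbitrary concrete bases (`TTT`, `TT`). It assumes the reference window starts at the first base of the upstream flank.

```python
def naive_is_deletion(ref_window, indel):
    """Hypothetical naive check: call a deletion whenever the indel
    sequence appears anywhere in the reference window."""
    return indel in ref_window


def context_is_deletion(ref_window, upstream, indel, downstream):
    """Compare the reference window against both full candidate contexts.

    Returns True for a deletion (reference contains the indel sequence),
    False for an insertion (reference lacks it), and None when the
    flanks match both contexts or neither.
    """
    with_indel = upstream + indel + downstream   # reference if deletion
    without_indel = upstream + downstream        # reference if insertion
    matches_deletion = ref_window.startswith(with_indel)
    matches_insertion = ref_window.startswith(without_indel)
    if matches_deletion == matches_insertion:
        return None  # ambiguous, or no match at all
    return matches_deletion


# Source sequence TTTA[ACG]CGTT, i.e. NNNA[ACG]CGNN with concrete bases.
upstream, indel, downstream = "TTTA", "ACG", "CGTT"

# Reference for a true insertion: the indel sequence is absent,
# but the flanks A + CG still spell ACG.
insertion_ref = "TTTACGTT"
# Reference for a true deletion: the indel sequence is present.
deletion_ref = "TTTAACGCGTT"

print(naive_is_deletion(insertion_ref, indel))
print(context_is_deletion(insertion_ref, upstream, indel, downstream))
print(context_is_deletion(deletion_ref, upstream, indel, downstream))
```

The naive substring check reports a deletion for the insertion reference, while the full-context comparison separates the two cases. With short or repetitive flanks both contexts can still match, which is why the sketch returns `None` rather than guessing.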

Summary of the number of differences in several manifest inputs:

- InfiniumCore-24v1-1_A: 5 entries
- InfiniumCore-24v1-1_A1: 3 entries
- GSA-24v1-0_A6: 149 entries

The main logic for reference allele determination is now located in the "is_deletion" method in BPMRecord.py. Also added a script to download and prepare reference data (scripts/download_reference.sh) to support new tests that require the full human genome reference.