tfwillems / HipSTR

Genotype and phase short tandem repeats using Illumina whole-genome sequencing data
GNU General Public License v2.0

HipSTR output #64

Closed by CodyS5 5 years ago

CodyS5 commented 5 years ago

Hello,

I've been running HipSTR on a small dataset, and I'm a little confused by the output here. This is the output from a single locus, where the first individual has a 0/1 genotype, if I'm interpreting it correctly. Is the second part of this (0|0) supposed to indicate the difference in length between the genotype and the reference? I guess I'm unsure of how two different genotypes have the same length value here.

1:55388840 0|1:0|0:1.0:0.5:33:0:0:0:25.02|7.98:0|0:2.92:-2.48:31:-2.61:0|27:0|27:PASS 0|0:0|0:1.0:1.0:35:0:0:0:17.50|17.50:0|0:5.94:0.0:.:0.0:0|23:0|23:PASS 0|0:0|0:1.0:1.0:34:0:0:2:17.00|17.00:0|0:8.93:0.0:.:0.0:0|28:0|24:PASS

Thanks for any help you can offer!

tfwillems commented 5 years ago

Hi Cody,

Yes, that's correct. The second component (0|0) corresponds to the GB field, which gives the length difference of each of the two alleles from the reference allele. In this case, allele 1 (the first ALT allele) has the same length as the reference allele, but its sequence is different. You can verify that by looking at the ALT field of the corresponding VCF record.

Best, Thomas
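
To make the distinction concrete, here is a minimal Python sketch that pulls the first two colon-separated fields out of the first sample column quoted above. It assumes only what the thread states: the first field is the genotype and the second is GB. The helper name `parse_gt_gb` is hypothetical and not part of HipSTR; for a real file, the field order should be read from the record's FORMAT column rather than hard-coded.

```python
import re

def parse_gt_gb(sample_field):
    """Return (genotype allele indices, length differences) for one sample.

    Assumes field 0 is the genotype and field 1 is GB, as described in
    this thread, and that the genotype is called (no "." placeholders).
    """
    fields = sample_field.split(":")
    # Alleles are separated by "|" (phased) or "/" (unphased).
    gt = [int(a) for a in re.split(r"[|/]", fields[0])]
    gb = [int(d) for d in re.split(r"[|/]", fields[1])]
    return gt, gb

# First sample column from the output quoted in the question.
first_sample = "0|1:0|0:1.0:0.5:33:0:0:0:25.02|7.98:0|0:2.92:-2.48:31:-2.61:0|27:0|27:PASS"
gt, gb = parse_gt_gb(first_sample)

# Different allele indices but identical length differences: the ALT allele
# matches the reference in length and differs only in sequence.
same_length_diff_seq = gt[0] != gt[1] and gb[0] == gb[1]
print(gt, gb, same_length_diff_seq)
```

For this sample the allele indices differ (0 and 1) while both length differences are 0, which is the situation described above: a heterozygous call whose two alleles have the same length.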