nhansen / SVanalyzer

Tools for the analysis of structural variation in genomes
http://svanalyzer.readthedocs.io/

Why Some REPTYPE=CONTRAC Records Have Longer ALT Length than REF Length #18

Open yueyaog opened 9 months ago

yueyaog commented 9 months ago

Hi @nhansen,

I am going through the SVWIDEN output for an HG002 VCF. I have noticed that some variants annotated REPTYPE=CONTRAC have a longer ALT than REF. I would expect CONTRAC records to have a shorter ALT than REF, because CONTRAC is a type of DEL. I have attached an example below.

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  syndip
chr1    1108885 .   G   GTCCACCACAGCCACCATGTCTCGGCAGCACCGTCCACCACAGCCACCATGTCTCGGCAGCACCGTTCACCACAGCCACCATGTCTCAGCACCA  30  .   REPTYPE=CONTRAC;BREAKSIMLENGTH=7699;REFWIDENED=chr1:1108048-1115663 GT:AD   0|1:1,1

Do you have any explanations for this?
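
For anyone who wants to check their own SVWIDEN output for the same pattern, here is a minimal Python sketch. It is not part of SVanalyzer: the function names `parse_info` and `inconsistent_contractions` are hypothetical, and it assumes an uncompressed, whitespace-delimited VCF whose INFO column carries the `REPTYPE` tag as in the record above. It simply flags CONTRAC records in which an ALT allele is longer than REF.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def inconsistent_contractions(lines):
    """Yield (chrom, pos, ref_len, alt_len) for REPTYPE=CONTRAC records
    in which a sequence-resolved ALT allele is longer than REF."""
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.split()  # assumption: no whitespace inside any column
        if len(cols) < 8:
            continue
        chrom, pos, ref, alts, info = cols[0], cols[1], cols[3], cols[4], cols[7]
        if parse_info(info).get("REPTYPE") != "CONTRAC":
            continue
        for alt in alts.split(","):
            # Symbolic alleles such as <DEL> carry no sequence length.
            if alt.startswith("<"):
                continue
            if len(alt) > len(ref):
                yield chrom, int(pos), len(ref), len(alt)
```

The generator accepts any iterable of lines, so an open file handle can be passed in directly and each yielded tuple printed or collected for a bug report.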

nhansen commented 9 months ago

Hi @yueyaog. That's clearly a bug. Can you provide me with the initial VCF entry and the reference it's aligned to?

yueyaog commented 9 months ago

Sure. I didn't generate this SVWIDEN output myself; I got a copy of it from Justin Zook at NIST. I will send you a separate email with details.