Illumina / REViewer

A tool for visualizing alignments of reads in regions containing tandem repeats
GNU General Public License v3.0

Repeat units differ significantly in VCF and SVG #55

Closed marc-sturm closed 1 year ago

marc-sturm commented 1 year ago

Hi,

we have a case where there seems to be a bug in REViewer. According to the VCF generated by ExpansionHunter 5.0.0, the repeat unit counts for C9ORF72 are 2/677:

chr9 27573528 . C , . PASS END=27573546;REF=3;RL=18;RU=GGCCCC;VARID=C9ORF72;REPID=C9ORF72 GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC 1/2:SPANNING/INREPEAT:2/677:2-2/628-970:27/0:7/41:0/612:57.641694

However, when running REViewer 0.2.7, the SVG image shows 386 repeat units for the second allele: DNA2204013A1_02_repeats_expansionhunter_C9ORF72
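To put the two numbers side by side, here is a minimal sketch that converts repeat unit counts into base-pair spans. The INFO, FORMAT and sample strings are copied from the VCF record in this issue; the variable names are only illustrative, and the value 386 is read off the SVG rather than taken from any file.

```python
# INFO, FORMAT and sample columns copied from the VCF record quoted in this issue.
info = "END=27573546;REF=3;RL=18;RU=GGCCCC;VARID=C9ORF72;REPID=C9ORF72"
fmt = "GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC"
sample = "1/2:SPANNING/INREPEAT:2/677:2-2/628-970:27/0:7/41:0/612:57.641694"

# Turn "KEY=VALUE;..." and the colon-separated FORMAT/sample pair into dicts.
info_fields = dict(item.split("=", 1) for item in info.split(";"))
sample_fields = dict(zip(fmt.split(":"), sample.split(":")))

unit_len_bp = len(info_fields["RU"])                          # length of GGCCCC
repcn = [int(n) for n in sample_fields["REPCN"].split("/")]   # per-allele unit counts
vcf_spans_bp = [n * unit_len_bp for n in repcn]               # per-allele span in bp

svg_units = 386  # second-allele size shown in the REViewer SVG (from this report)
svg_span_bp = svg_units * unit_len_bp

print("VCF repeat units:", repcn, "-> spans (bp):", vcf_spans_bp)
print("SVG repeat units:", svg_units, "-> span (bp):", svg_span_bp)
```

With these inputs the long allele spans 4062 bp according to the VCF, while the SVG depicts 2316 bp.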

This file should contain everything you need to replicate the issue: REviewerissue.zip

If you need more information or data, I can provide that.

Best, Marc

Note to me: it is in sample DNA2204013A1_02

sclamons commented 1 year ago

Hi Marc,

Unfortunately you've run aground on a known limitation of REViewer. ExpansionHunter doesn't try to align read pairs where both reads lie entirely inside the repeat. If REViewer runs across such a pair, it will lower its size estimate for that locus and generate a plot without the in-repeat reads.

This is in our documentation here: https://github.com/Illumina/REViewer/blob/master/docs/method-overview.md

The current version of REViewer visualizes repeats whose span does not exceed the fragment length and longer repeats are capped at the fragment length.
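The capping rule can be illustrated with a rough sketch. This is not REViewer's actual code: the helper `capped_repeat_units` is hypothetical, and the 500 bp fragment length is an arbitrary illustrative value, not one taken from this sample.

```python
def capped_repeat_units(called_units: int, unit_len_bp: int, fragment_len_bp: int) -> int:
    """Hypothetical helper: repeat units that fit once the span is capped at the fragment length."""
    span_bp = called_units * unit_len_bp
    if span_bp <= fragment_len_bp:
        # The repeat fits inside one fragment, so nothing is capped.
        return called_units
    # Otherwise keep only as many whole units as fit in one fragment.
    return fragment_len_bp // unit_len_bp


# Illustrative values only: a 6 bp unit and an assumed 500 bp fragment length.
print(capped_repeat_units(2, 6, 500))    # short allele, unchanged
print(capped_repeat_units(677, 6, 500))  # long allele, capped
```

Under these assumptions the short allele keeps its 2 units, while any allele whose span exceeds the fragment length is shown with fewer units than the genotyper reported.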

marc-sturm commented 1 year ago

Thanks a lot for the explanation!