xjtu-omics / SVision

Detecting genome structural variants with deep learning in single molecule sequencing
GNU General Public License v3.0

SV description confusion regarding the difference between DUP and tDUP #11

Closed bo8883 closed 7 months ago

bo8883 commented 2 years ago

How does DUP vs. tDUP change the interpretation of these two variant examples? They come from Table S9 of the Nature Methods publication on SVision. From the description,

chr12 124,438,675 124,439,617 15625 6 INS:320-124439627-124439628,tDUP:933-124438693-124439627

this means that chr12: 124,438,675-124,439,617 is changed to insertion of chr12: 124439627-124439628 followed by chr12:124438693-124439627. (or maybe chr12:124438693-124439627 is duplicated again because it’s tDUP?)

chr5 51,925,593 51,927,635 6954 40 INS:2358-51925589-51925610,DUP:72-51927330-51927409

This means that the chr5:51,925,593-51,927,635 interval is changed to an insertion of chr5:51925589-51925610 followed by chr5:51927330-51927409.

In these two cases, there's no difference between how tDUP and DUP change the structure of the complex SV.

So why is one described as INS:DUP and one as INS:tDUP?

Thank you for your response.

jiadong324 commented 2 years ago
chr12 124,438,675 124,439,617 15625 6 INS:320-124439627-124439628,tDUP:933-124438693-124439627

This means that chr12: 124,438,675-124,439,617 is changed to insertion of chr12: 124439627-124439628 followed by chr12:124438693-124439627. (or maybe chr12:124438693-124439627 is duplicated again because it’s tDUP?)

Answer: Your description is correct. At locus chr12:124,438,675-124,439,617, SVision first identifies an inserted sequence of length 320 bp at chr12:124439627, and part of this inserted sequence comes from a tandem repeat sequence. In other words, SVision identifies the source of part of the inserted sequence. This is very common, as with the CSV at CNTN5 described in our paper, and we term this a complex insertion.
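
For anyone reading these descriptions programmatically, here is a minimal Python sketch that splits a description string into its components. It assumes each component has the form TYPE:length-start-end, which is inferred only from the two rows quoted above and from the "length 320bp" reading in the answer. It is not taken from SVision's code, and the names `SVComponent` and `parse_sv_description` are hypothetical.

```python
from typing import List, NamedTuple


class SVComponent(NamedTuple):
    """One component of a complex SV description (hypothetical helper type)."""
    sv_type: str   # e.g. "INS", "DUP", "tDUP"
    length: int    # reported length in bp (assumed to be the first number)
    start: int     # assumed start coordinate
    end: int       # assumed end coordinate


def parse_sv_description(description: str) -> List[SVComponent]:
    """Split 'TYPE:len-start-end,TYPE:len-start-end' into components.

    Assumes the field layout seen in the quoted Table S9 rows.
    """
    components = []
    for field in description.split(","):
        # Separate the type label from the three dash-delimited numbers.
        sv_type, _, numbers = field.strip().partition(":")
        length, start, end = (int(x) for x in numbers.split("-"))
        components.append(SVComponent(sv_type, length, start, end))
    return components


if __name__ == "__main__":
    row = "INS:320-124439627-124439628,tDUP:933-124438693-124439627"
    for comp in parse_sv_description(row):
        print(comp)
```

Under this reading, the first row yields an INS component of length 320 placed at 124439627-124439628 and a tDUP component of length 933 sourced from 124438693-124439627, which matches the interpretation confirmed above.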