seqan / iGenVar

The official repository for the iGenVar project.
BSD 3-Clause "New" or "Revised" License

Recheck TANDEM:DUPs #224

Open Irallia opened 2 years ago

Irallia commented 2 years ago

The mini example gives us the following results:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  MYSAMPLE
chr1    110 .   N   <DUP:TANDEM>    1   PASS    END=125;SVLEN=16;iGenVar_SVLEN=4;SVTYPE=DUP GT  ./.
chr1    181 .   N   <DUP:TANDEM>    1   PASS    END=188;SVLEN=8;iGenVar_SVLEN=7;SVTYPE=DUP  GT  ./.
chr1    181 .   N   <DUP:TANDEM>    2   PASS    END=188;SVLEN=8;iGenVar_SVLEN=8;SVTYPE=DUP  GT  ./.
chr1    509 .   N   <DUP:TANDEM>    1   PASS    END=529;SVLEN=21;iGenVar_SVLEN=9;SVTYPE=DUP GT  ./.

The real duplications are:

...GGG ATATATTT ATATATTT TAC... <- Tandem Duplication ref: (180, 188] tandem duplicated

...GCG TAACCCGGG TAACCCGGG TAACCCGGG TAACCCGGG TAACCCGGG TAC... Duplication in reference and read covered by SA tag ref: (509, 527] with copynumber=5
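To make the mismatch easier to see, here is a minimal Python sketch that compares the reported records with the expected intervals. The values are copied from the VCF output and the intervals stated above. The helper names are hypothetical, and treating POS..END as a closed range is an assumption made only for illustration, not a statement about how iGenVar computes these fields.

```python
# Records copied from the VCF output above: (POS, END, SVLEN, iGenVar_SVLEN).
reported = [
    (110, 125, 16, 4),
    (181, 188, 8, 7),
    (181, 188, 8, 8),
    (509, 529, 21, 9),
]

# Expected duplications as half-open reference intervals (start, end].
expected = [(180, 188), (509, 527)]


def interval_length(start, end):
    """Length of the half-open interval (start, end]."""
    return end - start


def inclusive_span(pos, end):
    """Number of positions in the closed range [pos, end] (assumed convention)."""
    return end - pos + 1


for pos, end, svlen, igenvar_svlen in reported:
    span = inclusive_span(pos, end)
    print(f"POS={pos} END={end} SVLEN={svlen} "
          f"END-POS+1={span} iGenVar_SVLEN={igenvar_svlen}")

for start, end in expected:
    print(f"({start}, {end}] has length {interval_length(start, end)}")
```

Under these assumptions, every reported SVLEN equals END - POS + 1, while the expected intervals have lengths 8 and 18. Only the two records at position 181 agree with an expected length. The record at 509 reports 21, and the record at 110 has no counterpart among the listed duplications.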

Jörg also asked why we calculate the SV length as the difference + 2. _Originally posted by @joergi-w in https://github.com/seqan/iGenVar/pull/223#discussion_r925268668_