fritzsedlazeck / SURVIVOR

Toolset for SV simulation, comparison and filtering
MIT License

Bug found: SVLEN value cut #213

Open leone93 opened 4 months ago

leone93 commented 4 months ago

Hi Fritz, I just found this bug while trying to merge the SVs from 5 different callers (pbsv, svim, sniffles, cutesv, svim-asm) for the same sample.

These are only two examples (just part of each VCF line), but the files are full of them:

Chr11 5514126 svim_asm.INS.11708,pbsv.INS.27728 G GAATACATATATCCAAGGAAAAAGTACGAATTACACCCC> SVLEN=84
Chr11 5514126 svim.INS.17070,cuteSV.INS.2923 G GAATACATATATCCAAGGAAAAAGTACGAATTACACCCCTGAACTAT> SVLEN=1584

Chr11 5710087 svim_asm.INS.11719,pbsv.INS.27758 G GCTATTTGTAACACTCTGAAAATTCGACCGACAAAATCA> SVLEN=60
Chr11 5710087 svim.INS.17086,Sniffles2.INS.162SA,cuteSV.INS.2935 G GGCTATTTGTAACACTCTGAAAA> SVLEN=3760

Two notes: the original files were parsed/simplified to make SURVIVOR's merging easier, dropping most of the fields. Where the error appears, the input files are correct: there the original value is the same as for the other 3 callers, 1584 and 3760. It seems that only for the first two, pbsv and svim-asm (the last 2 in the sample list that I give to SURVIVOR), the first two digits of the SVLEN are cut, so 1584 becomes 84 and 3760 becomes 60. This looks very weird to me, but it happens everywhere. Any idea?
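
To get a sense of how widespread this is, one option is to scan the merged output for positions where one SVLEN is just the trailing digits of another SVLEN at the same site. The following is a minimal Python sketch, not part of SURVIVOR. The function name `find_truncated_svlen` is hypothetical, and the whitespace-separated layout (CHROM, POS, ID first, with an `SVLEN=` token somewhere on the line) is an assumption taken from the simplified excerpts above, not from a full VCF.

```python
import re
from collections import defaultdict

# Assumption: each record carries a single SVLEN=<integer> token.
SVLEN_RE = re.compile(r"SVLEN=(-?\d+)")

# The four simplified records quoted in this issue.
EXAMPLE_LINES = [
    "Chr11 5514126 svim_asm.INS.11708,pbsv.INS.27728 G GAATACATATATCCAAGGAAAAAGTACGAATTACACCCC> SVLEN=84",
    "Chr11 5514126 svim.INS.17070,cuteSV.INS.2923 G GAATACATATATCCAAGGAAAAAGTACGAATTACACCCCTGAACTAT> SVLEN=1584",
    "Chr11 5710087 svim_asm.INS.11719,pbsv.INS.27758 G GCTATTTGTAACACTCTGAAAATTCGACCGACAAAATCA> SVLEN=60",
    "Chr11 5710087 svim.INS.17086,Sniffles2.INS.162SA,cuteSV.INS.2935 G GGCTATTTGTAACACTCTGAAAA> SVLEN=3760",
]


def find_truncated_svlen(lines):
    """Report same-site record pairs where one SVLEN is a strict
    trailing-digit suffix of the other (e.g. 84 vs 1584)."""
    by_site = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split()
        match = SVLEN_RE.search(line)
        if len(fields) < 3 or match is None:
            continue  # not a record in the assumed layout
        chrom, pos, record_id = fields[0], fields[1], fields[2]
        # Compare digit strings only; the sign is ignored.
        by_site[(chrom, pos)].append((record_id, match.group(1).lstrip("-")))

    suspicious = []
    for site, records in by_site.items():
        for short_id, short_len in records:
            for long_id, long_len in records:
                if len(short_len) < len(long_len) and long_len.endswith(short_len):
                    suspicious.append((site, short_id, short_len, long_id, long_len))
    return suspicious


if __name__ == "__main__":
    for hit in find_truncated_svlen(EXAMPLE_LINES):
        print(hit)
```

A suffix match is only a heuristic: two unrelated calls at the same position could share trailing digits by chance, so any hit should still be checked against the original caller files.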