Closed zzbbf123 closed 1 year ago
The coordinates in the VCF are 1-based and inclusive. If you are using a BED-based sequence extraction method (BED intervals are 0-based and half-open), you can see this error. In that case, try getting the sequences for "DeChr1:169712-169920" and "AlChr1:796673-796674".
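To illustrate the off-by-one shift, here is a minimal Python sketch. It assumes your extraction tool expects 0-based, half-open (BED-style) intervals. The helper names `vcf_to_bed` and `extract_1based` and the toy sequence are hypothetical and not part of syri.

```python
def vcf_to_bed(pos, end):
    """Convert a 1-based inclusive interval (VCF style) to a
    0-based half-open interval (BED style)."""
    return pos - 1, end


def extract_1based(seq, start, end):
    """Return the bases from start to end of seq, where both
    coordinates are 1-based and inclusive."""
    return seq[start - 1:end]


# Intervals from the record above: POS/END on DeChr1, StartB/EndB on AlChr1.
ref_interval = vcf_to_bed(169713, 169920)
qry_interval = vcf_to_bed(796674, 796674)

# Toy sequence (not real genome data) showing the shift.
toy = "ACGTACGTAC"
correct = extract_1based(toy, 2, 4)  # bases 2..4, 1-based inclusive
shifted = toy[2:4]                   # 1-based start misread as 0-based
```

If the bases you extract look shifted by one position relative to what the VCF describes, a coordinate-convention mismatch like the `shifted` case is a likely cause.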
I'm confused about a situation I found in the VCF file: the "REF" column is N and the "ALT" column is \<DEL>.
DeChr1 169713 DEL5 N <DEL> . PASS END=169920;ChrB=AlChr1;StartB=796674;EndB=796674;Parent=SYN13;VarType=ShV;DupType=.
I know this can happen when the length of the DEL/INS is greater than 100, as discussed in issue 132: https://github.com/schneebergerlab/syri/issues/132
However, when I extracted the sequences from the reference genome, the sequence in genome A did not match the sequence in genome B at the start position (C ≠ A).
This is just one of several small cases. I don't know what to make of this situation. Should this type of SV be deleted? Or how do I fix this problem?