Closed by Dkyuan 2 years ago
The arrangement of the end-points of consecutive alignments in an annotation block (grey alignments) is analysed to find structural variations.
When identifying indels between two neighbouring alignments, syri allows some overlap between the alignments, and this overlap can produce start and end positions like the ones you describe. The amount of overlap allowed is controlled by the --allow-offset parameter.
For practical usage, using the start position should be fine.
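For anyone post-processing the calls, here is a minimal Python sketch of how such coordinate pairs could be told apart. It is not part of syri: the function name `classify_span` and the three labels are hypothetical, and the numbers are the DEL17 and DEL18 coordinates quoted in this thread.

```python
def classify_span(start: int, end: int) -> str:
    """Label a start/end coordinate pair taken from one genome of a call."""
    if start > end:
        # Neighbouring alignments overlap slightly, so the reported
        # start lies after the reported end.
        return "reversed"
    if start == end:
        # A single anchor position, e.g. the query side of a deletion.
        return "point"
    return "span"


# Coordinates quoted in this thread.
print(classify_span(419797, 419795))  # DEL17, StartB/EndB
print(classify_span(427915, 427915))  # DEL18, StartB/EndB
print(classify_span(405833, 406191))  # DEL18, reference start/end
```

Records labelled "reversed" are the ones affected by the allowed overlap; for those, the start position can be used as the anchor, as suggested above.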
Thanks very much for your help.
I noticed that the default number of bp allowed to overlap is 5. To avoid such start and end positions, I should set parameter --allow-offset to OFFSET, is that right?
For now I will use the start position to move forward.
Thanks again!
I should set parameter --allow-offset to OFFSET, is that right?
No. You need to set --allow-offset to an integer value. To avoid such start/end positions, use --allow-offset 0.
However, when two alignments overlap by more than OFFSET base pairs, they are annotated as copyloss/copygain (see supplementary figure S8 of the syri paper). So, if OFFSET=0, even a 1 bp overlap between alignments would result in a copy-change call. Generally this is not desired, which is why an OFFSET value of 5 helps restrict false copy-change calls. You can try different values to adjust it to your requirements.
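To make the trade-off concrete, here is a simplified Python sketch of the rule described above. It is an illustration only, not syri's actual code: the function name `classify_junction`, the 1-based inclusive coordinates, and the exact boundary handling are assumptions.

```python
def classify_junction(end_prev: int, start_next: int, offset: int = 5) -> str:
    """Classify the junction between two consecutive alignments on one genome.

    end_prev   -- last aligned base of the first alignment (assumed inclusive)
    start_next -- first aligned base of the second alignment (assumed inclusive)
    offset     -- number of overlapping bp that is tolerated
    """
    overlap = end_prev - start_next + 1  # positive when the alignments overlap
    if overlap <= 0:
        return "no overlap"
    if overlap <= offset:
        # Small overlap: tolerated, but the coordinates derived from the
        # junction can end up with start > end.
        return "tolerated overlap"
    # Overlap larger than the offset: treated as a copy-number change.
    return "copy-change"


print(classify_junction(100, 99, offset=5))  # 2 bp overlap, tolerated
print(classify_junction(100, 99, offset=0))  # same overlap, now a copy-change
print(classify_junction(100, 110))           # alignments do not overlap
```

With the offset set to 0, the same 2 bp overlap that was tolerated at the default value is reported as a copy-change, which is the behaviour warned about above.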
@mnshgl0110 I got it. Thank you very much.
Hi, sir:
When I was checking the large InDels (PAV) in the vcf files, I was confused by the positions of the variations. (I checked the positions because I want to extract the PAV sequences.) Some examples are below. The positions of the second and fourth insertions (IDs "INS2" and "INS13") are what I expected: the start and end positions on the ref. genome are equal to each other, and StartB is less than EndB. So, can someone help me understand what happened when:
Similarly for deletions, "DEL18" matches my expectation: the start position (405833) is smaller than the end position (406191), and StartB is equal to EndB (427915). But what happened with "DEL17", where StartB (419797) is greater than EndB (419795)?
Thanks for your help ~
Xuan.