Open Yiguan opened 3 years ago
You can just not pass in the -sr argument.
What is the mapping quality of that read?
On Jun 17, 2021, at 4:45 PM, Yiguan @.***> wrote:
Hi,
When a region having lots of discordant reads and a few split reads, I find the inferred breakpoints rely too much on the split reads. For example,
3L 500033 2397 N
550.60 . SVTYPE=INV;SVLEN=1343;END=501376;STRANDS=++:38,--:30;IMPRECISE;CIPOS=-7,9;CIEND=-10,9;CIPOS95=-1,5;CIEND95=-3,1;SU=68;PE=67;SR=1 The region has 67 discordant reads and 1 split read. The inferred breakpoint 500033 is heavily relied on the one split read. But I tend to believe that the one split read is likely to be a mapping error or misalignment. I am wondering if there is a way to remove the split read while only use the discordant reads to infer inversions. Or can I set a threshold for the minimum number of split reads? eg. when the number of split read > 5, then these split reads can be used.
Thanks,
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.
The mapping quality of the read is 60, Cigar="49H41M".
Here is the IGV for the read (view as pair):
https://github.com/Yiguan/miscellaneous/blob/main/aa.png
In the screenshot, there are four sample tracks(three samples):
sample_95 discordant track
sample_95 split track
sample_88 discordant track
sample_62 discordant track
sample_95 3L 500033 2397 N <INV> . . SVTYPE=INV;STRANDS=++:38,--:30;SVLEN=1343;END=501376;CIPOS=-7,9;CIEND=-10,9;CIPOS95=-1,5;CIEND95=-3,1;IMPRECISE;SU=68;PE=67;SR=1 GT:SU:PE:SR ./.:68:67:1
sample_88 3L 500334 2221 N <INV> . . SVTYPE=INV;STRANDS=++:23,--:40;SVLEN=771;END=501105;CIPOS=-174,9;CIEND=-40,9;CIPOS95=-122,0;CIEND95=-30,0;IMPRECISE;SU=63;PE=63;SR=0 GT:SU:PE:SR ./.:63:63:0
sample_62 3L 500306 2541 N <INV> . . SVTYPE=INV;STRANDS=++:21,--:22;SVLEN=838;END=501144;CIPOS=-178,9;CIEND=-143,11;CIPOS95=-126,0;CIEND95=-128,2;IMPRECISE;SU=43;PE=43;SR=0 GT:SU:PE:SR ./.:43:43:0
The first breakpoint of the inversion in sample_95 was inferred in the "region1"(in the screenshot), while sample_88 and sample_62, the first breakpoint of the inversion were in "region2". As the three samples are from the same family(siblings), I believe they should have the same inversion and "region1" is unlikely to be a true breakpoint. Just confused why sample_95 having a breakpoint in "region1". I suspect that it may be caused by the split read(second track in the screenshot) as the read falls into the "region1".
Hi,
When a region having lots of discordant reads and a few split reads, I find the inferred breakpoints rely too much on the split reads. For example,
The region has 67 discordant reads and 1 split read. The inferred breakpoint
500033
is heavily relied on the one split read. But I tend to believe that the one split read is likely to be a mapping error or misalignment.I am wondering if there is a way to remove the split read while only use the discordant reads to infer inversions. Or can I set a threshold for the minimum number of split reads? eg. when the number of split read > 5, then these split reads can be used.
Thanks,