arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

Number of split reads supporting the variant is always 0 #350

Open yasin-uzun opened 3 years ago

yasin-uzun commented 3 years ago

I first compute split and discordant reads then run lumpyexpress as recommended:

lumpyexpress \
    -B my.bam \
    -S my.splitters.bam \
    -D my.discordants.bam \
    -o output.vcf

When I check the output VCF file, I see that the only evidence for every variant is discordant reads: SR=0 for all the variants called. Is this normal? My samples are quite deep (100x mean coverage). Is it related to that, or to something else?

ryanlayer commented 3 years ago

That is not normal. What is the library? How did you align your reads?

yasin-uzun commented 3 years ago

Thank you very much for the fast reply. It is Illumina data. I aligned the reads with BWA.

yasin-chop commented 3 years ago

Here are some example SV calls:

chr1    934087  1   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:14;SVLEN=-764;END=934851;CIPOS=-10,259;CIEND=-272,9;CIPOS95=-3,38;CIEND95=-43,2;IMPRECISE;SU=14;PE=14;SR=0    GT:SU:PE:SR ./.:14:14:0
chr1    1617351 2   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:5;SVLEN=-746;END=1618097;CIPOS=-10,650;CIEND=-602,9;CIPOS95=-2,153;CIEND95=-147,2;IMPRECISE;SU=5;PE=5;SR=0    GT:SU:PE:SR ./.:5:5:0
chr1    1657118 3   N   <DUP>   .   .   SVTYPE=DUP;STRANDS=-+:4;SVLEN=65444;END=1722562;CIPOS=-684,9;CIEND=-10,360;CIPOS95=-136,2;CIEND95=-2,106;IMPRECISE;SU=4;PE=4;SR=0   GT:SU:PE:SR ./.:4:4:0

ryanlayer commented 3 years ago

Bwa mem?

yasin-uzun commented 3 years ago

Yes, bwa mem

ryanlayer commented 3 years ago

Hm, I’m not sure. It looks like it is not finding split reads, which is surprising. Can you verify they are in there? Check out the SAM spec for more details. The spec calls them chimeric alignments.
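
One way to check for chimeric alignments is to count the records that carry an `SA` tag or have the supplementary flag bit (0x800) set, both of which are defined in the SAM spec. The sketch below is only an illustration and not part of LUMPY: it assumes tab-separated SAM text (for example, the output of `samtools view`), and the function name `count_chimeric` is hypothetical.

```python
SUPPLEMENTARY = 0x800  # SAM flag bit for a supplementary alignment


def count_chimeric(sam_lines):
    """Return (records with an SA tag, records flagged supplementary)."""
    with_sa = 0
    supplementary = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & SUPPLEMENTARY:
            supplementary += 1
        # Optional tags start at column 12 (index 11).
        if any(tag.startswith("SA:Z:") for tag in fields[11:]):
            with_sa += 1
    return with_sa, supplementary


# Hypothetical two-record example; SEQ and QUAL are replaced by "*".
example = [
    "r1\t2113\tchr1\t100\t60\t50M50S\t=\t300\t0\t*\t*\tSA:Z:chr1,500,+,50S50M,60,0;",
    "r2\t99\tchr1\t100\t60\t100M\t=\t300\t300\t*\t*\tNM:i:0",
]
print(count_chimeric(example))
```

If both counts are zero for the full BAM, the aligner output contains no split alignments for a caller to use.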

yasin-uzun commented 3 years ago

I tried it both for my own data and NA12878 data. I got the same results. My workflow is like this: bwa-mem -> sort by read id -> samblaster -> discordant and split reads -> lumpyexpress .
Sure, I will double check. Thank you very much.

ryanlayer commented 3 years ago

Is your split read file empty?

yasin-uzun commented 3 years ago

No, actually. The split file is 1.6 GB. There are a lot of alignments in there. Example:

A00564:244:H3LGKDSXY:1:1377:6650:16329_1    113 chr1    9998    0   82S69M  chr4    190122811   0   CTAACCCTAACCCTACCCCTAACCCTAACAATAACCCTAACCATATAGGTTTCCCTAACGGTTTCGCTTACAGTATAAATATCGATATCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC F,:,FFF,::FFF,,,F,,F,,FF,,:FF:,,FF,,FF,FFF,,F,::,:,:FF,F:,F,,::,F,F,,F,,,,,::,,FF:,,,,F:FFFFFFFFFFFFFFFFFFF::FF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF SA:Z:chr12_GL877875v1_alt,265,-,42M109S,0,2;    MC:Z:70S81M MD:Z:5A63   RG:Z:7767_765_REG1  NM:i:1  MQ:i:0  AS:i:64 XS:i:63
A00564:244:H3LGKDSXY:1:2101:11189:33144_1   113 chr1    9998    0   82S69M  chr2    242183414   0   CTACCCCTCCCCCTAACCCTAACACTAACCCTAACCCTAACCCTAACCCTATCCCTATCGGTTTCCCTTACAATTTAACTATCGATATCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC F,,,FFF,,,FFF,:,F,,FF,F,F:F::,,,:,,,FFFFF,,FF,,,,F,,FF,F,:F,,,,,F,F:,:,:,,,,,,,FF,,,,,F,,FFFF:FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF SA:Z:chr6,1114756,-,10S38M103S,0,0; MC:Z:57S94M MD:Z:5A63   RG:Z:7767_765_REG1  NM:i:1  MQ:i:5  AS:i:64 XS:i:64
A00564:244:H3LGKDSXY:1:2214:7681:35055_1    81  chr1    9998    0   100S51M chr7    10029   0   CTCACCCCCCCCCGACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTACGGCTTACGCTATCGGTGTCCGTATCGTGTGCTCTGAGATGAGCACTAGCGATAACCCTAACCCTAACCCTTACCCTAACCCTAACCCTAACCCTAACCC F,,,FFF,:,FFF,,:FFF,,,FFF,F,FF::,:::,F,,F,,,,F,,F,F,,,FF,:F,:,FF,::F,,F,,F,,:,:::,F,F,,:,,,,,,,:,FF,,,,,F,FFFFFFF:FFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF SA:Z:chr1,180957,-,14S37M100S,0,0;  MC:Z:58M3I27M63S    MD:Z:22A28  RG:Z:7767_765_REG1  NM:i:1  MQ:i:0  AS:i:46 XS:i:43
A00564:244:H3LGKDSXY:2:1262:19135:7106_1    81  chr1    9998    0   82S69M  chr5    10940   0   CGAACCCGAACCCTAACCCTAACCCTAACCCTACCCCTAACCCTAAAGGTCTCCGTATCGATGTCTCTTAGATTATAAATATCGATATCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC F,,,FFF,,,F:F,,::,,F,:F::,FF,:::F,:,:F,FF,,,F,,,,F,,FF,F,,F,,,,,F,F,,:,,,F,,,:,:F,,,,::,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF SA:Z:chr4,190123039,+,105S46M,0,1;  MC:Z:74M77S MD:Z:5A63   RG:Z:7767_765_REG1  NM:i:1  MQ:i:0  AS:i:64 XS:i:63

But I realized that all the split alignment read pairs are on different chromosomes, which LUMPY doesn't support. There are no splits within the same chromosome. I guess this is the issue: chimeric reads are probably missing from my alignment file. I will take a closer look. Thank you very much for your help.
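
Per the SAM spec, the other piece of a split read is described by the `SA` tag (`rname,pos,strand,CIGAR,mapQ,NM;`), while columns 7 and 8 (RNEXT/PNEXT) describe the mate. A rough way to tally same-chromosome versus different-chromosome splits is therefore to compare column 3 with the `SA` reference name. This is a minimal sketch under that assumption; `parse_sa` and `tally_split_chromosomes` are hypothetical helper names, and the input is assumed to be tab-separated SAM text.

```python
from collections import Counter


def parse_sa(sa_value):
    """Split an SA:Z value into (rname, pos, strand, cigar, mapq, nm) tuples."""
    hits = []
    for entry in sa_value.rstrip(";").split(";"):
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        hits.append((rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return hits


def tally_split_chromosomes(sam_lines):
    """Count SA entries on the same vs. a different reference as the record."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        rname = fields[2]
        for tag in fields[11:]:
            if tag.startswith("SA:Z:"):
                for hit in parse_sa(tag[5:]):
                    counts["same" if hit[0] == rname else "different"] += 1
    return counts


# Two of the records above, shortened: SEQ/QUAL replaced by "*", other tags dropped.
example = [
    "A00564:244:H3LGKDSXY:1:2214:7681:35055_1\t81\tchr1\t9998\t0\t100S51M\tchr7\t10029\t0\t*\t*\tSA:Z:chr1,180957,-,14S37M100S,0,0;",
    "A00564:244:H3LGKDSXY:2:1262:19135:7106_1\t81\tchr1\t9998\t0\t82S69M\tchr5\t10940\t0\t*\t*\tSA:Z:chr4,190123039,+,105S46M,0,1;",
]
print(tally_split_chromosomes(example))
```

Running this over the whole splitters file separates the two cases without relying on the mate columns.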

ryanlayer commented 3 years ago

You should check out our new wrapper smoove.

https://github.com/brentp/smoove

yasin-uzun commented 3 years ago

I sure will. Thank you very much.

yasin-uzun commented 3 years ago

Thank you very much again for the guidance. I am looking at smoove. I have one more question. I also ran lumpy on the WGS data for NA12878 as a control. I noticed that there are a lot of split (chimeric) reads with an SA tag, on both the same and different chromosomes.

$ samtools view SRR622457.filt.10x.samblaster.splitters.sorted.bam | grep -v chrM | head | cut -f 1-9,12-16
SRR622457.392240735_2   129 chr1    9999    0   37S64M  chr5    18606620    0   NM:i:0  MD:Z:64 AS:i:64 XS:i:62 SA:Z:chr5,18606943,-,55S46M,0,0;
SRR622457.392240745_2   129 chr1    9999    0   29S66M6S    chr5    18606608    0   NM:i:0  MD:Z:66 AS:i:66 XS:i:65 SA:Z:chr5,18606943,-,63S38M,0,0;
SRR622457.1025666611_1  97  chr1    10182   0   65M36S  chr16   69831   0   NM:i:2  MD:Z:0A53A10    AS:i:59 XS:i:58 SA:Z:chrX,155249783,-,4S31M1I15M50S,0,2;
SRR622457.384029444_2   177 chr1    10230   0   12S12M1I45M31S  chr12   133841726   0   NM:i:1  MD:Z:57 AS:i:50 XS:i:50 SA:Z:chr8,170448,-,59S42M,0,3;
SRR622457.3727_2    177 chr1    10299   0   46S31M1D23M1S   chrX    155260174   0   NM:i:3  MD:Z:5A5A19^C23 AS:i:37 XS:i:36 SA:Z:chr12,9568754,+,49S41M11S,0,1;
SRR622457.384027060_2   177 chr1    10353   0   40S23M1D37M1S   =   249240338   249230032   NM:i:3  MD:Z:23^A8C22A5 AS:i:43 XS:i:43 SA:Z:chr22,16477923,-,6S58M37S,0,4;
SRR622457.1304459224_1  65  chr1    10359   0   35M66S  chr21   48119493    0   NM:i:0  MD:Z:35 AS:i:35 XS:i:35 SA:Z:chr1,249239781,-,45M56S,0,3;
SRR622457.384027592_2   177 chr1    10378   0   33S65M3S    =   249240500   249230161   NM:i:0  MD:Z:65 AS:i:65 XS:i:65 SA:Z:chr9,10417,-,35M66S,0,1;
SRR622457.2904_1    81  chr1    10388   0   37S64M  chr18   10359   0   NM:i:1  MD:Z:55C8   AS:i:59 XS:i:59 SA:Z:chr12,133817820,+,45S39M1D17M,0,4;
SRR622457.1025665812_1  97  chr1    17179   0   59M42S  chr16   68385   0   NM:i:0  MD:Z:59 AS:i:59 XS:i:59 SA:Z:chr2,114352483,-,42M59S,0,0;

However, in the output VCF file, SR is always 0:

$ grep -v "##" SRR622457.10x.exclude_LCR_only.vcf | head
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  SRR622457
chrM    1104    1   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:12;SVLEN=-113;END=1217;CIPOS=-10,111;CIEND=-104,9;CIPOS95=-4,9;CIEND95=-30,3;IMPRECISE;SU=12;PE=12;SR=0   GT:SU:PE:SR ./.:12:12:0
chrM    1487    2   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:7;SVLEN=-302;END=1789;CIPOS=-10,300;CIEND=-260,9;CIPOS95=-1,155;CIEND95=-107,1;IMPRECISE;SU=7;PE=7;SR=0   GT:SU:PE:SR ./.:7:7:0
chrM    2412    3   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:29;SVLEN=-124;END=2536;CIPOS=-10,119;CIEND=-15,9;CIPOS95=-3,26;CIEND95=-13,2;IMPRECISE;SU=29;PE=29;SR=0   GT:SU:PE:SR ./.:29:29:0
chrM    2873    4   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:27;SVLEN=-230;END=3103;CIPOS=-10,170;CIEND=-224,9;CIPOS95=-2,46;CIEND95=-83,1;IMPRECISE;SU=27;PE=27;SR=0  GT:SU:PE:SR ./.:27:27:0
chrM    8233    5   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:6;SVLEN=-139;END=8372;CIPOS=-10,137;CIEND=-138,9;CIPOS95=-2,35;CIEND95=-46,2;IMPRECISE;SU=6;PE=6;SR=0 GT:SU:PE:SR ./.:6:6:0
chrM    10370   6   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:30;SVLEN=-196;END=10566;CIPOS=-10,157;CIEND=-195,9;CIPOS95=-2,52;CIEND95=-68,1;IMPRECISE;SU=30;PE=30;SR=0 GT:SU:PE:SR ./.:30:30:0
chrM    11496   7   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:26;SVLEN=-181;END=11677;CIPOS=-10,143;CIEND=-127,9;CIPOS95=-2,29;CIEND95=-26,1;IMPRECISE;SU=26;PE=26;SR=0 GT:SU:PE:SR ./.:26:26:0
chrM    12383   8   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:21;SVLEN=-170;END=12553;CIPOS=-10,137;CIEND=-161,9;CIPOS95=-2,49;CIEND95=-42,1;IMPRECISE;SU=21;PE=21;SR=0 GT:SU:PE:SR ./.:21:21:0
chrM    12832   9   N   <DEL>   .   .   SVTYPE=DEL;STRANDS=+-:25;SVLEN=-2;END=12834;CIPOS=-9,0;CIEND=0,9;CIPOS95=-9,0;CIEND95=0,4;IMPRECISE;SU=25;PE=25;SR=0    GT:SU:PE:SR ./.:25:25:0

$ grep -v "##" SRR622457.10x.exclude_LCR_only.vcf | grep -v "SR=0"
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  SRR622457
$ 
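
The same check can be done by parsing the INFO column instead of matching the text `SR=0`. The sketch below is an illustration only: it assumes a tab-separated VCF with the `SU`, `PE`, and `SR` INFO keys shown above, and the helper names `parse_info` and `summarize_support` are hypothetical.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flags such as IMPRECISE map to True."""
    out = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            out[key] = value
        elif item:
            out[item] = True
    return out


def summarize_support(vcf_lines):
    """Return (number of records, number of records with SR > 0)."""
    total = 0
    with_sr = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        info = parse_info(line.rstrip("\n").split("\t")[7])
        total += 1
        if int(info.get("SR", 0)) > 0:
            with_sr += 1
    return total, with_sr


# First record from the listing above.
example = [
    "chrM\t1104\t1\tN\t<DEL>\t.\t.\tSVTYPE=DEL;STRANDS=+-:12;SVLEN=-113;END=1217;"
    "CIPOS=-10,111;CIEND=-104,9;CIPOS95=-4,9;CIEND95=-30,3;IMPRECISE;SU=12;PE=12;SR=0"
    "\tGT:SU:PE:SR\t./.:12:12:0",
]
print(summarize_support(example))
```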

I am not sure what is wrong. I am running lumpyexpress with default parameters. Are they too stringent? Do you have any idea about the cause? Thanks.

ryanlayer commented 3 years ago

All of those alignments have MAPQ=0, which lumpy ignores. There is a cmd line setting for mapq, but zero will give you a ton of false positives.

yasin-uzun commented 3 years ago

Sorry, I should have chosen better examples. Please see these:

$ samtools view SRR622457.filt.10x.samblaster.splitters.sorted.bam | grep -v chrM | cut -f 1-9,12-16 | awk '$5>30 && $7=="="' | head
SRR622457.427566_2  177 chr1    1037455 60  27S74M  =   1039555 2128    NM:i:0  MD:Z:74 AS:i:74 XS:i:37 SA:Z:chr20,25364573,-,7S38M56S,16,1;
SRR622457.576314_2  145 chr1    1530687 60  46S55M  =   1530175 -567    NM:i:0  MD:Z:55 AS:i:55 XS:i:0  SA:Z:chr1,1530323,-,61M40S,20,4;
SRR622457.1307592981_1  113 chr1    1648453 58  34S67M  =   93938256    92289838    NM:i:1  MD:Z:40T26  AS:i:62 XS:i:42 SA:Z:chr12,123935505,-,32M69S,0,0;
SRR622457.658268_2  177 chr1    1769650 60  29S72M  =   1769230 -392    NM:i:2  MD:Z:19T11T40   AS:i:62 XS:i:22 SA:Z:chr1,1769005,+,67S34M,0,0;
SRR622457.1361288611_1  81  chr1    1880374 60  37S64M  =   1880374 -64 NM:i:2  MD:Z:7T4T51 AS:i:54 XS:i:31 SA:Z:chr12,51150734,-,2S13M1I35M50S,0,3;
SRR622457.4756736_2 129 chr1    1999244 53  44S57M  =   12664761    10665518    NM:i:3  MD:Z:21A20T1T12 AS:i:42 XS:i:22 SA:Z:chr1,12665085,-,57S44M,19,2;
SRR622457.1307594374_1  97  chr1    2053326 46  34S67M  =   2055830 2605    NM:i:6  MD:Z:10C22G5A1C11G11G1  AS:i:40 XS:i:19 SA:Z:chr1,2055691,+,65M36S,14,5;
SRR622457.1307601762_2  145 chr1    2629209 32  33S68M  =   2587227 -42050  NM:i:2  MD:Z:36C14C16   AS:i:58 XS:i:46 SA:Z:chr1,2617094,-,5S50M46S,0,3;
SRR622457.1307601772_2  145 chr1    2629209 31  34S67M  =   2585165 -44111  NM:i:3  MD:Z:36C7T6C15  AS:i:52 XS:i:40 SA:Z:chr1,2622430,-,54M47S,2,5;
SRR622457.966269_2  145 chr1    2764484 60  63M38S  =   2764383 -164    NM:i:2  MD:Z:3G3A55 AS:i:55 XS:i:20 SA:Z:chr1,167185549,-,56S45M,60,0;

There are about 14.5K split reads with MAPQ > 30. Is that too few? Or is there some other issue? Thanks.
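
Note that the `awk` filter above tests column 5, the MAPQ of the listed alignment, while each `SA` entry carries its own mapping quality as its fifth comma-separated field. Whether LUMPY applies its MAPQ threshold to the `SA` side is not stated in this thread, so the following is only a way to inspect the data, not a description of LUMPY's filtering. It assumes tab-separated SAM text; `splits_passing` is a hypothetical name.

```python
def splits_passing(sam_lines, min_mapq):
    """List (qname, mapq, sa_mapq) where both sides reach min_mapq."""
    kept = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        mapq = int(fields[4])
        for tag in fields[11:]:
            if not tag.startswith("SA:Z:"):
                continue
            for entry in tag[5:].rstrip(";").split(";"):
                sa_mapq = int(entry.split(",")[4])  # fifth SA field is mapQ
                if mapq >= min_mapq and sa_mapq >= min_mapq:
                    kept.append((fields[0], mapq, sa_mapq))
    return kept


# First and last records from the listing above; SEQ/QUAL replaced by "*".
example = [
    "SRR622457.427566_2\t177\tchr1\t1037455\t60\t27S74M\t=\t1039555\t2128\t*\t*\tSA:Z:chr20,25364573,-,7S38M56S,16,1;",
    "SRR622457.966269_2\t145\tchr1\t2764484\t60\t63M38S\t=\t2764383\t-164\t*\t*\tSA:Z:chr1,167185549,-,56S45M,60,0;",
]
for row in splits_passing(example, 30):
    print(row)
```

Several of the records listed above have an `SA` mapping quality of 0 even though column 5 is above 30, so counting both sides may give a much smaller number than 14.5K.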

ryanlayer commented 3 years ago

Can you plot a few of your SVs to see if there should be SR support? IGV or samplot (shameless self promotion, https://github.com/ryanlayer/samplot) are good options.

yasin-uzun commented 3 years ago

Thank you. Samplot seems like a great tool. I am glad you have mentioned it. Please see three example SVs below.

chr1    11682892    23  N   <DEL>   .   .   SVTYPE=DEL  STRANDS=+-:10   SVLEN=-297  END=11683189    CIPOS=-10,236   CIEND=-189,9    CIPOS95=0,93    CIEND95=-50,1   IMPRECISE   SU=10   PE=10   SR=0    GT:SU:PE:SR ./.:10:10:0
chr1    16151893    24  N   <DEL>   .   .   SVTYPE=DEL  STRANDS=+-:7    SVLEN=-3546 END=16155439    CIPOS=-10,246   CIEND=-293,9    CIPOS95=0,95    CIEND95=-120,0  IMPRECISE   SU=7    PE=7    SR=0    GT:SU:PE:SR ./.:7:7:0
chr1    16890603    25  N   <DUP>   .   .   SVTYPE=DUP  STRANDS=-+:5    SVLEN=3257  END=16893860    CIPOS=-187,9    CIEND=-10,144   CIPOS95=-44,2   CIEND95=-2,64   IMPRECISE   SU=5    PE=5    SR=0    GT:SU:PE:SR ./.:5:5:0

And here are the samplots in the same order:

[Images not shown: samplot figures for chr1_11682892_11683189, chr1_16151893_16155439, chr1_16890603_16893860]

Since this is public data, I will be happy to share my bam/vcf files. If you want to take a look, please let me know and I can send a link to you. Thanks.

ryanlayer commented 3 years ago

It looks like your images didn’t make it.

yasin-uzun commented 3 years ago

Thank you for the reply, but sorry, I don't think I understood. Do you mean that split reads didn't provide any evidence for these events?