Closed JYLeeBioinfo closed 5 years ago
Thanks for bringing this up. Back when I implemented that mode I was not really aware of the standard way to indicate phasing in the GT column, so I indicated it through the variant ID instead. Two events are on the same haplotype if they share the same variant ID prefix. The full IDs are still unique because I append _1, _2 and so on.
I will reassign this to an enhancement. Thanks Fritz
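To make the ID convention described above concrete, here is a minimal Python sketch that splits an ID into its cluster prefix and index and groups IDs by cluster. It assumes the '<cluster>_<index>' layout seen in this thread; the function names are hypothetical helpers and not part of Sniffles.

```python
from collections import defaultdict

def split_cluster_id(variant_id):
    """Split an ID such as '14647_1' into ('14647', 1).

    Assumes the '<cluster>_<index>' layout; IDs without a numeric
    suffix are treated as unclustered and get index None.
    """
    cluster, sep, index = variant_id.rpartition("_")
    if not sep or not index.isdigit():
        return variant_id, None
    return cluster, int(index)

def group_by_cluster(variant_ids):
    """Map each cluster prefix to the list of its member IDs."""
    groups = defaultdict(list)
    for vid in variant_ids:
        cluster, index = split_cluster_id(vid)
        if index is not None:
            groups[cluster].append(vid)
    return dict(groups)

ids = ["14647_1", "14647_2", "14647_3", "15771_0", "15771_1", "900"]
print(group_by_cluster(ids))
```

Under this convention, all IDs that end up in the same group would be read as events on the same haplotype.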
Thank you for your reply and sorry for my late response.
I tried with new data with longer read lengths and got the SV IDs with _0, _1, _2 and so on.
However, I found that duplicate IDs exist for ~30 phased events (but not for unphased SVs).
How should I interpret those events?
The sniffles command I used is as follows:
sniffles --report_BND -s 8 -n -1 -t 80 -m $BAM_filt -v test.vcf --max_distance 1000 --minmapping_qual 30 --genotype --cluster --report_seq --report_read_strands
Here are some examples I found. As for the first example, I could not find the 14647_0 event.
chr21 10692365 14647_1 CCATTCCATTCCATTCCATTCCATTCCATTCCATTC N . PASS PRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr21;END=10692401;STD_quant_start=3.193744;STD_quant_stop=3.962323;Kurtosis_quant_start=-0.415687;Kurtosis_quant_stop=0.085988;SVTYPE=DEL;RNAMES=1bae3244-8d9b-4956-bc62-e457ed383d7b,1cd2fe57-9e04-4489-b748-6af2c1c74c55,2626cb93-345e-4b8d-8b93-794616c97dbd,42a8d08c-98eb-41c2-9c16-a9a913bfd196,50408290-4af8-4060-9b8f-a52abb1d7678,56cb85df-b2d2-42e1-b93b-db5583cbe919,611db715-60cf-4c72-afab-0aa844386f0e,801b5734-28ca-47e9-9040-61933de402fb,8df87b11-5e8b-4fec-bc33-58f0a3ff919f,91d54902-48ac-4aa1-8bfa-a3723713034f,a1caa5ff-31a0-42e7-9e39-2e0d45aa98b2,a59019dd-6114-46f6-a7d8-bad9338f2332,c49a7b74-07d8-4bd0-9b5e-b94a33a31dd0,d429fa79-e8be-405f-9968-48463d1e8b3d,dbd5e813-dc7b-48ba-9ce0-c4cfcf2d91e8;SUPTYPE=AL;SVLEN=-36;STRANDS=+-;STRANDS2=15,0,15,0;RE=15;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:15
chr21 10692670 14647_1 CCGTTCCATTCCATTCCATTCCAGTCCATTCCACTGGAGTCCATTCCATTCCATTCCATTCCATTCCATTCCATTGCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTGCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTGCATTCCAT N . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr21;END=10692881;STD_quant_start=5.686241;STD_quant_stop=27.010800;Kurtosis_quant_start=1.941410;Kurtosis_quant_stop=-0.302489;SVTYPE=DEL;RNAMES=1bae3244-8d9b-4956-bc62-e457ed383d7b,1cd2fe57-9e04-4489-b748-6af2c1c74c55,2626cb93-345e-4b8d-8b93-794616c97dbd,365e99f5-7409-4801-bad4-b4a6373ed3e7,42a8d08c-98eb-41c2-9c16-a9a913bfd196,452195e3-e47e-454c-a349-a4bc524a7342,4d09d086-686d-438b-85a7-223c2ff01d4e,50408290-4af8-4060-9b8f-a52abb1d7678,56cb85df-b2d2-42e1-b93b-db5583cbe919,611db715-60cf-4c72-afab-0aa844386f0e,801b5734-28ca-47e9-9040-61933de402fb,86326ef1-abc6-4643-9739-078a61279f27,91d54902-48ac-4aa1-8bfa-a3723713034f,a1caa5ff-31a0-42e7-9e39-2e0d45aa98b2,a5691e42-e7b0-4308-8c8a-425a113b2f35,a59019dd-6114-46f6-a7d8-bad9338f2332,b6bd2e74-6b09-4199-aca3-1141197c4ec1,b9736eb6-602b-4c48-ac2f-5609bf85a885,c49a7b74-07d8-4bd0-9b5e-b94a33a31dd0,d429fa79-e8be-405f-9968-48463d1e8b3d,e1598512-5f87-4d8a-99cc-0d9ddb217a87,e276f1ba-433c-4674-b547-0fe0a9ba5360,ec9de94f-ba40-480e-9dc1-373f3a58ef6a,feda2f3f-41eb-41f5-9f5a-1e2fa4197dbf;SUPTYPE=AL;SVLEN=-211;STRANDS=+-;STRANDS2=23,1,23,1;RE=24;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:24
chr21 10693543 14647_2 TGCATTCCATTCCATTCCATTCCATTCCATTC N . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr21;END=10693575;STD_quant_start=14.866069;STD_quant_stop=12.489996;Kurtosis_quant_start=0.091876;Kurtosis_quant_stop=0.790872;SVTYPE=DEL;RNAMES=452195e3-e47e-454c-a349-a4bc524a7342,56cb85df-b2d2-42e1-b93b-db5583cbe919,86326ef1-abc6-4643-9739-078a61279f27,91d54902-48ac-4aa1-8bfa-a3723713034f,a1caa5ff-31a0-42e7-9e39-2e0d45aa98b2,a5691e42-e7b0-4308-8c8a-425a113b2f35,b9736eb6-602b-4c48-ac2f-5609bf85a885,d429fa79-e8be-405f-9968-48463d1e8b3d,ec9de94f-ba40-480e-9dc1-373f3a58ef6a;SUPTYPE=AL;SVLEN=-32;STRANDS=+-;STRANDS2=9,0,9,0;RE=9;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:9
chr21 10698432 14647_3 TTCCATTCGGTTCCATTCCCTTTCATTCCATTTGAGTGCATTCCATTCCATTCCA N . PASS PRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr21;END=10698487;STD_quant_start=7.648529;STD_quant_stop=7.648529;Kurtosis_quant_start=2.384129;Kurtosis_quant_stop=2.502626;SVTYPE=DEL;RNAMES=1bae3244-8d9b-4956-bc62-e457ed383d7b,2626cb93-345e-4b8d-8b93-794616c97dbd,7301b428-1e58-4ab0-8ccd-99e4c8308538,7edb1f48-a6de-49a1-9922-4ab0f0a9aba6,86326ef1-abc6-4643-9739-078a61279f27,91d54902-48ac-4aa1-8bfa-a3723713034f,b38432c8-d49d-43fc-b473-b41d214c51fb,b4fe543d-6428-4813-a3d4-f64c9ef7924b,b9736eb6-602b-4c48-ac2f-5609bf85a885,d429fa79-e8be-405f-9968-48463d1e8b3d,efb464a0-41e7-466e-8b0b-7c16f4a6022c,f782eb3d-452c-47ad-9422-1a99307653a0;SUPTYPE=AL;SVLEN=-55;STRANDS=+-;STRANDS2=11,1,11,1;RE=12;REF_strand=6,0;AF=0.666667 GT:DR:DV 0/1:6:12
chr17_GL000205v2_random 10457 15771_0 N GGAATCCTGAGGACAAACTTCAGAACCTCTTGGTGTTCTGGAAGTATGTGAGGACACACACTCAGACCACCCTGCATGGTGATCTGGGAATCCTATGTGAGGACAAACACTCAGAACTGGCAAGTGTTT . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=10535;STD_quant_start=25.865034;STD_quant_stop=33.962053;Kurtosis_quant_start=-1.739023;Kurtosis_quant_stop=-1.970055;SVTYPE=INS;RNAMES=05063a45-faa9-4bf4-a645-f507a6e36c3a,0a1ac840-c184-4a69-8b38-b37d7a029860,12ce8e85-285f-4403-8fe1-00cf7c0d7556,1407abdf-75f0-43e0-9f08-5a75a361b5da,162b1e0a-be13-415d-b81d-5cbfa772a2ff,2dcc4af7-cd24-48cb-b558-ada1ea8d6ac0,2ff8db9e-196c-439e-96c4-bf2473b2a7eb,38a77f09-6542-4873-af1a-c21257d4e7f1,3b0c4ff0-02e6-4b1c-a4aa-034ee8652b6e,408a09e3-b964-411e-b610-fe9f0d36ee30,47148fbe-9cf0-4790-9267-717e3d2737ca,483315fc-b766-4882-9ee0-66c20bead3cd,547602ae-a9e0-448d-974e-11bf2cef8388,550d3f89-6ee1-4a99-bf58-dea799ce6bd0,5c7f3e80-aed4-4cce-9270-b12781cb38a5,658d4e03-3722-43d2-b521-11819398d728,6a7a2b71-1686-48fa-a948-b8dfe99120b9,71f1af9d-6c80-4b93-8218-5ca29e191206,7344e588-aab9-42b0-bf9e-541a26e0e87f,7cfef975-ea51-455c-af17-26cc31457f1f,814eb71b-e57a-4d8f-9ec4-7310dc04ec85,95151ad8-3044-4b00-9105-ad7212872f13,a47174ea-2040-44d2-8ce6-d026d567ed5a,a7b579be-2d5b-4da0-bdcd-3000373e407c,a7fa006f-8ec9-4265-8a56-bcbfc089f7ef,ad31ca45-4167-4868-af75-3d4121a6d961,b2b85933-489c-4d22-bbe9-a860ed13ea87,b7c391fc-c7f6-42fe-bd5f-d85a80e4cb0d,bf3e6968-fbcb-40a7-9599-70a3bca67680,c081090a-32ee-4a62-b6f0-27a788e3cc9c,ccf1c4de-5345-4c05-b705-1af342fffd82,cf5ced83-3913-4de0-be32-760ced460282,cf9a6ad0-62df-4cb7-84cc-55443981aa6a,cfa3c1f6-1fd4-4c0d-ab4f-157065a8bf24,de265d79-2b6f-46a9-9717-ba32a43c43b2,e45480ff-9537-411f-8994-6bb486ab98bc,e4b1f83c-8c0b-423e-9429-9e38c3f68031,f21bc955-308e-4923-8a9b-7305d810e6ff,f23b109f-684a-4ba8-b67d-59cd4938e42c,fd485805-2b49-46e3-9da8-9c1d58169969;SUPTYPE=AL;SVLEN=138;STRANDS=+-;STRANDS2=20,20,20,20;RE=38;REF_strand=17,17;AF=0.527778 GT:DR:DV 0/1:34:38
chr17_GL000205v2_random 11421 15771_1 N GTGTGTGAGGACAAAGACCAGACCCTGGTAGAAGTGGTACCTAAATCCT . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=11465;STD_quant_start=13.419274;STD_quant_stop=13.733732;Kurtosis_quant_start=-0.640978;Kurtosis_quant_stop=-0.347896;SVTYPE=INS;RNAMES=12ce8e85-285f-4403-8fe1-00cf7c0d7556,162b1e0a-be13-415d-b81d-5cbfa772a2ff,1922d7e7-1c23-467e-bcf6-186e0a8c0cbc,1a253b33-405f-4348-8da3-12862cb5cc9c,2da442b7-275e-4c1e-b3c3-ab199539e8bb,2dcc4af7-cd24-48cb-b558-ada1ea8d6ac0,2ff8db9e-196c-439e-96c4-bf2473b2a7eb,38a77f09-6542-4873-af1a-c21257d4e7f1,3b0c4ff0-02e6-4b1c-a4aa-034ee8652b6e,408a09e3-b964-411e-b610-fe9f0d36ee30,483315fc-b766-4882-9ee0-66c20bead3cd,4c6663eb-9ec5-4c6d-b83d-d38fa8ebf1e6,550d3f89-6ee1-4a99-bf58-dea799ce6bd0,5b40466e-c2f4-4eab-bb2f-756b3ac5dec2,5c7f3e80-aed4-4cce-9270-b12781cb38a5,6a7a2b71-1686-48fa-a948-b8dfe99120b9,7344e588-aab9-42b0-bf9e-541a26e0e87f,87f64880-49a1-49cc-84c8-917d5230eba1,95151ad8-3044-4b00-9105-ad7212872f13,a7fa006f-8ec9-4265-8a56-bcbfc089f7ef,ad31ca45-4167-4868-af75-3d4121a6d961,b2b85933-489c-4d22-bbe9-a860ed13ea87,b7c391fc-c7f6-42fe-bd5f-d85a80e4cb0d,e45480ff-9537-411f-8994-6bb486ab98bc,e4b1f83c-8c0b-423e-9429-9e38c3f68031,fd485805-2b49-46e3-9da8-9c1d58169969;SUPTYPE=AL;SVLEN=43;STRANDS=+-;STRANDS2=12,14,12,14;RE=26;REF_strand=3,2;AF=0.83871 GT:DR:DV 1/1:5:26
chr17_GL000205v2_random 12429 15771_2 TATGTGAGGGAGAAACATTCAGACAATCGTATCAGTGTTCCAGAATC N . PASS PRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=12476;STD_quant_start=3.146427;STD_quant_stop=3.114482;Kurtosis_quant_start=-0.912141;Kurtosis_quant_stop=-1.426809;SVTYPE=DEL;RNAMES=12ce8e85-285f-4403-8fe1-00cf7c0d7556,1922d7e7-1c23-467e-bcf6-186e0a8c0cbc,1a253b33-405f-4348-8da3-12862cb5cc9c,1c4ac883-fe26-4a43-bc7a-eccbc518ce1d,2da442b7-275e-4c1e-b3c3-ab199539e8bb,38a77f09-6542-4873-af1a-c21257d4e7f1,3b0c4ff0-02e6-4b1c-a4aa-034ee8652b6e,483315fc-b766-4882-9ee0-66c20bead3cd,4c6663eb-9ec5-4c6d-b83d-d38fa8ebf1e6,53a562b8-65b7-4af2-9e98-64537134a126,5c7f3e80-aed4-4cce-9270-b12781cb38a5,87f64880-49a1-49cc-84c8-917d5230eba1,95151ad8-3044-4b00-9105-ad7212872f13,a7fa006f-8ec9-4265-8a56-bcbfc089f7ef,b7c391fc-c7f6-42fe-bd5f-d85a80e4cb0d,d5f9abca-505d-44b8-bb58-08078498e0c8,e45480ff-9537-411f-8994-6bb486ab98bc,e4b1f83c-8c0b-423e-9429-9e38c3f68031,fd485805-2b49-46e3-9da8-9c1d58169969;SUPTYPE=AL;SVLEN=-47;STRANDS=+-;STRANDS2=10,9,10,9;RE=19;REF_strand=1,1;AF=0.904762 GT:DR:DV 1/1:2:19
chr17_GL000205v2_random 12955 15771_3 TTCTGTGTGAGGGACAAACTTTCAGATGCTCGTAGCAGTGTTCTGGAACTCTGTGTGAGGGACAAACTTTCAGACCCTCGTAGCAGTGTTCTGGAA N . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=13021;STD_quant_start=13.330416;STD_quant_stop=16.941074;Kurtosis_quant_start=5.622000;Kurtosis_quant_stop=2.409351;SVTYPE=DEL;RNAMES=05063a45-faa9-4bf4-a645-f507a6e36c3a,0a1ac840-c184-4a69-8b38-b37d7a029860,1407abdf-75f0-43e0-9f08-5a75a361b5da,162b1e0a-be13-415d-b81d-5cbfa772a2ff,2dcc4af7-cd24-48cb-b558-ada1ea8d6ac0,2ff8db9e-196c-439e-96c4-bf2473b2a7eb,483315fc-b766-4882-9ee0-66c20bead3cd,550d3f89-6ee1-4a99-bf58-dea799ce6bd0,6a7a2b71-1686-48fa-a948-b8dfe99120b9,7344e588-aab9-42b0-bf9e-541a26e0e87f,7cfef975-ea51-455c-af17-26cc31457f1f,ad31ca45-4167-4868-af75-3d4121a6d961,b2b85933-489c-4d22-bbe9-a860ed13ea87,cf5ced83-3913-4de0-be32-760ced460282;SUPTYPE=AL;SVLEN=-66;STRANDS=+-;STRANDS2=7,7,7,7;RE=14;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:14
chr17_GL000205v2_random 15571 15771_3 N <DEL> . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=36711;STD_quant_start=22.939720;STD_quant_stop=9.655608;Kurtosis_quant_start=-1.408083;Kurtosis_quant_stop=1.303410;SVTYPE=DEL;RNAMES=0a1ac840-c184-4a69-8b38-b37d7a029860,0b417bae-016c-4537-b764-209e8221f136,1019827c-3baf-4bb5-8c44-1869d8cef089,1f5c8f65-4482-4012-854a-0de0021e4a64,285c484f-7c10-4d36-95bc-ae4c3c4c2f97,2ff0109e-076b-4ec8-9164-fa4ba08f285f,3077497c-24a6-4a8b-95e6-cdbe320b3f05,42f5dba5-fb22-4d06-895d-eb714191ad45,4ae63d27-a1ca-4561-888d-0a7afe24c32c,5b40466e-c2f4-4eab-bb2f-756b3ac5dec2,5d0cf2fd-c579-4f80-9b22-38f2e85c8b35,640322d2-4c98-4a87-b00b-34f1e8f97197,7344e588-aab9-42b0-bf9e-541a26e0e87f,7c8356a9-5d69-4cc9-9643-8db39e7f0e67,7cfef975-ea51-455c-af17-26cc31457f1f,8f20eeb3-df8e-4698-8d86-d769432a686a,923d544d-e617-4d80-a01d-fe05501447af,9528c511-e164-40b0-b3fd-ade853a21023,b2b85933-489c-4d22-bbe9-a860ed13ea87,bd1da397-38b0-45db-b334-eb44e2b13f9b,d1e44c88-cfbb-41ff-ae67-9804d149a634,d2b44c7c-5265-439e-b471-c0e062a0cd7a,e52dbde6-19ba-4017-9cfe-316a65a4676a,ec5fcf48-ed61-4e29-b055-a45ad189d712,efa2c436-eb07-46c8-9c89-75b0787aaf3b,ff554813-bef9-4524-bea7-14b7c46cf312;SUPTYPE=SR;SVLEN=-21140;STRANDS=+-;STRANDS2=14,12,14,12;RE=26;REF_strand=6,16;AF=0.541667 GT:DR:DV 0/1:22:26
chr17_GL000205v2_random 15736 15771_3 N ATGCCAAACATTTGTAACCCAGTAGCAGCGCTCTGGAATCCCAAGTGAGGA . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=15778;STD_quant_start=11.691878;STD_quant_stop=9.823441;Kurtosis_quant_start=-1.110318;Kurtosis_quant_stop=-0.924401;SVTYPE=INS;RNAMES=12ce8e85-285f-4403-8fe1-00cf7c0d7556,1922d7e7-1c23-467e-bcf6-186e0a8c0cbc,1a253b33-405f-4348-8da3-12862cb5cc9c,1c4ac883-fe26-4a43-bc7a-eccbc518ce1d,2da442b7-275e-4c1e-b3c3-ab199539e8bb,483315fc-b766-4882-9ee0-66c20bead3cd,4c6663eb-9ec5-4c6d-b83d-d38fa8ebf1e6,53a562b8-65b7-4af2-9e98-64537134a126,5c7f3e80-aed4-4cce-9270-b12781cb38a5,87f64880-49a1-49cc-84c8-917d5230eba1,a7fa006f-8ec9-4265-8a56-bcbfc089f7ef,b7c391fc-c7f6-42fe-bd5f-d85a80e4cb0d,ce039ff5-418a-47e8-a8d5-efbaa0c3057b,d5f9abca-505d-44b8-bb58-08078498e0c8,e45480ff-9537-411f-8994-6bb486ab98bc;SUPTYPE=AL;SVLEN=44;STRANDS=+-;STRANDS2=9,6,9,6;RE=15;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:15
chr17_GL000205v2_random 19669 15771_4 TCAACCATTCAGACAACAGCAGTAGTGTTCTGCAAGCCTATAAGAGGGAAAAACATTCAGACAACAGCAGGAGTGATCTGGAATCCCATCTGAGGAACAAACATTCAGACCACAGCTGTGGTGTTCTGGAATAGTATGTGAGGGCCAAACACTGAGAACCCAACAGCAGTGTTCAGGAATACTAAGTGAGGGACAAACATTCTGACCACAGCAGGAGTGTCCTGGAATCCTATGTGGGGTAGAATAATTCAGACCCTCGTAGCAGTGTTCTGGAATCCTATGAGAGGAACAAACATTCATACCCCAGTAGCAGTGTTCTAGAATCCTATTTGAGGGACAAACACTCAGACAACAGCAGAAATGTTTTGGAATCATATGTGAGGGAGAAACATTCAGACCACAGCAGGACTGTTCTGGAATCCCATGTGAGGCACAAACACCCAGACCACAGCAGGCGTGTTCTGGAATCCTATGTGAGGGTCAAACATTCAGACCACAGCAGTAGTCTTCTGGAATCCTAGATGAGGGACAAATATTCAGACCCCAGCAGTAGTGTTCAGGAATCCTACATGAGGGACAAACATTTAGAACCCAGTAGCATTGTTCTGGAATCCCATGTGAGGGACAATCATTCAGCCCACAGCTGGTGTGTTCTGGAATCCTATTTGTGGGACAAACATTCAGACCCTCGTAGCATTGTTCTGGAATCCTATGTGAGGAAAATACATTCAGAACACAGCAGGATTGTTCTTCAGTCCTATGTGTGAGGCAAACATTCATACCCTTGTAGCCGTGTTCTGGAGTCCTCTGTGGAGGGCTAACATTAAGACCCACGTAACAGTGTTCCGGCATCGTACGTGAGGATAAACACTCAGAACCCAACAGCAGTGTTCTGGAATCCTAAGTGAGGGACAAACATTAAGATGTCAATACCAGTGTTCTGGAATCCTATTTGAGGGACAAACACTCAGACACAGGAGGAATGTTTTGGAAGCCTATGCGAGGGAGAAACATTCAGACTACAGCAGGATTGTTCTGGAATACCATGTGCGGTAAAAACACACAGACCACAGCAGGATTGTTCTGGAATCCTATGTGAGGGTCAACCATTCAGACCACAGCATCAGGGTTCTGGAATCCTATATGAGTGACAAATATTCAGAACACAGCAAGAGAGTTCTGGGATCCTGTGTGTTGGACAAACTCTCAGAACCCAGCAGCAGTGTTCTGGAATCCTATTTAAGGGGGAAACACCCAGACCAGAGCAGGGATGTTCTGGAATCCTATGTGAGG N . PASS IMPRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=chr17_GL000205v2_random;END=20961;STD_quant_start=10.606602;STD_quant_stop=11.760102;Kurtosis_quant_start=1.632043;Kurtosis_quant_stop=1.397835;SVTYPE=DEL;RNAMES=1922d7e7-1c23-467e-bcf6-186e0a8c0cbc,1a253b33-405f-4348-8da3-12862cb5cc9c,1c4ac883-fe26-4a43-bc7a-eccbc518ce1d,25613d55-fc98-452a-9953-0d0becd23d48,483315fc-b766-4882-9ee0-66c20bead3cd,4c6663eb-9ec5-4c6d-b83d-d38fa8ebf1e6,53a562b8-65b7-4af2-9e98-64537134a126,5c7f3e80-aed4-4cce-9270-b12781cb38a5,87f64880-49a1-49cc-84c8-917d5230eba1,a56576bb-1705-4d12-8138-214f0ebb647e,b7c391fc-c7f6-42fe-bd5f-d85a80e4cb0d,ce039ff5-418a-47e8-a8d5-efbaa0c3057b,ce40e79f-80f4-4282-bf35-36545f425a4a,d5f9abca-505d-44b8-bb58-08078498e0c8,ddcd0ca9-eca8-4d45-ab28-71045aea3a3d,e45480ff-9537-411f-8994-6bb486ab98bc;SUPTYPE=AL,SR;SVLEN=-1292;STRANDS=+-;STRANDS2=9,7,9,7;RE=16;REF_strand=1,4;AF=0.761905 GT:DR:DV 0/1:5:16
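For anyone who wants to check their own output for repeated IDs like the ones above, a minimal Python sketch follows. It only counts the ID column (third field) of each record; the shortened records are stand-ins for the full lines in this post, and `duplicate_ids` is a hypothetical helper name.

```python
from collections import Counter

def duplicate_ids(vcf_lines):
    """Return {ID: count} for every ID that occurs more than once."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        # VCF is tab-separated; split() also copes with space-separated copies.
        fields = line.split(None, 3)
        counts[fields[2]] += 1
    return {vid: n for vid, n in counts.items() if n > 1}

# Shortened stand-ins for the records above (CHROM, POS, ID, rest).
records = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr21\t10692365\t14647_1\tC\tN",
    "chr21\t10692670\t14647_1\tC\tN",
    "chr21\t10693543\t14647_2\tT\tN",
    "chr21\t10698432\t14647_3\tT\tN",
    "chr17_GL000205v2_random\t12955\t15771_3\tT\tN",
    "chr17_GL000205v2_random\t15571\t15771_3\tN\t<DEL>",
    "chr17_GL000205v2_random\t15736\t15771_3\tN\tA",
]
print(duplicate_ids(records))
```

The same function accepts an open file handle, for example `duplicate_ids(open("test.vcf"))`, since it only iterates over lines.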
These records are events in the same cluster (i.e. likely to fall on the same haplotype), meaning that reads connect them. This is indicated by the shared ID with the suffixes _1, _2, ... Cheers, Fritz
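One way to see the "reads connect them" relationship in the output is to compare the RNAMES lists of two records. The Python sketch below does only that; it is not a description of how Sniffles clusters events internally, and the INFO strings and read names are shortened, hypothetical examples.

```python
def read_names(info):
    """Extract the set of read names from the RNAMES entry of an INFO string."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "RNAMES":
            return set(value.split(","))
    return set()  # no RNAMES entry present

def shared_reads(info_a, info_b):
    """Return the read names that support both records."""
    return read_names(info_a) & read_names(info_b)

# Hypothetical, shortened INFO strings; real read names are UUIDs.
info_a = "PRECISE;SVTYPE=DEL;RNAMES=read1,read2,read3;RE=3"
info_b = "IMPRECISE;SVTYPE=DEL;RNAMES=read2,read3,read4;RE=3"
print(sorted(shared_reads(info_a, info_b)))
```

A non-empty intersection means at least one read supports both events, which is consistent with them lying on the same haplotype.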
Thank you for the reply!
What I am having trouble interpreting is that SV cluster 14647 contains two records with the ID 14647_1, and SV cluster 15771 contains three records with the ID 15771_3.
Is it okay for me to assign unique numbers manually? For example, renaming the two 14647_1 records to 14647_1_1 and 14647_1_2.
Or should I delete all but one of them? For example, keep the first 15771_3 and delete the following two.
With regards Jinyoung
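The manual renaming proposed above could be sketched in Python as follows. This only shows the mechanics of making IDs unique; whether renaming (rather than removing records) is the right interpretation is exactly the open question here. `make_ids_unique` is a hypothetical helper, and it assumes the new names do not collide with IDs already in the file.

```python
from collections import Counter

def make_ids_unique(ids):
    """Append _1, _2, ... to every ID that occurs more than once.

    IDs that occur only once are returned unchanged; order is preserved.
    """
    totals = Counter(ids)   # how often each ID occurs overall
    seen = Counter()        # how often each duplicated ID has been seen so far
    result = []
    for vid in ids:
        if totals[vid] > 1:
            seen[vid] += 1
            result.append(f"{vid}_{seen[vid]}")
        else:
            result.append(vid)
    return result

print(make_ids_unique(["14647_1", "14647_1", "14647_2", "14647_3"]))
```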
Oh, I see what you mean. I will check it out. Sorry, I overlooked that. Cheers, Fritz
I examined one of the phased events in which different SVs were assigned the same ID.
I hope this helps you enhance Sniffles.
With regards Jinyoung
Thanks. Sorry, I haven't found time yet to improve it. It's high up on my list. Thanks, Fritz
Regarding this https://github.com/fritzsedlazeck/Sniffles/issues/120#issuecomment-449038837 ,
I found a section in the WhatsHap documentation that describes well how phased genotypes are represented. https://whatshap.readthedocs.io/en/latest/guide.html#representation-of-phasing-information-in-vcfs
It seems phased genotypes can be represented with either GT:PS or GT:HP.
GT:PS 0|1:group1 1|0:group1 0|1:group2 1|0:group2
GT:HP 0/1:group1-1,group1-2 0/1:group1-2,group1-1 0/1:group2-1,group2-2 0/1:group2-2,group2-1
I hope this information helps with the enhancement!
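As a small illustration of the GT:PS form above, here is a Python sketch that pairs a FORMAT string with a sample column and reports whether the genotype is phased and which phase set it belongs to. The helper names are hypothetical, and "group1" is the placeholder value from the example above; in real VCF files PS is an integer.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with sample values, e.g. 'GT:PS' and '0|1:group1'."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))

def phase_info(format_field, sample_field):
    """Return (is_phased, phase_set) for one sample column.

    A genotype counts as phased here when GT uses '|'. PS is returned
    only if it is present and not the missing value '.'.
    """
    sample = parse_sample(format_field, sample_field)
    gt = sample.get("GT", "")
    ps = sample.get("PS")
    if ps == ".":
        ps = None
    return "|" in gt, ps

print(phase_info("GT:PS", "0|1:group1"))
print(phase_info("GT:DR:DV", "1/1:0:15"))
```

Two records on the same haplotype would then share the PS value and the allele order in GT, instead of sharing an ID prefix.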
FYI: genotype-based SV phasing is problematic because the constant-ploidy assumption underpinning the GT field is violated for SVs, and because inter-chromosomal events cannot be phased at all.
VCFv4.4 (currently in draft form) will introduce the PSL (Phase Set List) field, which will enable (cis) phasing information to be reported explicitly, even for inter-chromosomal events.
See https://github.com/samtools/hts-specs/pull/421 for the current design. Feedback from potential implementors is welcome (and encouraged).
Hello, @fritzsedlazeck
I'm having trouble using the --cluster option of Sniffles.
I can't find the variant phasing information in the VCF and BEDPE files.
the command I used is as follows
I compared the output VCF files with and without the --cluster option, but they were exactly the same.
As far as I know, slashes (/) are used in unphased genotypes and pipes (|) in phased genotypes, such as 0/1 and 0|1.
All the genotypes were in the unphased format.
Am I missing something? If so, it would be really helpful if you could explain where to find the variant phase information and how to interpret it.
Thank you
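The slash-versus-pipe check described above can be run over a whole file with a short Python sketch like the one below. It looks only at the first sample column and assumes a standard ten-column record layout; `count_phasing` and the example records are hypothetical.

```python
def count_phasing(vcf_lines):
    """Count phased ('|') and unphased genotypes in the first sample column."""
    counts = {"phased": 0, "unphased": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        if len(fields) < 10:
            continue  # record has no sample column
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt = fields[9].split(":")[keys.index("GT")]
        # Anything without '|' (including '/' and haploid calls) counts as unphased.
        counts["phased" if "|" in gt else "unphased"] += 1
    return counts

# Hypothetical, shortened records for illustration.
example = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE",
    "chr1 100 1_0 N <DEL> . PASS SVTYPE=DEL GT:DR:DV 0/1:6:12",
    "chr1 200 1_1 N <DEL> . PASS SVTYPE=DEL GT:DR:DV 1/1:0:15",
    "chr1 300 2 N <INS> . PASS SVTYPE=INS GT:PS 0|1:300",
]
print(count_phasing(example))
```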