mkirsche / Jasmine

Jasmine: SV Merging Across Samples

Corrupted VCF when first VCF is empty #20

Closed wdecoster closed 3 years ago

wdecoster commented 3 years ago

Hi,

I reproducibly get corrupted VCF headers (example below). This baffled me for a while, but I think I found a clue: if the first VCF in the file_list argument has no variants (but an otherwise intact header), the merged file from Jasmine ends up corrupted.

These files come from cuteSV SV calling and have been processed with Iris. Some files can be empty because I am only calling variants in a small genomic locus.
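
For anyone who wants to check whether their own run matches this trigger, here is a minimal Python sketch that reports which VCFs in a file list contain no variant records. It is not part of Jasmine; the function names are hypothetical, and it assumes the file list holds one VCF path per line and that a "variant record" is any non-blank line that does not start with `#`.

```python
import gzip
from pathlib import Path


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def has_variant_records(path):
    """Return True if the VCF has at least one non-header, non-blank line."""
    with open_vcf(path) as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                return True
    return False


def empty_vcfs_in_file_list(file_list_path):
    """Return (position, path) pairs for listed VCFs with no variant records.

    Assumption: the file list contains one VCF path per line.
    Positions are 0-based, so position 0 is the first VCF in the list.
    """
    with open(file_list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    empties = []
    for position, vcf_path in enumerate(paths):
        if not has_variant_records(vcf_path):
            empties.append((position, vcf_path))
    return empties
```

If the returned list contains an entry with position 0, the run matches the situation described here: the first VCF in the list has a header but no variants.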

##FORMAT=<ID=GT,Number=1,Type=String,Description="The genotype of the variant">
##FORMAT=<ID=IS,Number=1,Type=String,Description="Whether or not the variant call was marked as specific due to high read support and length">
##FORMAT=<ID=OT,Number=1,Type=String,Description="The original type of the variant">
##FORMAT=<ID=DV,Number=1,Type=String,Description="The number of reads supporting the variant sequence">
##FORMAT=<ID=DR,Number=1,Type=String,Description="The number of reads supporting the reference sequence">
chr11   19197368        1_cuteSV.DEL.0  CCCTCCCTCCCTCCCTCCCTCCCTTCCTTCCTTCCTTCCTTCCTTC  C       162.5   PASS    PRECISE;SVTYPE=DEL;SVLEN=-45;END=19197413;CIPOS=-1,1;CILEN=-1,1;RE=24;RNAMES=03f165e6-ce6e-4878-8dad-ded5f3e01672,0d6393b1-1e0a-44ba-a870-6d2f1aa0faff,c8d946f1-41fa-4861-a416-38cee5dd0c6b,51639ac1-a885-4d04-b695-9090d48253f4,2135924b-f991-4fdf-80f7-e501d5c53a08,0e220f01-b9fb-41e6-8934-8fbdd263dbfc,813c72e8-b1cf-4a73-baff-044c15914da7,b08bc8ca-7dbd-4118-b188-672595fe18c8,b153c406-3511-44fd-822c-992e8679ddf4,d58f6f58-a198-4b90-9b94-8d7c3a14f243,e48aaa29-7fc2-4dbb-994d-6444d84d8ed9,486f270f-4169-4421-a8bc-7764f7829a0d,15e699b4-90c5-4b0a-869d-0fb88a8c0255,f3ead806-bcf1-405b-9067-0a8c6336c0a2,835bb96f-ff70-4f0a-bb4d-b0ca828cfc32,11504736-369a-493e-a239-54aaf176b140,165bbeab-666f-4fee-b57e-60338adb0892,19ba3300-9cf7-476d-ad8f-f8b8f3c119d6,0d806d4c-7696-495a-9098-d7858320850c,6dfeb9e0-8664-4d56-bc00-caabfd453e88,cc6ca9fe-e42a-4bc8-98b8-b76f76e53c00,27356926-be53-4540-b3e4-632db18112ab,6ffca776-54bb-42b2-9b91-bb8481eb6022,17f570c0-7393-43cb-a007-f99d6303ac34;STRAND=+-;IRIS_PROCESSED=1;IRIS_REFINED=0;STARTVARIANCE=0.000000;ENDVARIANCE=0.000000;AVG_LEN=-45.000000;AVG_START=19197368.000000;AVG_END=19197413.000000;SUPP_VEC_EXT=010000000000;IDLIST_EXT=cuteSV.DEL.0;SUPP_EXT=1;SUPP_VEC=010000000000;SUPP=1;SVMETHOD=JASMINE;IDLIST=cuteSV.DEL.0  GT:IS:OT:DV:DR  ./.:NA:NA:NA:NA 1/1:.:DEL:24:7  ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA
chr11   19073317        2_cuteSV.DEL.0  AATATTGATTGTGCACTCCTTGCTACTTGATAGGCCTTGTTTTAGGTGCTGGAACTGGCACACTGGGTGAGGTCTGAAATTCAGCTCCTGACTCCTGGCCCAGTGCTCTTCCCAACAGATTCCCATCCCCACCTACCTCACCCAGAGCCCAGCCTCCTGGATATAGGTCAGGGAAGCAGTGCCCAGAGCACCACCTTCCCCTCAGGACCCTGATAGGATATTGCCTGCAGCATTTTCCAGCAGGCACAAAGGGCTGAGAGGTACATCTTTGCAGTTTCCTTCTTCCCCTAGGACATTCTGGGAACCACAGGGATATGCACTACCCTTTCCTTCCTTTTGCAGGGTGGCGACAGGGACCAGGGCCACACTCACTGGGCCTAGGGGCAACCCTGGTTCAGACACTCCAGGCTACATGACCTCAGATGACTCCTGTAACCCCCTGAAGCCTTGGTTTCCTCATCTCTAAAAAGGTGATAGTGATACAGGAACTAGAAAGAAATTATTTAGGCAGATAGTGAGGGTAAGAGAGTCCTCAGTAAGGTTTCCTTTTAATAAAAAGCAGCCCCTAACTTGTTTCTTTTCTAAGAAAAAGCAACCTGAAAAATCAAGCTGCAAGCATAGATAAGCAAGCTAAAAGCTCACATAGGTAAATACTGGCAGCTGTGGCAATAGAAAAGCGATATCTGGAAGCCAGGTATATTCAACACGGAGGTTCCCTCTTCCCTTTCCTTTGTCACCACATGTGCAGTAAAAAGCAGGCAACATGGCAC      A       32.8    PASS    PRECISE;SVTYPE=DEL;SVLEN=-769;END=19074086;CIPOS=-1,1;CILEN=-1,1;RE=12;RNAMES=fa78ce1b-eb7f-4e62-8d44-5b9754533442,b2f18d6a-be6b-4d31-b387-f35ec9d1b1ef,361eab6d-7051-4345-9d3b-016d9c7b8ce6,28acb24d-a2ea-43f2-bef9-430e35f2f51d,812548f4-be03-4154-827d-a7e1aa0da954,871a0736-f0a6-44c9-a1b4-30fbc4b91a29,b8e0fd9e-717b-472a-a2c3-103e619bf385,be441563-02c0-42bd-b357-f7b6c48b414f,d412fbe0-38c4-4f87-9b93-3238a245a209,e7a2525e-697d-4e3d-8de6-4765e758350e,dca637c4-dec6-4d93-8e1a-bd16b9fe6ada,275d9909-9f80-40d2-8433-9666dc0111b5;STRAND=+-;IRIS_PROCESSED=1;IRIS_REFINED=0;STARTVARIANCE=0.000000;ENDVARIANCE=0.000000;AVG_LEN=-769.000000;AVG_START=19073317.000000;AVG_END=19074086.000000;SUPP_VEC_EXT=001000000000;IDLIST_EXT=cuteSV.DEL.0;SUPP_EXT=1;SUPP_VEC=001000000000;SUPP=1;SVMETHOD=JASMINE;IDLIST=cuteSV.DEL.0    GT:IS:OT:DV:DR  ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA 0/1:.:DEL:12:20 ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA ./.:NA:NA:NA:NA
chr11   19133944        1_cuteSV.INS.0  C       CATATATATATATATATATATATATATACAC 297.1   PASS    PRECISE;SVTYPE=INS;SVLEN=30;END=19133944;CIPOS=-1,1;CILEN=-0,0;RE=52;RNAMES=02221b02-eee5-4518-8adc-2e8565f7eb9e,11f677cd-0547-4542-97d3-b4757e5f1d8e,1a2bee2a-f5f5-4a86-a8a7-fa2483180653,1a54c7c4-b281-4c73-8390-24948a4cd3ab,25fb91e5-75c3-443f-91f9-5ce182ba574f,39889763-bb2f-45f4-bc67-7850a5ebcf9d,3a52eca4-1304-4d2b-a6a1-d9df904eba2e,3de5150f-f6bc-49f5-a91d-051dee11fdd7,40687879-70ee-4551-8919-ea8e69489405,45f8979d-d899-414b-a231-33c747cb275b,515f3861-581c-45bd-9ffe-d27dab099f92,54f72ab5-7dd0-4aa6-aa07-7563b7044a4f,5998cad6-0c12-4d3a-89eb-3bc0fad178b1,5a0c6cc2-acac-4f09-af2b-4fa25a06f2b4,5cc6bf1e-0ae4-419b-bcbb-e35d4239f8f7,5ce9bedd-f37a-4a25-9078-b3c3d560b787,6dcbbf51-96cf-491a-970c-b1e154e68e92,7121883b-9249-46df-9e19-76829fea8a71,87c091be-3b97-4158-b4a2-cff359848917,908fe6fa-a765-40bc-8ee5-9844dd143ad6,92f26169-3026-41d3-af72-5da0a7f88417,9c36ffbe-94f3-4a1a-999d-ed16d13ded35,c4ce6b77-ada9-469e-a2ca-7b62c57eae0e,c966d4be-6178-4658-95e2-2c85b5927aa6,cc04f9cd-184d-4936-a29b-e07662c2388a,cc6ca9fe-e42a-4bc8-98b8-b76f76e53c00,d40b94d0-c0ad-4718-bfc1-388163b717ad,d7edc355-ffdd-490b-bfb8-571ce32f33e6,da307122-9d3b-443c-a376-798165478e7a,db392384-f53b-44fc-b09c-9f6ccb21027a,eae91963-f8ad-492b-bc13-2e25b3890d76,276106ab-4795-4df8-b33c-bde16f088bd5,517e7258-ea50-46f2-a82f-8a17be1d8765,6e39f1ad-4d8c-44d9-bfd8-4e3e6672f38d,0fb60110-3956-4933-8fcd-82063cecd52b,946e9c4f-dbfa-4849-a65e-c05fef867a95,1e8dd61d-c7b8-431f-bd74-3986e9d7c568,418b6332-60d0-499c-abad-5e0da48795b8,5d5e2f00-f2e8-4371-bbb1-e48ac35dd242,6dec7bc8-269e-46d4-b94d-907eb8109b0e,8722574f-78aa-4605-9029-90767c859c75,afa118d5-a7bb-4f92-aa34-6eaf04d6bdc7,c99bdad8-dec2-4316-aa4e-3d587328f14b,af7f2a7b-2997-41ce-ac39-096ba0c7c170,26b82949-0260-4d91-951f-e191e7724706,4da6b0e0-6b85-4762-95b1-09466732780f,8b67bbad-8d4a-4689-a537-826c18330ea7,ccf9fae1-10f0-452f-8479-fd1b154a0d34,8856fa2b-a0b5-4fc3-8abd-3c9ae1be02f5,389a4e9c-55bc-4
78e-a190-e67d1fb0189e,a512ef9b-e691-4cc6-bd67-9ed839b7c595,16eccd79-5064-4155-829e-ab0779e017ac;IRIS_PROCESSED=1;IRIS_REFINED=1;STARTVARIANCE=0.000000;ENDVARIANCE=0.000000;AVG_LEN=31.900000;AVG_START=19133944.000000;AVG_END=19133944.000000;SUPP_VEC_EXT=010111111111;IDLIST_EXT=cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0;SUPP_EXT=10;SUPP_VEC=010111111111;SUPP=10;SVMETHOD=JASMINE;IDLIST=cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0,cuteSV.INS.0      GT:IS:OT:DV:DR  ./.:NA:NA:NA:NA 0/1:.:INS:52:26 ./.:NA:NA:NA:NA 0/0:.:INS:13:40 0/1:.:INS:18:31 0/1:.:INS:16:14 0/0:.:INS:10:31 0/1:.:INS:10:22 0/0:.:INS:10:31 0/1:.:INS:14:25 0/1:.:INS:18:36 0/1:.:INS:11:13
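
In the excerpt above, the `##FORMAT` lines are followed directly by variant records, with no `#CHROM` column header line in between, which the VCF format requires. A small Python sketch for detecting that symptom in a merged file is shown below. The function name is hypothetical and the check is deliberately narrow: it only tests for the missing column header and is not a full VCF validator.

```python
def column_header_precedes_records(lines):
    """Return True if a '#CHROM' line appears before the first record line.

    `lines` is any iterable of text lines from a VCF. A file with no
    record lines passes only if it still contains a '#CHROM' line.
    """
    seen_column_header = False
    for line in lines:
        if line.startswith("#CHROM"):
            seen_column_header = True
        elif line.strip() and not line.startswith("#"):
            # First variant record: the column header must already be present.
            return seen_column_header
    return seen_column_header
```

To run it on a file, pass an open handle, for example `column_header_precedes_records(open("merged.vcf"))`, where `merged.vcf` stands in for the actual output path.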

Cheers, Wouter

mkirsche commented 3 years ago

Hi Wouter,

Thank you for bringing this to my attention! I have resolved this issue in the latest commit and it will be a part of the next release.

Thanks! Melanie

wdecoster commented 3 years ago

Hi Melanie,

Thank you! Do you intend to make this release soon, or do you recommend installing from the repo to work around this?

Cheers, Wouter

mkirsche commented 3 years ago

Hi Wouter,

I just made the new release, so it should be available via bioconda within the next day or so.

Best, Melanie

mkirsche commented 3 years ago

Hi Wouter,

I'm going to go ahead and close this issue since the updated version is now live on bioconda. Please feel free to reach out if you continue to have problems when using version 1.1.3!

Thanks, Melanie