genome / analysis-workflows

Open workflow definitions for genomic analysis from MGI at WUSM.
MIT License

filter_vcf_depth.cwl problems handling DNPs? #1081

Open zlskidmore opened 1 year ago

zlskidmore commented 1 year ago

Hey Guys,

I'm seeing the following error for a somatic_wgs run:

2022-11-20 11:58:36,779 cromwell-system-akka.dispatchers.engine-dispatcher-5918 INFO  - WorkflowManagerActor Workflow e1ad4b27-9f38-403d-adc6-a5ddd6112f9a failed (during ExecutingWorkflowState): Job filter_vcf_depth:NA:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
Check the content of stderr for potential additional information: /storage1/fs1/christophermaher/Active/maherlab/tmp/1360_PA_A/cwlruns/cromwell-executions/somatic_wgs.cwl/e1ad4b27-9f38-403d-adc6-a5ddd6112f9a/call-detect_variants/detect_variants_wgs.cwl/c5e1d716-8028-4fb4-b2e8-9c3ec6b8b7cc/call-filter_vcf/filter_vcf.cwl/bd37d80f-cd19-4717-845a-8b7df39e9378/call-filter_vcf_depth/execution/stderr.
 [First 300 bytes]:Traceback (most recent call last):
  File "/usr/bin/depth_filter.py", line 105, in <module>
    main()
  File "/usr/bin/depth_filter.py", line 93, in main
    elif(depth < args.minimum_depth):
TypeError: '<' not supported between instances of 'NoneType' and 'int'
Traceback (most recent call last):

2022-11-20 11:58:36,821 cromwell-system-akka.dispatchers.engine-dispatcher-5918 INFO  - WorkflowManagerActor WorkflowActor-e1ad4b27-9f38-403d-adc6-a5ddd6112f9a is in a terminal state: WorkflowFailedState

Execution appeared to stop at this variant:

chr2    160016665       .       AA      TT      .       PASS    AC=1;AF=0.25;AN=4;AS_FilterStatus=SITE;AS_SB_TABLE=158%2C166|2%2C1;DP=338;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=198,118;MMQ=60,60;MPOS=43;NALOD=2.0;NLOD=29.19;POPAF=6.0;TLOD=5.74;set=mutect;CSQ=TT|missense_variant|MODERATE|PLA2R1|ENSG00000153246|Transcript|ENST00000283243.12|protein_coding|9/30||ENST00000283243.12:c.1499_1500inv|ENSP00000283243.7:p.Ile500Lys|1706-1707|1499-1500|500|I/K|aTT/aAA|||-1||1|substitution|HGNC|HGNC:9042|YES||1|P3|CCDS33309.1|ENSP00000283243|Q13018||UPI00001AEA9D|||deleterious(0)|possibly_damaging(0.851)|Gene3D:3.10.100.10&Pfam_domain:PF00059&PROSITE_profiles:PS50041&hmmpanther:PTHR22803&hmmpanther:PTHR22803:SF74&SMART_domains:SM00034&Superfamily_domains:SSF56436&Conserved_Domains:cd00037|||||||||||||||||||||||||||||||||MLLSPSLLLLLLLGAPRGCAEGVAAALTPERLLEWQDKGIFVIQSESLKKCIQAGKSVLTLENCKQANKHMLWKWVSNHGLFNIGGSGCLGLNFSAPEQPLSLYECDSTLVSLRWRCNRKMITGPLQYSVQVAHDNTVVASRKYIHKWISYGSGGGDICEYLHKDLHTIKGNTHGMPCMFPFQYNHQWHHECTREGREDDLLWCATTSRYERDEKWGFCPDPTSAEVGCDTIWEKDLNSHICYQFNLLSSLSWSEAHSSCQMQGGTLLSITDETEENFIREHMSSKTVEVWMGLNQLDEHAGWQWSDGTPLNYLNWSPEVNFEPFVEDHCGTFSSFMPSAWRSRDCESTLPYICKKYLNHIDHEIVEKDAWKYYATHCEPGWNPYNRNCYKLQKEEKTWHEALRSCQADNSALIDITSLAEVEFLVTLLGDENASETWIGLSSNKIPVSFEWSNDSSVIFTNWHTLEPHIFPNRSQLCVSAEQSEGHWKVKNCEERLFYICKKAGHVLSDAESGCQEGWERHGGFCYKIDTVLRSFDQASSGYYCPPALVTITNRFEQAFITSLISSVVKMKDSYFWIALQDQNDTGEYTWKPVGQKPEPVQYTHWNTHQPRYSGGCVAMRGRHPLGRWEVKHCRHFKAMSLCKQPVENQEKAEYEERWPFHPCYLDWESEPGLASCFKVFHSEKVLMKRTWREAEAFCEEFGAHLASFAHIEEENFVNELLHSKFNWTEERQFWIGFNKRNPLNAGSWEWSDRTPVVSSFLDNTYFGEDARNCAVYKANKTLLPLHCGSKREWICKIPRDVKPKIPFWYQYDVPWLFYQDAEYLFHTFASEWLNFEFVCSWLHSDLLTIHSAHEQEFIHSKIKALSKYGASWWIGLQEERANDEFRWRDGTPVIYQNWDTGRERTVNNQSQRCGFISSITGLWGSEECSVSMPSICKRKKVWLIEKKKDTPKQHGTCPKGWLYFNYKCLLLNIPKDPSSWKNWTHAQHFCAEEGGTLVAIESEVEQAFITMNLFGQTTSVWIGLQNDDYETWLNGKPVVYSNWSPFDIINIPSHNTTEVQKHIPLCALLSSNPNFHFTGKWYFEDCGKEGYGFVCEKMQDTSGHGVNTSDMYPMPNTLEYGNRTYKIINANMTWYAAIKTCLMHKAQLVSITDQYHQSFLTVVLNRLGYAHWIGLFTTDNGLNFDWS
DGTKSSFTFWKDEESSLLGDCVFADSNGRWHSTACESFLQGAICHVPPETRQSEHPELCSETSIPWIKFKSNCYSFSTVLDSMSFEAAHEFCKKEGSNLLTIKDEAENAFLLEELFAFGSSVQMVWLNAQFDGNNETIKWFDGTPTDQSNWGIRKPDTDYFKPHHCVALRIPEGLWQLSPCQEKKGFICKMEADIHTAEALPEKGPSHSIIPLAVVLTLIVIVAICTLSFCIYKHNGGFFRRLAGFRNPYYPATNFSTVYLEENILISDLEKSDQ||||||||||,TT|missense_variant|MODERATE|PLA2R1|ENSG00000153246|Transcript|ENST00000392771.1|protein_coding|9/27||ENST00000392771.1:c.1499_1500inv|ENSP00000376524.1:p.Ile500Lys|1706-1707|1499-1500|500|I/K|aTT/aAA|||-1|||substitution|HGNC|HGNC:9042|||1|A2|CCDS42767.1|ENSP00000376524|Q13018||UPI0000208F9D|||deleterious(0)|possibly_damaging(0.587)|Gene3D:3.10.100.10&Pfam_domain:PF00059&PROSITE_profiles:PS50041&hmmpanther:PTHR22803&hmmpanther:PTHR22803:SF74&SMART_domains:SM00034&Superfamily_domains:SSF56436&Conserved_Domains:cd00037|||||||||||||||||||||||||||||||||MLLSPSLLLLLLLGAPRGCAEGVAAALTPERLLEWQDKGIFVIQSESLKKCIQAGKSVLTLENCKQANKHMLWKWVSNHGLFNIGGSGCLGLNFSAPEQPLSLYECDSTLVSLRWRCNRKMITGPLQYSVQVAHDNTVVASRKYIHKWISYGSGGGDICEYLHKDLHTIKGNTHGMPCMFPFQYNHQWHHECTREGREDDLLWCATTSRYERDEKWGFCPDPTSAEVGCDTIWEKDLNSHICYQFNLLSSLSWSEAHSSCQMQGGTLLSITDETEENFIREHMSSKTVEVWMGLNQLDEHAGWQWSDGTPLNYLNWSPEVNFEPFVEDHCGTFSSFMPSAWRSRDCESTLPYICKKYLNHIDHEIVEKDAWKYYATHCEPGWNPYNRNCYKLQKEEKTWHEALRSCQADNSALIDITSLAEVEFLVTLLGDENASETWIGLSSNKIPVSFEWSNDSSVIFTNWHTLEPHIFPNRSQLCVSAEQSEGHWKVKNCEERLFYICKKAGHVLSDAESGCQEGWERHGGFCYKIDTVLRSFDQASSGYYCPPALVTITNRFEQAFITSLISSVVKMKDSYFWIALQDQNDTGEYTWKPVGQKPEPVQYTHWNTHQPRYSGGCVAMRGRHPLGRWEVKHCRHFKAMSLCKQPVENQEKAEYEERWPFHPCYLDWESEPGLASCFKVFHSEKVLMKRTWREAEAFCEEFGAHLASFAHIEEENFVNELLHSKFNWTEERQFWIGFNKRNPLNAGSWEWSDRTPVVSSFLDNTYFGEDARNCAVYKANKTLLPLHCGSKREWICKIPRDVKPKIPFWYQYDVPWLFYQDAEYLFHTFASEWLNFEFVCSWLHSDLLTIHSAHEQEFIHSKIKALSKYGASWWIGLQEERANDEFRWRDGTPVIYQNWDTGRERTVNNQSQRCGFISSITGLWGSEECSVSMPSICKRKKVWLIEKKKDTPKQHGTCPKGWLYFNYKCLLLNIPKDPSSWKNWTHAQHFCAEEGGTLVAIESEVEQAFITMNLFGQTTSVWIGLQNDDYETWLNGKPVVYSNWSPFDIINIPSHNTTEVQKHIPLCALLSSNPNFHFTGKWYFEDCGKEGYGFVCEKMQDTSGHGVNTSDMYPMPNTLEYGNRTYKIINANMTWYAAIKTCLMHKAQLVSITDQYHQS
FLTVVLNRLGYAHWIGLFTTDNGLNFDWSDGTKSSFTFWKDEESSLLGDCVFADSNGRWHSTACESFLQGAICHVPPETRQSEHPELCSETSIPWIKFKSNCYSFSTVLDSMSFEAAHEFCKKEGSNLLTIKDEAENAFLLEELFAFGSSVQMVWLNAQFDGNSK|||||||||| GT:AD:AF:DP:F1R2:F2R1:FAD:SB:MQ0:MQ0FRAC
        0/1:187,3:0.029:190:74,2:54,0:133,3:92,95,2,1:0:0.0     0/0:137,0:0.009932:137:50,0:38,0:97,0:66,71,0,0:.:.     ./.:.:.:.:.:.:.:.:.:.

I see the depth field does have an integer value, so I'm a bit confused about why execution would stop here. Do you guys see something I'm not?

tagging @chrisamiller and @johnegarza

bryanfisk commented 1 year ago

It looks like the tumor and normal sample names that were supplied didn't match the sample names in the tumor and normal input BAMs. When the detect variants subworkflow merged the VCF outputs from the multiple variant callers, the results for the normal sample were split across two sample names in the VCF file, which left some variants with no value in the normal sample column that was being filtered.
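
For anyone hitting the same traceback, here is a minimal sketch of the failure mode, using the FORMAT and sample columns from the record posted above. It is not the code in depth_filter.py: the helpers `sample_depth` and `depth_status` are hypothetical, and the threshold of 20 is an assumed value, not the workflow's setting. It only shows that the third sample column has no DP value, so a comparison like `depth < minimum_depth` receives `None`.

```python
from typing import List, Optional

# FORMAT and sample columns copied from the chr2:160016665 record in this issue.
FORMAT = "GT:AD:AF:DP:F1R2:F2R1:FAD:SB:MQ0:MQ0FRAC"
SAMPLES = [
    "0/1:187,3:0.029:190:74,2:54,0:133,3:92,95,2,1:0:0.0",
    "0/0:137,0:0.009932:137:50,0:38,0:97,0:66,71,0,0:.:.",
    "./.:.:.:.:.:.:.:.:.:.",  # extra sample column with every field missing
]

MINIMUM_DEPTH = 20  # assumed threshold, for illustration only


def sample_depth(format_field: str, sample_field: str) -> Optional[int]:
    """Return the DP value for one sample column, or None if it is missing ('.')."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    raw = dict(zip(keys, values)).get("DP", ".")
    return None if raw == "." else int(raw)


def depth_status(depth: Optional[int], minimum_depth: int) -> str:
    """Classify a depth, checking for a missing value before comparing."""
    if depth is None:
        return "missing"
    return "low" if depth < minimum_depth else "ok"


depths: List[Optional[int]] = [sample_depth(FORMAT, s) for s in SAMPLES]
statuses = [depth_status(d, MINIMUM_DEPTH) for d in depths]

for sample, depth, status in zip(SAMPLES, depths, statuses):
    print(sample.split(":")[0], depth, status)
```

Whether a missing depth should be reported, skipped, or treated as a hard error is a design choice for the filter. Here the root cause was the sample-name mismatch, so correcting the names supplied to the workflow is the actual fix, and a guard like this would only make the failure easier to diagnose.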