brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

missing DNMs from Strelka call #151

weizhu365 closed this issue 1 year ago

weizhu365 commented 1 year ago

I tried to use slivar to call de novo mutations (DNMs) from Strelka calls:

slivar expr --vcf output/slivar/strelka_t0334.tmp0.vcf.gz \
    --ped ped_files/t0334.ped --pass-only \
    --out-vcf output/slivar/strelka_t0334.dnm.vcf \
    --trio "denovo:kid.het && mom.hom_ref && dad.hom_ref \
        && kid.AB > 0.25 && kid.AB < 0.75 \
        && (mom.AD[1] + dad.AD[1]) <= 5 \
        && kid.GQ >= 20 && mom.GQ >= 20 && dad.GQ >= 20 \
        && kid.DP >= 30 && mom.DP >= 30 && dad.DP >= 30"

Several variants failed the slivar filter, for example:

chr1 243356830 . GA G 659 PASS CIGAR=1M1D;RU=A;REFREP=10;IDREP=9;MQ=60 GT:GQ:GQX:DPI:AD:ADF:ADR:FT:PL 0/0:136:136:57:47,0:24,0:23,0:PASS:0,139,999 0/0:184:184:75:63,0:33,0:30,0:PASS:0,187,999 0/1:435:27:70:28,38:19,20:9,18:PASS:670,0,432

From my manual inspection, I think this variant should pass. Could you tell me why it failed the filters?

Thanks,

Wei

brentp commented 1 year ago

Hi Wei, could you share a VCF with only this variant and the VCF header along with the ped file for these 3 samples?

weizhu365 commented 1 year ago

Many thanks for your prompt reply. Here are the input files you requested:

strelka_issue.zip

brentp commented 1 year ago

Hi Wei, when I run your variant, I see this warning:

[slivar] javascript error. this can some times happen when a field is missing.
error from duktape: unknown attribute:DP for expression:kid.het && mom.hom_ref && dad.hom_ref && kid.AB > 0.25 && kid.AB < 0.75 && (mom.AD[1] + dad.AD[1]) <= 5 && kid.GQ >= 20 && mom.GQ >= 20 && dad.GQ >= 20 && kid.DP >= 30 && mom.DP >= 30 && dad.DP >= 30

Note the "unknown attribute:DP" part.

slivar tries to warn in these cases because a field is sometimes missing from a few VCFs, but the expression cannot pass without that information. I changed your expression to:

denovo:kid.het && mom.hom_ref && dad.hom_ref && kid.AB > 0.25 && kid.AB < 0.75 && (mom.AD[1] + dad.AD[1]) <= 5 && kid.GQ >= 20 && mom.GQ >= 20 && dad.GQ >= 20 && kid.AD[0] + kid.AD[1] >= 30 && mom.AD[0] + mom.AD[1] >= 30 && dad.AD[0] + dad.AD[1] >= 30

and it passes as expected. Hope this helps. -Brent
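
To make the arithmetic behind the rewritten expression concrete, here is a minimal Python sketch that re-checks the same thresholds on the record quoted earlier in this thread. It is not how slivar evaluates expressions; the helper names (`parse_samples`, `allele_depths`, `passes_denovo`) are hypothetical, allele balance is taken as alt / (ref + alt), het is simplified to a literal `0/1` genotype, and the sample order (two parents, then kid) is inferred from the genotypes because the ped file is not shown here.

```python
# The record from the report above, whitespace-separated as quoted there.
RECORD = (
    "chr1 243356830 . GA G 659 PASS CIGAR=1M1D;RU=A;REFREP=10;IDREP=9;MQ=60 "
    "GT:GQ:GQX:DPI:AD:ADF:ADR:FT:PL "
    "0/0:136:136:57:47,0:24,0:23,0:PASS:0,139,999 "
    "0/0:184:184:75:63,0:33,0:30,0:PASS:0,187,999 "
    "0/1:435:27:70:28,38:19,20:9,18:PASS:670,0,432"
)


def parse_samples(record):
    """Return one {FORMAT key: value} dict per sample column."""
    fields = record.split()
    keys = fields[8].split(":")
    return [dict(zip(keys, col.split(":"))) for col in fields[9:]]


def allele_depths(sample):
    """AD as a list of ints: [ref, alt, ...]."""
    return [int(x) for x in sample["AD"].split(",")]


def passes_denovo(parent_a, parent_b, kid):
    """Same thresholds as the rewritten expression, with depth taken from AD."""
    trio = (parent_a, parent_b, kid)
    # Depth is AD[0] + AD[1] because this record has no DP field.
    if min(sum(allele_depths(s)[:2]) for s in trio) < 30:
        return False
    kid_ref, kid_alt = allele_depths(kid)[:2]
    ab = kid_alt / (kid_ref + kid_alt)  # assumed definition of allele balance
    return (
        kid["GT"] == "0/1"
        and parent_a["GT"] == "0/0"
        and parent_b["GT"] == "0/0"
        and 0.25 < ab < 0.75
        and allele_depths(parent_a)[1] + allele_depths(parent_b)[1] <= 5
        and all(int(s["GQ"]) >= 20 for s in trio)
    )


# Assumption: the two hom-ref columns are the parents, the het column is the kid.
parent_a, parent_b, kid = parse_samples(RECORD)
print("DP" in kid)                            # False: FORMAT has DPI, not DP
print(passes_denovo(parent_a, parent_b, kid))  # True
```

The AD-based depths here are 47, 63 and 66, so all three samples clear the threshold of 30 once the missing DP field is no longer referenced.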

weizhu365 commented 1 year ago

Thank you for the solution. I found that some variants in the same VCF file, which do have the DP attribute, passed the filters. It is odd that not all variants in the Strelka output have the same attributes.

brentp commented 1 year ago

From the header of your VCF, it looks like it uses DPI for indels and DP for other variants.
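
For anyone post-processing such a file outside slivar, one possible way to get a single depth value per sample is to fall back across fields. The sketch below is only an illustration under stated assumptions: `sample_depth` is a hypothetical helper, samples are plain dicts of FORMAT key to string value, and the fallback order (DP, then DPI, then the sum of the first two AD values) is a choice made here, not something slivar or Strelka prescribes. DPI and the AD sum can differ, as they do for the record in this thread.

```python
def sample_depth(sample, keys=("DP", "DPI")):
    """Return a depth for one sample, trying each key in order.

    Falls back to AD[0] + AD[1] when none of the keys is usable.
    Returns None when no depth can be derived at all.
    """
    for key in keys:
        value = sample.get(key)
        if value not in (None, "", "."):  # "." marks a missing VCF value
            return int(value)
    ad = sample.get("AD")
    if ad in (None, "", "."):
        return None
    parts = [p for p in ad.split(",")[:2] if p != "."]
    return sum(int(p) for p in parts) if parts else None


# Kid sample from the indel record in this thread: DPI is 70, AD sums to 66.
kid = {"GT": "0/1", "GQ": "435", "DPI": "70", "AD": "28,38"}
print(sample_depth(kid))           # 70, taken from DPI
print(sample_depth(kid, keys=()))  # 66, taken from AD[0] + AD[1]
```

Passing `keys=()` reproduces the AD-based depth used in the rewritten expression above.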

weizhu365 commented 1 year ago

That makes sense. Best,