brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

group expression on customized FORMAT field is not working #137

Closed lindakjcao closed 2 years ago

lindakjcao commented 2 years ago

Hi Brentp,

I tried to extract the DeNovo variants generated by the DRAGEN pipeline using slivar. I have an alias file ready, and the VCF record looks like this:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Father Kid Mom

chr1 10146 . AC A 63.33 PASS AC=3;AF=0.75;AN=4;DP=154;FS=6.532;MQ=18.47;MQRankSum=0.742;QD=3.33;ReadPosRankSum=0.742;SOR=2.07;proband=KId GT:AD:AF:DP:GQ:FT:F1R2:F2R1:PL:GP:PP:DN ./.:86,45:0.344:101:0:PASS:.:.:.:.:.:. 0/1:5,7:0.583:12:5:LOW_QUAL_INDEL:2,2:3,5:33,0,10:5.499,1.6716,14.251:35,0,279:. 1/1:1,6:0.857:7:14:LOW_QUAL_INDEL:0,2:1,4:63,17,0:30.506,13.853,0.18668:46,0,293:.

My command line is:

slivar expr \
  --vcf roband-variants.slivar.expr.vcf \
  --ped ../Fam.ped \
  --alias Fam.slivar.alias \
  --js filters.js \
  --group-expr "denovo:kid.DN == 'DeNovo'" \
  --out-vcf proband-variants.slivar.denovo.vcf

I ran the command and received an error like this:

slivar version: 0.1.5
3 samples matched in VCF and PED to be evaluated
[slivar] javascript error. this can some times happen when a field is missing. error from duktape: unknown attribute:DN for expression:kid.DN == DeNovo

[slivar] occured with variant:chr1 10146 . AC A 63.33 PASS AC=3;AF=0.75;AN=4;DP=154;FS=6.532;MQ=18.47;MQRankSum=0.742;QD=3.33;ReadPosRankSum=0.742;SOR=2.07;proband=Kid GT:AD:AF:DP:GQ:FT:F1R2:F2R1:PL:GP:PP:DN ./.:86,45:0.344:101:0:PASS:.:.:.:.:.:. 0/1:5,7:0.583:12:5:LOW_QUAL_INDEL:2,2:3,5:33,0,10:5.499,1.6716,14.251:35,0,279:. 1/1:1,6:0.857:7:14:LOW_QUAL_INDEL:0,2:1,4:63,17,0:30.506,13.853,0.18668:46,0,293:.

Any suggestions? Thank you.

Regards, Linda

brentp commented 2 years ago

Hi, if you have a true trio, just use --trio without the groups file (and with the pedigree file). I think you can use the same expression, just use --trio in place of --group-expr.

Otherwise, can you show the header for DN and the pedigree file you're using?

lindakjcao commented 2 years ago

Thanks for replying so quickly. Actually, the family is not always a trio; it can be a duo, quad, or singleton.

Here is the header for DN:

FORMAT=

FORMAT=

Here is the pedigree file:

Family_ID Individual_ID Paternal_ID Maternal_ID Sex Phenotype
F0002 Father 0 0 1 0
F0002 Mom 0 0 2 0
F0002 Kid Father Mom 2 0

Thank you, Linda

brentp commented 2 years ago

OK. Your pedigree file doesn't have any affected samples, but that's fine if you're not using --trio. By default, slivar doesn't expose string fields from the FORMAT; you can force it to do so using:

export SLIVAR_FORMAT_STRINGS=yes

in your shell. Then you should have access to DN.

lindakjcao commented 2 years ago

Just tried it; still the same error... Thank you.

brentp commented 2 years ago

Can you show the full command and full stderr again, including the export?

lindakjcao commented 2 years ago

The command line in my post is the full command line. The full stderr is very long and just repeats the same error shown in the post. The export is an empty VCF containing only the header lines.

brentp commented 2 years ago

Oh, you are using a very old version of slivar. Please update to the latest, as string access to format fields was added after v0.1.5.

lindakjcao commented 2 years ago

I see. Let me try a newer version. Thank you.

lindakjcao commented 2 years ago

Hi, I just tried 0.2.7. It reports more issues with the VCF, including the one from this post, and some records (multi-allelic sites) don't have DN in the FORMAT at all. This is easier to handle with a simple grep command that pulls out all DeNovo records. Thank you very much for looking into this so quickly; I really appreciate it! Your tool is a great application, and we all like using it. Regards, Linda
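A plain grep for DeNovo matches the string anywhere on the line, including a parent's column. As a rough alternative, the sketch below checks the DN value of one named sample only and skips records whose FORMAT has no DN. The function name filterDeNovo and the two example records are made up for illustration; the only assumption about the input is the standard tab-separated VCF layout (FORMAT in column 9, samples from column 10).

```javascript
// Keep header lines plus records where `sample` has DN equal to `wanted`.
// Assumes tab-separated VCF text: FORMAT is column 9, samples start at column 10.
function filterDeNovo(vcfText, sample, wanted = 'DeNovo') {
  let sampleCol = -1;
  const kept = [];
  for (const line of vcfText.split('\n')) {
    if (line === '') continue;
    if (line.startsWith('##')) { kept.push(line); continue; }
    const cols = line.split('\t');
    if (line.startsWith('#CHROM')) {
      // Sample names follow the nine fixed columns.
      sampleCol = cols.indexOf(sample, 9);
      if (sampleCol === -1) throw new Error('sample not in header: ' + sample);
      kept.push(line);
      continue;
    }
    if (sampleCol === -1) throw new Error('no #CHROM header before records');
    const dnIdx = cols[8].split(':').indexOf('DN');
    if (dnIdx === -1) continue; // this record's FORMAT has no DN
    const value = (cols[sampleCol] || '').split(':')[dnIdx];
    if (value === wanted) kept.push(line);
  }
  return kept.join('\n');
}

// Made-up two-record example (not taken from the VCF in this thread).
const example = [
  '##fileformat=VCFv4.2',
  ['#CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'Father', 'Kid', 'Mom'].join('\t'),
  ['chr1', '100', '.', 'A', 'T', '50', 'PASS', '.', 'GT:DN', '0/0:.', '0/1:DeNovo', '0/0:.'].join('\t'),
  ['chr1', '200', '.', 'G', 'C', '50', 'PASS', '.', 'GT', '0/1', '0/1', '0/0'].join('\t'),
].join('\n');

// Prints the two header lines and the chr1:100 record only.
console.log(filterDeNovo(example, 'Kid'));
```

The second record is dropped because its FORMAT has no DN, which is the multi-allelic case described above.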

brentp commented 2 years ago

If DN is absent, you can use something like:

--group-expr "denovo:('DN' in kid) && (kid.DN == 'DeNovo')"

lindakjcao commented 2 years ago

It's absent in kid as well... Thanks though!

brentp commented 2 years ago

Right, so the 'DN' in kid check tests for presence first, which avoids the attribute error when DN is missing.
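The guard can be illustrated outside slivar with plain JavaScript. This is only a sketch, not slivar's implementation: makeSample is a hypothetical helper that uses a Proxy to mimic a sample object that throws on unknown attributes, as the duktape error message in this thread suggests slivar's sample objects do.

```javascript
// Hypothetical stand-in for a slivar sample object: reading a field that
// was never set throws, mimicking the "unknown attribute" error.
function makeSample(fields) {
  return new Proxy(fields, {
    get(target, name) {
      if (typeof name === 'symbol' || name in target) {
        return target[name];
      }
      throw new Error('unknown attribute:' + String(name));
    },
  });
}

// The guarded expression: `'DN' in kid` is evaluated first, and `&&`
// short-circuits, so kid.DN is only read when the field exists.
function isDeNovo(kid) {
  return ('DN' in kid) && (kid.DN == 'DeNovo');
}

const withDN = makeSample({ GQ: 5, DN: 'DeNovo' });
const withoutDN = makeSample({ GQ: 5 });

console.log(isDeNovo(withDN));    // true
console.log(isDeNovo(withoutDN)); // false, and no error is thrown
```

Without the 'DN' in kid test, reading withoutDN.DN in this sketch throws, which corresponds to the failure reported at the top of the thread.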