brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

How to filter for VEP plug-in annotations in INFO field #95

Closed seboyden closed 3 years ago

seboyden commented 3 years ago

I'm trying to filter for VEP plug-in splice prediction annotations in the INFO field, by e.g.

--info "INFO.SpliceAI_pred_DS_AG >= 0.5"

and I get this error:

[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:SpliceAI_pred_DS_AG for expression:INFO.SpliceAI_pred_DS_AG >= 0.5

The input VCF header notes these splice plug-in fields like this:

##SpliceRegion=SpliceRegion predictions
##MaxEntScan_alt=MaxEntScan alternate sequence score
##MaxEntScan_diff=MaxEntScan score difference
##MaxEntScan_ref=MaxEntScan reference sequence score
##SpliceAI_pred_DS_AG=SpliceAI predicted effect on splicing. Delta score for acceptor gain
##SpliceAI_pred_DS_AL=SpliceAI predicted effect on splicing. Delta score for acceptor loss
##SpliceAI_pred_DS_DG=SpliceAI predicted effect on splicing. Delta score for donor gain
##SpliceAI_pred_DS_DL=SpliceAI predicted effect on splicing. Delta score for donor loss
##ada_score=dbscSNV ADA score
##rf_score=dbscSNV RF score
##GeneSplicer=GeneSplicer predictions

as opposed to e.g. annotations from make-gnotate which describe their INFO field formatting like this:

##INFO=<ID=gnomad_popmax_af,Number=1,Type=Float,Description="field from from gnotate VCF">

I'm wondering how I can write my slivar expression to find and use these plug-in annotations. Or is this a problem with how the plug-ins format their header lines?

brentp commented 3 years ago

Hi Steven, I think those don't have their own fields in the INFO, right? They are packed into the single VEP annotation field (CSQ by default), so slivar has no INFO attribute with that name to look up. To use slivar like that, you'd have to split them out into separate INFO fields. The bcftools split-vep plugin can do this for you.
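
To make the point about the single VEP field concrete, here is a rough Python sketch of how a plug-in value sits inside it. The header line and the CSQ value are made up for illustration, and the helper names (`csq_field_names`, `max_csq_float`) are hypothetical, not part of slivar or bcftools. It assumes the usual VEP layout: subfields separated by `|`, one entry per transcript separated by `,`, with subfield names listed after `Format:` in the header description.

```python
# Made-up header and INFO value, for illustration only.
csq_header = (
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|SYMBOL|'
    'SpliceAI_pred_DS_AG|SpliceAI_pred_DS_AL">'
)
info_csq = "T|splice_acceptor_variant|GENE1|0.62|0.01,T|intron_variant|GENE1||"


def csq_field_names(header_line):
    """Return the subfield names listed after 'Format: ' in the CSQ header."""
    fmt = header_line.split("Format: ", 1)[1]
    return fmt.rstrip('">').split("|")


def max_csq_float(csq_value, names, field):
    """Return the largest value of `field` across transcripts, or None if absent."""
    idx = names.index(field)
    scores = []
    for entry in csq_value.split(","):  # one entry per transcript
        parts = entry.split("|")
        if parts[idx] != "":  # empty string means no value for this transcript
            scores.append(float(parts[idx]))
    return max(scores) if scores else None


names = csq_field_names(csq_header)
score = max_csq_float(info_csq, names, "SpliceAI_pred_DS_AG")
print(score, score is not None and score >= 0.5)
```

The score is only reachable by position inside the CSQ string, which is why `INFO.SpliceAI_pred_DS_AG` comes back as an unknown attribute. On a real VCF, split-vep does this extraction for you (as far as I know via its `-c`/`--columns` option, which can also set the type, e.g. `Float`), and once the value is written out as its own INFO tag the original `--info` expression should work. How to aggregate across transcripts (max is used here) is a choice you would need to make there as well.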