brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

list of impactful vocab #100

Closed team-tomato-salad closed 2 years ago

team-tomato-salad commented 3 years ago

Hi,

I am following up about the adjusted impactful txt file. Thank you so much for your fast reply last time. This time I was trying to get only "missense" variants. This is what adjusted-order.txt looks like:

missense

IMPACT_CUTOFF is a special value that slivar uses to set INFO.impactful for any variant

However, 0 missense variants were recognized. I noticed that some of the missense variants in the CSQ column of the pre-filter VCF look like this:

missense_variant&NMD_transcript_variant

missense_variant&splice_region_variant&NMD_transcript_variant

missense_variant

Previously, I did something similar with "frameshift" in adjusted-order.txt, and the resulting VCF did include variants such as frameshift_variant, stop_gained&frameshift_variant, etc. I'd like to know if there's anything I can do to fix this.

Thank you in advance!

brentp commented 3 years ago

Hi, you shouldn't need to mess with the order to do this. Instead, you can follow this: https://github.com/brentp/slivar/wiki/impactful#infohighest_impact_order

and use something like:

INFO.highest_impact_order == ImpactOrder.missense

to get only missense variants. Note that this will exclude variants where one transcript is missense and another has a higher impact (should be rare, but will happen). If you have trouble with this, then please post a VCF with a single variant that should be extracted but is not, along with the command you used, and I can help you debug rapidly.
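The caveat about a higher-impact transcript can be illustrated with a minimal Python sketch. This is not slivar code: `TOY_ORDER`, `highest_impact`, and `is_missense_only` are hypothetical names, and the ranking is a made-up, abbreviated list rather than slivar's actual order file. It only shows the general idea that a variant is kept when its most severe consequence across all transcripts is missense.

```python
# Toy severity ranking, most severe first. This is NOT slivar's real order
# file; the terms and their positions are assumptions for illustration only.
TOY_ORDER = ["frameshift", "stop_gained", "missense", "splice_region", "synonymous"]
RANK = {term: i for i, term in enumerate(TOY_ORDER)}


def highest_impact(transcript_terms):
    """Return the most severe term across the transcripts of one variant."""
    # Terms that are missing from the toy ranking sort after all known terms.
    return min(transcript_terms, key=lambda t: RANK.get(t, len(TOY_ORDER)))


def is_missense_only(transcript_terms):
    """Keep the variant only if its most severe consequence is missense."""
    return highest_impact(transcript_terms) == "missense"


# Missense is the most severe consequence here, so the variant is kept.
print(is_missense_only(["missense", "synonymous"]))   # True
# Another transcript carries a more severe consequence, so it is excluded.
print(is_missense_only(["missense", "stop_gained"]))  # False
```

In this sketch the second variant is dropped even though one transcript is missense, which is the rare case described above.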

brentp commented 2 years ago

Closing as resolved. Please add further info if you're still having issues.