Closed RichardCorbett closed 3 years ago
The simplest way that I can see is to extract those variants (e.g. using grep splice_region_variant ...) and then re-annotate using a combination of parameters:
-spliceRegionExonSize <int> : Set size for splice site region within exons. Default: 3 bases
-spliceRegionIntronMin <int> : Set minimum number of bases for splice site region within intron. Default: 3 bases
-spliceRegionIntronMax <int> : Set maximum number of bases for splice site region within intron. Default: 8 bases
So, you can re-annotate the extracted splice_region_variant variants using -spliceRegionIntronMin 0 and -spliceRegionIntronMax 0, and then filter for splice_region_variant again (i.e. use grep one more time). With the intronic part of the splice region set to zero bases, the variants that are still annotated as splice_region_variant are the ones that do NOT overlap with the intron (i.e. the ones that overlap with the exon side).
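The two grep steps can also be done on the ANN field directly, which avoids matching the term in header lines or in unrelated fields. Below is a minimal Python sketch of that idea, not an official snpEff tool. It assumes the standard ANN layout (comma-separated annotations, pipe-separated subfields, effect in the second subfield, impact in the third, multiple effects joined with &). The function names, the jar path and the file name splice_region.vcf are hypothetical; only the two -spliceRegionIntron options come from the list above.

```python
def has_effect(vcf_line, effect="splice_region_variant", impact=None):
    """Return True if any ANN entry on a VCF data line carries `effect`.

    If `impact` is given (e.g. "LOW"), the same ANN entry must also
    report that impact.
    """
    if vcf_line.startswith("#"):
        return False
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False
    for item in fields[7].split(";"):          # INFO column
        if not item.startswith("ANN="):
            continue
        for ann in item[4:].split(","):        # one entry per annotation
            parts = ann.split("|")
            if len(parts) < 3:
                continue
            effects = parts[1].split("&")      # combined effects use '&'
            if effect in effects and (impact is None or parts[2] == impact):
                return True
    return False


def filter_vcf(lines, effect="splice_region_variant", impact=None):
    """Keep header lines plus data lines that carry `effect`."""
    return [l for l in lines if l.startswith("#") or has_effect(l, effect, impact)]


def reannotate_command(jar="snpEff.jar", genome="hg19", vcf="splice_region.vcf"):
    """Build (but do not run) the re-annotation command line.

    The jar path, genome name and VCF name are placeholders.
    """
    return ["java", "-jar", jar,
            "-spliceRegionIntronMin", "0",
            "-spliceRegionIntronMax", "0",
            genome, vcf]
```

The intended use would be: run filter_vcf on the original annotated VCF, write the result to a file, run the command from reannotate_command on that file (snpEff writes the annotated VCF to standard output), and then run filter_vcf once more on the new output. Unlike a plain grep, the header lines are kept, so the intermediate file stays a valid VCF.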
I hope this helps.
Closing, feel free to reopen.
Perfect. Thanks @pcingola!
Hi folks,
I am using snpEff to annotate variants called from hg19-aligned short reads. I am relying heavily on the impact/effect information that is reported. One recent task I've been tackling is how to access the "LOW" impact variants that overlap with the coding regions. I've gone through each of the items in the link below to see which types of "LOW" impact variants might intersect with coding regions. Almost every variant type either clearly overlaps coding regions or clearly does not, but "splice_region_variant" appears to be listed for a group of situations that includes variants in exons and/or in introns. https://pcingola.github.io/SnpEff/se_inputoutput/#effect-prediction-details
Is there a way to parse the snpEff output to get only the LOW impact variants that overlap with the coding space of the annotations?