Closed chapmanb closed 6 years ago
Is it always the literal <NON_REF>? If so, perhaps that could be a special case where any alternate allele will match (including, for example, a variant that was T -> TC for your example above).
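To pin down what the wildcard would mean, here is a minimal sketch in Python (not vcfanno's Go code). The function name `alts_match` and the list-of-strings representation are assumptions made for illustration only; the sketch covers just the ALT comparison and leaves position and REF handling aside.

```python
NON_REF = "<NON_REF>"


def alts_match(query_alts, db_alts):
    """Decide whether a query record's ALTs are compatible with an annotation's ALTs.

    Hypothetical helper: a query ALT of <NON_REF> acts as a wildcard and
    matches any annotation ALT. Otherwise at least one ALT must be shared.
    """
    if NON_REF in query_alts:
        return True
    return bool(set(query_alts) & set(db_alts))


# A reference call with ALT <NON_REF> matches an insertion ALT such as TC.
print(alts_match(["<NON_REF>"], ["TC"]))
# An ordinary call still needs a shared ALT.
print(alts_match(["A"], ["C"]))
```

With this rule, ordinary variant calls keep their strict ALT comparison, and only records carrying <NON_REF> are matched more loosely.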
-permissive-overlap is doing exactly what it should in the case you describe. I wasn't sure if your "confusingly" was referring to the behavior of vcfanno or the appearance of the result to the user.
Brent;
Having <NON_REF> be an ALT wildcard would work great. Matching to same-position insertions works okay; it's mainly the same-position deletions that are off, due to the padding bases used in the VCF representation.
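The padding issue can be made concrete with a small Python sketch. The helper names and the coordinates (position 100, REF TC, ALT T) are made up for illustration and are not taken from vcfanno or from the records in this issue. The sketch only trims shared leading bases, which is enough to show the effect for a simple left-padded deletion.

```python
def affected_interval(pos, ref, alt):
    """Return the 1-based inclusive (start, end) of reference bases actually changed.

    Shared leading bases (the VCF padding base for indels) are trimmed first.
    For a pure insertion no reference base is changed, so end == start - 1.
    """
    i = 0
    while i < len(ref) and i < len(alt) and ref[i] == alt[i]:
        i += 1
    return pos + i, pos + len(ref) - 1


def padded_overlap(query_pos, pos, ref):
    """Overlap test on the record as written, padding base included."""
    return pos <= query_pos <= pos + len(ref) - 1


def trimmed_overlap(query_pos, pos, ref, alt):
    """Overlap test on the bases the variant really touches."""
    start, end = affected_interval(pos, ref, alt)
    return start <= query_pos <= end


# Hypothetical deletion written as POS=100, REF=TC, ALT=T: only base 101 is deleted.
print(affected_interval(100, "TC", "T"))     # (101, 101)
print(padded_overlap(100, 100, "TC"))        # True: the padding base sits at 100
print(trimmed_overlap(100, 100, "TC", "T"))  # False: position 100 is untouched
```

A reference call at position 100 overlaps the deletion record as written, but not the base that is actually deleted, which is why a purely positional match looks off for deletions.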
Sorry for not writing clearly above. vcfanno -permissive-overlap is doing the right thing; it's just that the outcomes are confusing/misleading when not considering the REF allele.
Thanks again for considering this.
I have a simple change that makes this the default. I'm doing some testing to make sure it doesn't break anything and will make a new release when I'm sure it doesn't.
this seems to be working. there's an impending release of go that gives about a 3-4% performance improvement over the current version. I'll wait for that to release the next vcfanno version.
Brent; Awesome, thanks so much. I'll roll a new bioconda package and test as soon as the more flexible and faster vcfanno gets released. Thanks again.
this is fixed in latest release. v0.3.0
Brilliant -- thanks so much Brent. I've updated the bioconda recipe so this is now available there. Much appreciated.
Brent; @cariaso and I have been working on annotating all position reference based calls with dbSNP rs IDs using vcfanno. We're starting with gVCFs from GATK4 called using --emit-ref-confidence BP_RESOLUTION, which give outputs at reference 0/0 positions that look like:
When running vcfanno with a dbSNP VCF, none of the reference calls get annotated with the rsIDs because the ALTs don't match with the NON_REF. We'd like to be able to associate with SNP positions even when we don't have calls there.
To do this, we swapped to using -permissive-overlap, which mostly works but also confusingly annotates at deletions like:
since the positions overlap but the non-padded source of the deletion does not.
Have you run into this issue and have any advice/suggestions for how best to use vcfanno? We had thought of an approach to allow a new matching criterion with position + REF only, which Mike named -slightly-permissive-overlap. What do you think about that approach? Any other ideas for how to better accomplish what we're trying to do? Thanks much.
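As a rough illustration of the proposed position + REF criterion, here is a short Python sketch. The function name `pos_ref_match`, the tuple layout, and the example records are hypothetical and do not come from vcfanno or from real dbSNP entries.

```python
def pos_ref_match(query, db):
    """Match two records on position and REF allele only, ignoring ALT.

    Each record is a hypothetical (pos, ref, alts) tuple.
    """
    q_pos, q_ref, _ = query
    d_pos, d_ref, _ = db
    return q_pos == d_pos and q_ref == d_ref


# Made-up records: a gVCF reference call and two annotation entries at the same position.
ref_call = (100, "T", ["<NON_REF>"])
snp = (100, "T", ["C"])
deletion = (100, "TC", ["T"])

print(pos_ref_match(ref_call, snp))       # True: same position and same REF
print(pos_ref_match(ref_call, deletion))  # False: REF TC differs from REF T
```

Requiring REF to agree keeps the SNP annotation on the reference call while skipping the padded deletion that shares its start position.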