Closed 0xaf1f closed 1 month ago
Hi @0xaf1f,
We are currently investigating your issue. I will post an answer here shortly.
Thanks for your question.
Regards, @diegomscoelho
Hello @0xaf1f,
Sorry for the delay in revisiting this issue. I cannot check the exact example you provided, but I can reproduce the case in human GRCh38 with this input:
1 230714122 . C CT
This results in upstream_gene_variant with DISTANCE=0:
#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra
1_230714123_-/T 1:230714122-230714123 T ENSG00000135744 ENST00000366667 Transcript upstream_gene_variant - - - - - - IMPACT=MODIFIER;DISTANCE=0;STRAND=-1
For insertions this makes sense. The variant position is taken to be the two flanking bases between which the insertion actually happens (see here), so the start (or end) of the variant position is the same as the transcript start (or end). The inserted sequence itself is outside the transcript, so it is correct to say the effect is upstream.
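As a rough illustration of the flanking-base convention, here is a minimal Python sketch that maps the VCF line above to the identifiers seen in the output. It is not VEP's own code: the function name `vcf_insertion_to_vep` is hypothetical, and it only handles the simple case of a single anchor base followed by the inserted sequence.

```python
def vcf_insertion_to_vep(chrom, pos, ref, alt):
    """Describe a simple VCF insertion by its two flanking bases.

    Assumption: `ref` is a single anchor base and `alt` is that base
    followed by the inserted sequence (e.g. C -> CT).
    """
    if len(ref) != 1 or len(alt) < 2 or not alt.startswith(ref):
        raise ValueError("not a simple anchored insertion")

    inserted = alt[1:]           # the anchor base is trimmed away
    left, right = pos, pos + 1   # insertion happens between these two bases

    return {
        "uploaded_variation": f"{chrom}_{right}_-/{inserted}",
        "location": f"{chrom}:{left}-{right}",
        "allele": inserted,
    }


record = vcf_insertion_to_vep("1", 230714122, "C", "CT")
print(record["uploaded_variation"], record["location"], record["allele"])
```

The reported location spans both flanking bases, so one of them can coincide with a transcript boundary even though the inserted sequence lies outside the transcript.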
I have opened a PR to make this clear in the documentation here: https://www.ensembl.org/info/docs/tools/vep/vep_formats.html#defaultout
As the issue has been stale for a long time, I will close it.
Best regards, Nakib
Thanks. In my example data, the gene for which this was annotated as distance 0 has coordinates 3793257-3794867 on the minus strand.
Because the variant is
1 3794867 . C CA
you're right that it is in fact upstream, just before the start. The DISTANCE=0 is still odd, though. I'd have expected a distance of 1 perhaps, but I suppose that's not as big a deal.
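One way to see where the 0 comes from is to measure the distance from the nearest flanking base rather than from the inserted sequence. The sketch below is only a toy model that reproduces the reported value for the coordinates in this thread. The function `upstream_distance` is hypothetical and is not how VEP is implemented.

```python
def upstream_distance(pos, tx_start, tx_end, strand):
    """Toy model: distance from an insertion to the 5' end of a transcript.

    `pos` is the VCF position of the anchor base, so the insertion sits
    between bases `pos` and `pos + 1`. Returns None when the insertion is
    not upstream of the transcript. Assumption: the distance is taken from
    the nearest flanking base, which gives 0 when that base is the
    transcript boundary itself.
    """
    left, right = pos, pos + 1
    if strand == 1:
        # Forward strand: the 5' end is tx_start.
        if right > tx_start:
            return None
        return tx_start - right
    # Reverse strand: the 5' end is tx_end.
    if left < tx_end:
        return None
    return left - tx_end


# Gene at 3793257-3794867 on the minus strand, variant "1 3794867 . C CA".
print(upstream_distance(3794867, 3793257, 3794867, -1))
```

Under this model the left flanking base is the first base of the gene on the minus strand, which is why the distance comes out as 0 rather than 1.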
Describe the issue
We noticed a variant, a single base insertion after the first position of a gene, getting annotated as an upstream_gene_variant with respect to that gene. The distance is then 0 since it's not actually upstream, but why isn't it instead called a frameshift mutation?
This is the problematic annotation:
Additional information
System
Full VEP command line
Full error message
N/A
Data files (if applicable)
vep-distance-issue.tar.gz