Closed ju-mu closed 2 weeks ago
Hi @ju-mu,
Sorry for the delay in replying and thank you for reporting this issue.
You are right that VEP doesn't currently report transcripts for genes ND1-6 and ND4L with RefSeq data, even though they are available in the cache. I have already opened PR https://github.com/Ensembl/ensembl-vep/pull/1701 to fix this bug in the next version of VEP.
In the meantime, please use the hidden flag --all_refseq in VEP to return all RefSeq transcripts in our cache, including tissue-specific transcripts starting with compmerge. If you are not interested in those compmerge transcripts, please run filter_vep on the VEP results using a command similar to:
filter_vep -i vep_output.txt --filter "Feature not matches compmerge" -o vep_filtered.txt
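If filter_vep is not at hand, the same post-filtering can be approximated with a few lines of Python. This is only a minimal sketch, not how filter_vep works internally: it assumes VEP's default tab-delimited output, where the column header line starts with a single # and contains a Feature column, and the function name drop_compmerge is made up for illustration.

```python
def drop_compmerge(lines, pattern="compmerge"):
    """Yield VEP tab-output lines, skipping rows whose Feature contains `pattern`.

    Assumption: default VEP tab-delimited output. Meta lines start with "##",
    the column header starts with a single "#", and Feature is a named column.
    """
    feature_idx = 4  # usual position of Feature in the default output (assumed)
    for line in lines:
        if line.startswith("##"):
            # Meta/comment line: pass through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#"):
            # Column header line: locate the Feature column by name if present.
            names = [name.lstrip("#") for name in fields]
            if "Feature" in names:
                feature_idx = names.index("Feature")
            yield line
            continue
        # Data row: drop it when the Feature value contains the pattern.
        if len(fields) > feature_idx and pattern in fields[feature_idx]:
            continue
        yield line
```

To use it, open vep_output.txt for reading and vep_filtered.txt for writing, then pass the input handle to drop_compmerge and write the yielded lines to the output with writelines. The substring check is meant to mirror the "not matches compmerge" filter above.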
Hope this helps and sorry for the inconvenience.
Kind regards, Nuno
Switching --all_refseq on for chrMT fixed it for us. I haven't seen any compmerge transcripts in the output so far.
Thanks a lot!
Hi @ju-mu,
Hope you are having a great day!
Just to update you: this bug has been fixed for the next version of VEP. All the expected mitochondrial RefSeq transcripts will be returned in VEP 113 without the need to use --all_refseq.
Thanks for reporting this issue.
Cheers, Nuno
As already mentioned in #1659, the seven genes ND1-6 and ND4L are missing whenever the RefSeq annotation is used.
This affects the web interface and the CLI, both hg19 and hg38, and at least the last few versions of VEP up to 112 (tested back to ~109), using either the database or the offline cache.
It can be verified with any known variant located within these genes, such as rs193302971.
The genes are found when using the Ensembl or merged annotations.
Looking at the cache directory in homo_sapiens_refseq/112_GRCh37/MT/1-1000000.gz the genes seem to be present.
But I really wonder why they are not reported?
Thank you!