Closed ccwang002 closed 5 years ago
Thanks for your feedback @ccwang002. I am not storing the version information for the transcripts (and genes, exons etc.) in the EnsDb databases because they should be fixed/constant for the same Ensembl release. I thought that having different EnsDb databases for different Ensembl versions would suffice (hence skipping the transcript versions).
If you really require that information I could add an additional column to the database. I would, however, then also have to update all EnsDb databases in AnnotationHub (just to explain why I am hesitant).
If we add this, we would have to be consistent and also add the gene_id_version. So:
- transcript_id_version column to the transcript table.
- gene_id_version column to the gene table.

In the Perl API we would have to use the ->stable_id_version() method to extract the respective ID with the version appended.
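
The proposed schema change can be sketched as follows. This is only an illustration on a toy in-memory SQLite database, not the actual EnsDb schema or build code: the table layout, the tx_id and version columns, and the gene ID ENSG_PLACEHOLDER are assumptions made up for the example. In the real build the versioned IDs would come from the Perl API rather than from a stored version column.

```python
import sqlite3


def add_versioned_id_columns(conn):
    """Add the two proposed versioned-ID columns to a toy schema."""
    conn.execute("ALTER TABLE transcript ADD COLUMN transcript_id_version TEXT")
    conn.execute("ALTER TABLE gene ADD COLUMN gene_id_version TEXT")
    # Build "<stable id>.<version>", i.e. the ID with the version appended.
    conn.execute(
        "UPDATE transcript SET transcript_id_version = tx_id || '.' || version"
    )
    conn.execute("UPDATE gene SET gene_id_version = gene_id || '.' || version")


# Hypothetical minimal tables; a real EnsDb has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (gene_id TEXT PRIMARY KEY, version INTEGER)")
conn.execute(
    "CREATE TABLE transcript (tx_id TEXT PRIMARY KEY, gene_id TEXT, version INTEGER)"
)
# The gene ID and its version are placeholders; the transcript IDs and
# versions are the ones quoted in this issue.
conn.execute("INSERT INTO gene VALUES ('ENSG_PLACEHOLDER', 1)")
conn.executemany(
    "INSERT INTO transcript VALUES (?, ?, ?)",
    [
        ("ENST00000481743", "ENSG_PLACEHOLDER", 2),
        ("ENST00000379328", "ENSG_PLACEHOLDER", 8),
    ],
)

add_versioned_id_columns(conn)
tx_versions = dict(
    conn.execute("SELECT tx_id, transcript_id_version FROM transcript")
)
gene_versions = dict(conn.execute("SELECT gene_id, gene_id_version FROM gene"))
```

Adding both columns in one step keeps genes and transcripts consistent, which is the point made above.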
OK, so I will implement this.
Done. I have to create some EnsDbs first to check if it works. Then I can go ahead and re-create all EnsDb databases from AnnotationHub; most likely I will just do it (first) for Ensembl version 94.
Updating the EnsDbs on AnnotationHub:
@ccwang002, for the (checked) versions above I have already uploaded updated EnsDb databases to AnnotationHub. You should be able to use them right away. If you use these databases you will get the additional columns tx_id_version and gene_id_version by default with the genes, transcripts, ... calls. You don't need to update ensembldb for that.
@jotsetung Thank you very much for your help! I was able to get the ID versions from the new EnsDbs.
By the way, great work maintaining and developing ensembldb. It is easy to use and powerful.
Just an update: I've updated the EnsDbs for Ensembl versions 90 to 94 hosted on AnnotationHub. All of these now also contain the versioned gene and transcript IDs.
I was using the EnsDb database of Ensembl release 90 from AnnotationHub AH57757, and I was wondering if EnsDb can include the transcript version in the database as well.
For example, there are 4 transcripts associated with the human gene GATA3.
Instead of just having the transcript ID like ENST00000481743 and ENST00000379328, it would be nice to have an option to display the transcript version as well, like ENST00000481743.2 and ENST00000379328.8. Sometimes it is quite helpful to have the full version of the transcript: when a project involves multiple versions of the Ensembl annotation, it is easier to tell if any transcript annotation has changed. Otherwise, the user has to go back to the transcript GTF to retrieve that information.
Thanks again for making this tool.
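
To illustrate why the versioned IDs help when a project mixes annotation releases, here is a minimal sketch that compares versioned transcript IDs from two releases and reports the ones whose version differs. The helper names split_versioned_id and changed_transcripts are hypothetical, and the "older" list is invented for the example (the ENST00000481743.1 entry is not taken from any real release); only the two ".2" and ".8" IDs come from this issue.

```python
def split_versioned_id(versioned_id):
    """Split 'ENST00000481743.2' into ('ENST00000481743', 2)."""
    stable_id, sep, version = versioned_id.rpartition(".")
    if not sep or not stable_id or not version.isdigit():
        raise ValueError(f"not a versioned ID: {versioned_id!r}")
    return stable_id, int(version)


def changed_transcripts(release_a, release_b):
    """Return {stable_id: (version_a, version_b)} for IDs present in both
    releases whose version differs."""
    a = dict(split_versioned_id(v) for v in release_a)
    b = dict(split_versioned_id(v) for v in release_b)
    return {
        tx: (a[tx], b[tx])
        for tx in sorted(a.keys() & b.keys())
        if a[tx] != b[tx]
    }


# Invented "older" release versus the IDs quoted in this issue.
older = ["ENST00000481743.1", "ENST00000379328.8"]
newer = ["ENST00000481743.2", "ENST00000379328.8"]
changed = changed_transcripts(older, newer)
```

With unversioned IDs the two lists would look identical, so the change in the first transcript would go unnoticed without consulting the GTF.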