Closed (ccwang002 closed this issue 5 years ago)
Good point. I will make them; I just have to finish Ensembl 97 first.
Just added to AnnotationHub:
> library(AnnotationHub)
> ah <- AnnotationHub()
snapshotDate(): 2019-05-02
> query(ah, c("EnsDb", "v79"))
AnnotationHub with 1 record
# snapshotDate(): 2019-05-02
# names(): AH73986
# $dataprovider: Ensembl
# $species: Homo sapiens
# $rdataclass: EnsDb
# $rdatadateadded: 2019-05-02
# $title: Ensembl 79 EnsDb for Homo sapiens
# $description: Gene and protein annotations for Homo sapiens based on Ensem...
# $taxonomyid: 9606
# $genome: GRCh38
# $sourcetype: ensembl
# $sourceurl: http://www.ensembl.org
# $sourcesize: NA
# $tags: c("79", "AHEnsDbs", "Annotation", "EnsDb", "Ensembl", "Gene",
# "Protein", "Transcript")
# retrieve record with 'object[["AH73986"]]'
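As a follow-up to the query above, here is a minimal sketch of loading the record and checking that it carries the fields discussed in this thread. It assumes the AnnotationHub and ensembldb packages are installed and that the record ID AH73986 is still served by the hub. The selected columns are only an illustration.

```r
library(AnnotationHub)
library(ensembldb)

ah <- AnnotationHub()

# AH73986 is the record ID reported by the query above;
# the first call downloads and caches the SQLite file.
edb <- ah[["AH73986"]]

# Check which Ensembl release and organism the database was built from
ensemblVersion(edb)
organism(edb)

# Gene-level annotation, including gene symbol and biotype
gns <- genes(edb, columns = c("gene_id", "gene_name", "gene_biotype"))
head(gns)
```

The result of genes() is a GRanges object, so the symbol and biotype columns are available through mcols(gns).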
Wow, thanks for making it available in such a short time! I just downloaded it and the EnsDb has everything I need. Really appreciate your help!
I was wondering if you can help build the human hg38 EnsDb of Ensembl release v79.
We need this specific version because the data of most cancer consortia are processed on the NCI Genomic Data Commons (GDC), which uses GENCODE v22 as its gene annotation. GENCODE v22 should be equivalent to Ensembl v79 based on the comments in its GTF.
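For anyone who wants to check the GTF comments themselves, the sketch below reads the leading comment lines of a GTF file using base R only. The helper name read_gtf_header and the file name in the usage line are hypothetical and not part of any package.

```r
# Return the leading '#' comment lines of a GTF file.
# 'path' may point to a plain-text or gzip-compressed file;
# gzfile() reads both. Only the first 'n' lines are inspected.
read_gtf_header <- function(path, n = 50L) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  lines <- readLines(con, n = n)
  is_comment <- startsWith(lines, "#")
  # Keep only the uninterrupted run of comment lines at the top of the file
  lines[cumsum(!is_comment) == 0]
}

# Usage (hypothetical file name, adjust to the local copy of the GTF):
# read_gtf_header("gencode.v22.annotation.gtf.gz")
```

The function stops at the first non-comment line, so a stray comment further down in the file is not reported as part of the header.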
I tried to convert GDC's GTF directly into a TxDb-compatible SQLite database, but I couldn't get it working.
GenomicFeatures::makeTxDbFromGFF() dropped a lot of valuable information, including gene symbol and biotype, making the resulting TxDb less useful.
ensembldb::ensDbFromGtf() failed at the index creation step.
Therefore, it might be easier to build a standard Ensembl v79 EnsDb from scratch. If you can build one and make it accessible via AnnotationHub, it will also benefit the broader community using data from GDC.