Missing ensembl versions

emdann opened 1 year ago

emdann commented 1 year ago


The default Ensembl version in ensembldb is v86, but this doesn't seem to exist in the sql database.

import pooch

PKG_CACHE_DIR = "genomic-annotations"
url_template = "{version}/EnsDb.{species}.v{version}.sqlite"

local_path = pooch.retrieve(
    url=url_template.format(version="86", species="Hsapiens"),
    known_hash=None,
    path=pooch.os_cache(PKG_CACHE_DIR),
)

Downloading data from '' to file '/home/jovyan/.cache/genomic-annotations/1c849fa5b54f90367dbde8fd3f560e9a-EnsDb.Hsapiens.v86.sqlite'.
HTTPError: 404 Client Error: The specified blob does not exist. for url:

The EnsemblDB class should have a method to check (and return?) available versions.
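
To make the request concrete, here is a minimal sketch of what such a check could look like. It assumes some listing of file names is already available (how to obtain it from the storage backend is left open), and the helper name `available_versions` and the example listing are hypothetical, not part of any existing API. It only relies on the `EnsDb.{species}.v{version}.sqlite` naming used above.

```python
import re
from collections import defaultdict

# File naming used in this thread, e.g. "EnsDb.Hsapiens.v87.sqlite".
ENSDB_NAME = re.compile(r"EnsDb\.(?P<species>[A-Za-z]+)\.v(?P<version>\d+)\.sqlite$")


def available_versions(filenames):
    """Map each species to the sorted Ensembl versions found in `filenames`.

    Hypothetical helper: `filenames` is any iterable of names or paths.
    """
    found = defaultdict(set)
    for name in filenames:
        match = ENSDB_NAME.search(name)
        if match:  # ignore anything that is not an EnsDb sqlite file
            found[match["species"]].add(int(match["version"]))
    return {species: sorted(versions) for species, versions in found.items()}


# Made-up listing for illustration; a real one would come from the backend.
listing = [
    "87/EnsDb.Hsapiens.v87.sqlite",
    "98/EnsDb.Hsapiens.v98.sqlite",
    "98/EnsDb.Mmusculus.v98.sqlite",
    "README.txt",
]
versions = available_versions(listing)
print(versions)
```

With something like this, a request for v86 could fail early with a list of the versions that do exist, instead of surfacing a raw 404.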

ivirshup commented 1 year ago

This specific version was not uploaded to AnnotationHub, and the same may be true of other versions. Some possible solutions:

jorainer commented 1 year ago

AnnotationHub provides EnsDb sqlite databases from Ensembl release 87 on. It would also be possible to create and add older versions, but (to not increase storage demand of the AnnotationHub too much) I would only do that for selected versions, and only if there is a need.
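
On the Python side, the release 87 cutoff could be turned into an early, readable error. The following is only a sketch: `MIN_ENSDB_VERSION` and `check_version` are hypothetical names, and the cutoff is taken from the statement above, so it would need updating if older releases are added later.

```python
# Oldest release available according to the comment above (assumption:
# no older versions have been added since).
MIN_ENSDB_VERSION = 87


def check_version(version):
    """Return `version` as an int, or raise if it predates the oldest release."""
    version = int(version)
    if version < MIN_ENSDB_VERSION:
        raise ValueError(
            f"Ensembl release {version} is not available; "
            f"the oldest release provided is {MIN_ENSDB_VERSION}."
        )
    return version


print(check_version("98"))
try:
    check_version("86")
except ValueError as err:
    print(err)
```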

ivirshup commented 1 year ago

I guess I don't have a specific use-case ATM for accessing other older versions (86 came up since it's used in the ensembldb vignette). Do you have some idea of how often the older versions are used?

It would be nice to have parity in access from python and R.

> but (to not increase storage demand of the AnnotationHub too much)

This might intersect with a conversation I was just having with @lshep on the bioc slack about compression of these files. Is this something you've considered for the sqlite databases?

jorainer commented 1 year ago

Honestly, I don't know which releases are predominantly used. Maybe there is a usage/download log for AnnotationHub (pinging @lshep).

As for compressing the sqlite files: AFAIK R cannot read from gzipped SQLite files, so the files (if compressed) would need to be unzipped locally first (maybe something that AnnotationHub could also do on the fly?).
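
For the Python client, the "unzip locally first" step might look roughly like the sketch below. It uses only the standard library. The helper name `open_gzipped_sqlite` and the demo database are made up for illustration, and it assumes the compressed file is named `<something>.sqlite.gz`.

```python
import gzip
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_gzipped_sqlite(gz_path, out_dir):
    """Decompress `gz_path` into `out_dir` (once) and open it with sqlite3.

    Hypothetical helper; assumes the file name ends in ".gz".
    """
    gz_path = Path(gz_path)
    out_path = Path(out_dir) / gz_path.stem  # "x.sqlite.gz" -> "x.sqlite"
    if not out_path.exists():  # reuse a previously decompressed copy
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return sqlite3.connect(str(out_path))


# Demo with a tiny throwaway database standing in for an EnsDb file.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    db_path = tmp / "demo.sqlite"
    con = sqlite3.connect(str(db_path))
    con.execute("CREATE TABLE gene (gene_id TEXT)")
    con.execute("INSERT INTO gene VALUES ('ENSG00000139618')")
    con.commit()
    con.close()

    # Compress it, as a server-side step might.
    gz_path = tmp / "demo.sqlite.gz"
    with open(db_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    cache = tmp / "cache"
    cache.mkdir()
    con = open_gzipped_sqlite(gz_path, cache)
    rows = con.execute("SELECT gene_id FROM gene").fetchall()
    con.close()

print(rows)
```

The trade-off is that the decompressed copy still takes full size in the local cache; only transfer and server-side storage would shrink. pooch also ships a `Decompress` processor that could fill the same role at download time.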