ohnosequences / db.rnacentral

Mirror, filtered annotations, and BLAST DBs from RNAcentral
http://rnacentral.org/
GNU Affero General Public License v3.0

db.rnacentral

db.rnacentral contains code to mirror the data from RNAcentral releases, as well as pointers to the location of the mirrored data.

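For each supported version of RNAcentral, two files are available: the id mappings (id_mapping.tsv) and the species-specific sequences (rnacentral_species_specific_ids.fasta).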

How to access the data

Versions

All the data in db.rnacentral is versioned following the RNAcentral release numbering scheme.

Each of these versions is encoded as an object that extends the sealed class Version.

The Set Version.all contains all the releases supported and maintained through db.rnacentral.
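The versioning scheme described above can be sketched roughly as follows. This is an illustrative stand-in, not the library's source: the release objects V9_0 and V10_0, the name field, and the fromName helper are all assumptions made for the example.

```scala
// Illustrative stand-in for the library's Version type, not its actual source.
sealed abstract class Version(val name: String)

object Version {
  // Hypothetical releases, chosen only for this example.
  case object V9_0  extends Version("9.0")
  case object V10_0 extends Version("10.0")

  // All the releases this sketch knows about.
  val all: Set[Version] = Set(V9_0, V10_0)

  // Hypothetical helper: look a release up by its number.
  def fromName(name: String): Option[Version] = all.find(_.name == name)
}
```

Because the class is sealed, every release has to be declared in the same file, so a set like Version.all can be kept in sync with the declared objects by inspection.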

Files

The module db.rnacentral.data contains the pointers to the S3 objects where the actual files are stored. The paths of the S3 objects corresponding to the id mappings and the sequence data can be obtained by evaluating the following functions on a Version object:

idMappingTSV         : Version => S3Object
speciesSpecificFASTA : Version => S3Object

A convenient value grouping both files can be accessed (again parametrized by the version) through the function:

everything : Version => S3Object

The paths to the S3 objects returned by those functions look something like the following:

s3://resources.ohnosequences.com/ohnosequences/db/rnacentral/<version>/id_mapping.tsv
s3://resources.ohnosequences.com/ohnosequences/db/rnacentral/<version>/rnacentral_species_specific_ids.fasta
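As a minimal sketch of how paths of this shape could be derived from a version, consider the following. Everything here is an assumption made for illustration: the S3Object case class is a simplified stand-in for whatever type the library actually returns, DataSketch is a hypothetical name, and the version is passed as a plain string (for example "10.0") rather than as a Version object.

```scala
// Simplified stand-in for an S3 object pointer (assumption, not the library's type).
final case class S3Object(bucket: String, key: String) {
  def url: String = s"s3://$bucket/$key"
}

// Hypothetical module mirroring the two functions described above.
object DataSketch {
  private val bucket = "resources.ohnosequences.com"

  // Common key prefix shared by all the files of one release.
  private def prefix(version: String): String =
    s"ohnosequences/db/rnacentral/$version"

  def idMappingTSV(version: String): S3Object =
    S3Object(bucket, s"${prefix(version)}/id_mapping.tsv")

  def speciesSpecificFASTA(version: String): S3Object =
    S3Object(bucket, s"${prefix(version)}/rnacentral_species_specific_ids.fasta")
}
```

With these definitions, DataSketch.idMappingTSV("10.0").url yields a path of the form shown above, with the placeholder replaced by 10.0. How the real library renders a version inside the key is not stated here, so treat that part as a guess.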

References

1: ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/releases/10.0/id_mapping/readme.txt
