ohnosequences / db.rna16s

A comprehensive, compact, and automatically curated 16S database
GNU Affero General Public License v3.0

db.rna16s

db.rna16s is a curated database of 16S sequences, obtained directly from db.rnacentral. This package contains the code that filters the data from RNACentral releases, as well as pointers to the location of the resulting data.

For each supported version of db.rnacentral, a single FASTA file is available, containing the subset of the RNACentral sequences that are identified as 16S.

How to access the data

Versions

All the data in db.rna16s is versioned following the RNACentral release numbering scheme.

Each of these versions is encoded as an object that extends the sealed class Version.

The Set Version.all contains all the releases supported and maintained through db.rna16s.
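
As a rough illustration of this encoding, here is a minimal sketch of a sealed Version class with one object per release. The object names (V5_0, V6_0), the name field, and the release numbers are placeholders chosen for the example; they are assumptions, not the definitions or the versions actually shipped in db.rna16s.

```scala
// Hypothetical sketch: names and release numbers are placeholders,
// not the actual definitions in db.rna16s.
sealed abstract class Version(val name: String) {
  override def toString: String = name
}

object Version {
  // One object per supported RNACentral release (placeholder values).
  case object V5_0 extends Version("5.0")
  case object V6_0 extends Version("6.0")

  // Every supported release, gathered in a single Set.
  val all: Set[Version] = Set(V5_0, V6_0)
}
```

Because the class is sealed, all of its subclasses have to live in the same source file, so the compiler can warn about a pattern match over Version that forgets a release.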

Files

The module db.rna16s.data contains the pointers to the S3 objects where the actual data is stored. The S3 objects corresponding to the FASTA file and the mappings file, respectively, can be obtained by evaluating the following functions on a Version object:

sequences : Version => S3Object
mappings  : Version => S3Object

The paths of the S3 objects returned by those functions look something like the following:

s3://resources.ohnosequences.com/ohnosequences/db/rna16s/<version>/rna16s.fa
s3://resources.ohnosequences.com/ohnosequences/db/rna16s/<version>/mappings
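
To make the mapping from a version to these paths concrete, here is a minimal, self-contained sketch. The Version and S3Object types below are stand-ins defined only for this example (the real ones come from db.rna16s and its AWS dependencies), and both the release number and the way it is rendered into the <version> path segment are assumptions.

```scala
// Stand-ins for illustration only; not the real db.rna16s or AWS types.
sealed abstract class Version(val name: String)

object Version {
  case object V5_0 extends Version("5.0") // placeholder release number
}

final case class S3Object(bucket: String, key: String) {
  def url: String = s"s3://$bucket/$key"
}

object data {
  private val bucket = "resources.ohnosequences.com"

  // Common key prefix shared by all files of a given version.
  private def prefix(version: Version): String =
    s"ohnosequences/db/rna16s/${version.name}"

  // FASTA file with the 16S sequences for the given version.
  val sequences: Version => S3Object =
    version => S3Object(bucket, s"${prefix(version)}/rna16s.fa")

  // Mappings file for the given version.
  val mappings: Version => S3Object =
    version => S3Object(bucket, s"${prefix(version)}/mappings")
}
```

Under these assumptions, data.sequences(Version.V5_0).url evaluates to s3://resources.ohnosequences.com/ohnosequences/db/rna16s/5.0/rna16s.fa.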

License

See the open data commons FAQ for more on this distinction between database and contents.