Matthias König and Jan Grzegorzewski

AnnotateDB

Overview

AnnotateDB (pronounced annotated bee) is a database with a web frontend for mapping annotations found in computational models in biology. AnnotateDB is accessible via https://annotatedb.com.

Our mission is to provide mapped annotation resources which simplify annotation of computational models and mapping of entities in such models.
Our vision is to provide a single integrated knowledge resource which simplifies mapping between commonly occurring annotations in biological models and data.

AnnotateDB provides high-quality mappings between annotations, based on existing resources. Its features are:

annotation mappings from multiple sources
support for custom annotation mappings
support for qualifiers, i.e., more detailed relationships between annotations
support for evidence of annotations, i.e., provenance about the source and method with which the mapping was inferred
direct access to the postgres database
docker and docker-compose scripts for easy local setup and deployment
REST based web interface
elasticsearch based indexing and search (the elasticsearch endpoints are still in development and will be part of v0.3.0)

AnnotateDB is accessible under the following licenses

Source Code: LGPLv3
Documentation: CC BY-SA 4.0

To cite the project use

Installation

AnnotateDB is distributed as docker containers and requires a working docker and docker-compose installation.

To install AnnotateDB locally use

# clone or pull the latest source code
git clone https://github.com/matthiaskoenig/annotatedb.git
cd annotatedb

# set environment variables
set -a && source .env.local 

# create/rebuild all docker containers
./docker-purge.sh

# restore database
./adb_restore.sh

# elasticsearch indexing
./elasticsearch.sh

This creates the following services

adb_postgres http://localhost:5434/ - postgres database
adb_backend http://localhost:5434/ - django backend
adb_frontend http://localhost:8090/ - vue.js frontend
adb_elasticsearch http://localhost:9124/ - elasticsearch instance
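
After the setup it can be useful to check that the containers are actually listening. The following is a minimal sketch, not part of AnnotateDB itself: the names `SERVICES`, `is_listening` and `check_services` are hypothetical helpers, and only the ports of three of the services listed above are probed.

```python
import socket

# Ports taken from the service list above; this mapping is an illustrative helper.
SERVICES = {
    "adb_postgres": 5434,
    "adb_frontend": 8090,
    "adb_elasticsearch": 9124,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or name resolution failure.
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}
```

Calling `check_services()` returns a dictionary such as `{"adb_postgres": True, ...}`. An open port only shows that something is listening, not that the service is fully initialized.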

In later releases the installation will be simplified, i.e., prebuilt docker containers will be available from Docker Hub (see #32).

REST webservice

AnnotateDB provides REST endpoints for querying the database at https://annotatedb.com/api/v1.

Some examples

to query the collections use: https://annotatedb.com/api/v1/collections/?format=json.
to query the mappings use: https://annotatedb.com/api/v1/mappings/?format=json.
to query a single collection use: https://annotatedb.com/api/v1/collections/sbo/?format=json.

The last query returns the information on the collection, in this example for sbo:

{
  "namespace":"sbo",
  "miriam":true,
  "name":"Systems Biology Ontology",
  "idpattern":"^SBO:\\d{7}$",
  "urlpattern":"https://identifiers.org/sbo/{$id}"
}
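
The `idpattern` and `urlpattern` fields can be used to validate a term and build a resolvable URL. The sketch below uses only the Python standard library. It assumes that `{$id}` in `urlpattern` is a placeholder for the complete term (this is an interpretation of the record above, not documented behavior), and the helper names `collection_url`, `fetch_collection` and `resolve` are hypothetical.

```python
import json
import re
import urllib.request

API_ROOT = "https://annotatedb.com/api/v1"


def collection_url(namespace: str) -> str:
    """Build the URL of a single collection endpoint."""
    return f"{API_ROOT}/collections/{namespace}/?format=json"


def fetch_collection(namespace: str) -> dict:
    """Download and decode one collection record (requires network access)."""
    with urllib.request.urlopen(collection_url(namespace), timeout=10) as response:
        return json.load(response)


def resolve(collection: dict, term: str) -> str:
    """Check term against idpattern, then fill it into urlpattern.

    Assumption: "{$id}" stands for the full term.
    """
    if not re.match(collection["idpattern"], term):
        raise ValueError(f"{term!r} is not a valid {collection['namespace']} term")
    return collection["urlpattern"].replace("{$id}", term)


# The record shown above, embedded so that the example also runs offline.
SBO = json.loads(r"""
{
  "namespace": "sbo",
  "miriam": true,
  "name": "Systems Biology Ontology",
  "idpattern": "^SBO:\\d{7}$",
  "urlpattern": "https://identifiers.org/sbo/{$id}"
}
""")
```

With the embedded record, `resolve(SBO, "SBO:0000247")` yields `https://identifiers.org/sbo/SBO:0000247`, while a term from another namespace such as `CHEBI:698` raises a `ValueError`. Against the live service, `fetch_collection("sbo")` would replace the embedded record.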

Currently only basic REST endpoints are available. The REST based search will improve considerably with the introduction of the elasticsearch endpoints in v0.3.0. For now, users should query the postgres database directly to work with the mappings (see information below).

Postgres database

The postgres database is accessible via

HOST: localhost
PORT: 5434
DB: adb
USER: adb
PASSWORD: adb
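
These settings can be turned into a connection from Python. The following is a sketch under assumptions: it uses the third-party driver psycopg2, which is not mentioned in this README and must be installed separately, and the names `PARAMS`, `connection_uri` and `connect` are hypothetical.

```python
# Connection settings as listed above.
PARAMS = {
    "host": "localhost",
    "port": 5434,
    "dbname": "adb",
    "user": "adb",
    "password": "adb",
}


def connection_uri(params: dict) -> str:
    """Format the settings as a PostgreSQL connection URI."""
    return (
        f"postgresql://{params['user']}:{params['password']}"
        f"@{params['host']}:{params['port']}/{params['dbname']}"
    )


def connect(params: dict = PARAMS):
    """Open a database connection.

    Assumption: psycopg2 is installed; it is imported lazily so that the
    rest of this module works without it.
    """
    import psycopg2

    return psycopg2.connect(**params)
```

The URI form, `postgresql://adb:adb@localhost:5434/adb`, is also accepted by command line clients such as `psql`.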

The database contains the following main tables (see schema below):

adb_collection: A data source or miriam collection for annotation or xref information
adb_annotation: The combination of a term from a collection and the given collection
adb_mapping: Mapping between annotations, from source annotation to target annotation. The kind of mapping is defined by the qualifier. E.g. the qualifier BQM_IS encodes that the source annotation is the target annotation.
adb_evidence: Evidence for the given mapping between annotations.

In addition, the materialized view mapping_view allows easy filtering and search of mapped annotations and annotation synonyms. For most use cases mapping_view is the relation to work with.

SQL queries

For instance, the bigg.metabolite term for a given chebi identifier can be queried via

SELECT source_term FROM mapping_view 
    WHERE (target_term = 'CHEBI:698' AND
           target_namespace = 'chebi' AND 
           source_namespace = 'bigg.metabolite' AND
           qualifier = 'IS');

which results in

('10fthf',)
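
The same lookup can be written as a parameterized query, which avoids pasting identifiers into the SQL string. The sketch below is runnable without a database server because it uses an in-memory SQLite table as a stand-in for `mapping_view`. The stand-in table is hypothetical: it contains only the five columns used in the query above and the single row from the example. Against the real postgres database a driver such as psycopg2 would be used instead, with `%s` placeholders in place of `?`.

```python
import sqlite3

QUERY = """
SELECT source_term FROM mapping_view
WHERE target_term = ?
  AND target_namespace = ?
  AND source_namespace = ?
  AND qualifier = ?
"""


def mapped_terms(conn, target_term, target_namespace, source_namespace, qualifier="IS"):
    """Return source terms mapped to the given target annotation."""
    cursor = conn.execute(
        QUERY, (target_term, target_namespace, source_namespace, qualifier)
    )
    return cursor.fetchall()


def demo_connection():
    """Create a SQLite stand-in holding the one mapping from the example above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mapping_view ("
        "source_term TEXT, source_namespace TEXT, "
        "target_term TEXT, target_namespace TEXT, qualifier TEXT)"
    )
    conn.execute(
        "INSERT INTO mapping_view VALUES (?, ?, ?, ?, ?)",
        ("10fthf", "bigg.metabolite", "CHEBI:698", "chebi", "IS"),
    )
    return conn
```

For example, `mapped_terms(demo_connection(), "CHEBI:698", "chebi", "bigg.metabolite")` returns the row shown above as a list with one tuple.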

A more comprehensive list of SQL queries and use cases, together with their output, is provided in the project repository.

Data sources

AnnotateDB uses the following data sources:

Collections

identifiers.org

Information on collections is based mainly on identifiers.org. Collections were parsed with sbmlutils.

Mappings

BiGG

A major source of annotation mappings is the BiGG Database; mappings are taken from the latest database release. AnnotateDB currently includes BiGG-v1.5.

Release notes

This section provides an overview of major changes and releases.

0.2.0

security fixes
django update (>3.0), elasticsearch update (7.7.1), postgres update (12.3)
replacing deprecated django-rest-swagger with drf-yasg

0.1.1

bug fixes admin interface
bug fixes frontend server
enforcing uniqueness of mappings & removing duplicates
materialized views
detailed postgres examples
updated documentation

0.1.0

vue frontend
bigg mappings import
database release files

0.0.1

django development server
first database schema
docker-compose files for backend, database and elasticsearch

Acknowledgements

We acknowledge

Dr. Andreas Dräger
Thomas Zajac

for their input and discussions.
