bmrb-io / BMRB-API

BMRB API server and client implementations.
GNU General Public License v3.0

BMRB-API

About

The Biological Magnetic Resonance Bank is developing a REST API to use for querying against the substantial BMRB databases. Most data in the macromolecule and metabolomics databases is accessible through the API.

While the API is free to use and doesn't require an API key, if you submit a large number of queries you may be rate limited.

The URL of the API is api.bmrb.io. If you navigate there you will see links to all active versions of the API.

To see an interactive Jupyter notebook that demonstrates using the BMRB and PDB APIs, please click Binder.

Versioning

New API releases will be available at a unique URL so that you can continue to rely on a given API version as we continue to improve and develop the API. We intend to keep each released version of the API live as long as is feasible.

Releases will be named with a major and minor version and optionally a revision number. For example, the current release version is v2.0. This means that it is the 2nd API version and there have been zero minor releases.

Major release version incrementing means that the API may have changed in a way that breaks existing queries. Therefore you should write your applications to use a specific major version and only change the URL you query once you are sure your application is up to date with any changes.

Minor release version incrementing means that new features may have been released but the API should still be perfectly backwards compatible. As a result our release URLs will not include the minor release version.

Revision numbers incrementing means that bugs have been patched but no other substantial changes to the API have been made.

API URLs

From the root of the API server, first add the version you want to query. For example, api.bmrb.io/v2/ for the release version v2, or api.bmrb.io/current/ to ensure your query goes to the current API version, whatever that is at the time. It is suggested that you use the /current/ version for development and a fixed version for software releases. This will ensure that your deployed applications do not break if a new API version is released.
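As a rough illustration of the URL scheme described above, the following Python sketch builds a query URL from a release name. The helper names `release_segment` and `api_url` are hypothetical and not part of any BMRB client. The code relies only on the stated rule that release URLs carry the major version and omit the minor version and revision.

```python
API_ROOT = "http://api.bmrb.io"


def release_segment(release):
    """Turn a release name such as 'v2.0' into its URL segment, 'v2'.

    The special name 'current' is passed through unchanged.
    """
    if release == "current":
        return release
    # Keep only the major version: everything before the first dot.
    return release.split(".")[0]


def api_url(path, release="current"):
    """Build a full URL for an API path under the given release."""
    return f"{API_ROOT}/{release_segment(release)}/{path.lstrip('/')}"
```

During development `api_url("status")` targets /current/, while a released application might pin `api_url("status", "v2.0")` so that a new major version does not change its behavior.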

HTTPS is available, though it is slightly slower due to the overhead of establishing a TLS session. It is only recommended if you are uploading private data to the server or calling methods on previously uploaded data.

Results

Certain queries return results on an entry, saveframe, or loop in JSON format. To see how we convert our NMR-STAR entries, saveframes, and loops into JSON format please see the reference here.

Rate limiting

We enforce a rate limit in order to guarantee a responsive API server. It is unlikely that you will encounter the limit, but if you do, you will receive an HTTP 403 error in response to all requests. Please be sure to check for this error in your applications and wait before sending further queries.

If you are blacklisted, simply wait at least 10 seconds before sending ANY queries and you will be removed from the blacklist.
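One possible way to honor this in a client is sketched below. It assumes a hypothetical `fetch` callable that returns a `(status_code, body)` pair, so it can wrap whichever HTTP library you use. The function name `get_with_backoff` and the retry count are illustrative choices, not anything prescribed by the API.

```python
import time

BLACKLIST_WAIT_SECONDS = 10  # minimum wait stated in the text above


def get_with_backoff(fetch, url, max_attempts=3, sleep=time.sleep):
    """Call fetch(url), pausing and retrying whenever it reports HTTP 403.

    `fetch` is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) pair received.
    """
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status != 403:
            return status, body
        # Rate limited: send nothing at all until the wait has elapsed.
        if attempt < max_attempts - 1:
            sleep(BLACKLIST_WAIT_SECONDS)
    return status, body
```

Because every request made during the wait counts against you, an application that issues queries from several threads would need to pause all of them, which this single-call sketch does not attempt.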

Limits:

We reserve the right to increase or decrease these limits in the future without warning.

If you need to perform a lot of queries and the rate limit is a problem for you please contact us at help@bmrb.io to get an exception.

What we ask of you

If using the API in an application you distribute to others, please include the HTTP header 'Application' whose value is the name of your application, a space, and then the version number of your application. This allows us to track API usage more accurately and determine when we can end-of-life old API versions. Some examples:

Python:

import requests
requests.get("http://api.bmrb.io/v2/status", headers={"Application": "My Application 1.0"})

Curl:

curl "http://api.bmrb.io/v2/status" -H 'Application: Curl Script 1.0'

REST API

All queries return results in JSON format by default, and optionally also in text format, or other formats pertinent to the query. See the documentation for each method to determine which other formats are available.

Databases

The BMRB API has 3 databases. They are:

Note that not all databases contain the same tables. In general, a search in a table that a given database doesn't contain will not produce an error; instead it will return that no results were found.

Queries use the macromolecules database by default.

Query types that work on a per-entry basis do not need to specify a database, as all databases are searched for those query types.

Queries

Status (GET)

/status

Returns the current status of the databases and server. This includes the number of entries in each database, the number of chemical shifts in each database, and the last time each database was updated. The available methods are also returned, as well as the version number of the API.

Link

List entries (GET)

/list_entries[?database=$database]

Returns a list of all entries.

Example: List all entries

Example: List macromolecule entries

Example: List metabolomics entries

Example: List chemcomp entries

Store entry (POST)

/entry/

When you access this URI you must also provide an NMR-STAR entry in text or JSON format as the body of the request. The entry will be parsed and stored in the database. You can then use all of the entry-based queries below on your saved entry. The response to this request will include two keys:

Caution - Data you upload to the server is publicly accessible to anyone with access to the assigned entry_id you are provided. Therefore you should not share this key with anyone who you do not intend to share the data with.
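A minimal sketch of preparing such an upload with the Python standard library follows. The function name `build_store_request` is hypothetical, and HTTPS is used here only because the entry may be private. The code builds the request without sending it.

```python
import urllib.request


def build_store_request(nmrstar_text, release="v2"):
    """Prepare (but do not send) a POST of an NMR-STAR entry to /entry/.

    Note: urllib adds a default form-encoded Content-Type when the request
    is sent. Whether the server expects a different one is not stated in
    this document, so treat the default as an assumption.
    """
    url = f"https://api.bmrb.io/{release}/entry/"
    return urllib.request.Request(
        url,
        data=nmrstar_text.encode("utf-8"),  # request body must be bytes
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(build_store_request(text))`, and the JSON reply parsed to obtain the assigned entry ID mentioned in the caution above.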

Retrieve entry (GET)

/entry/$entry_id[?format=$entry_format]

By default returns the given BMRB entry in JSON format. If entry_format is specified then return in that format instead.

The formats available are:

Retrieve one or more saveframes by category (GET)

/entry/$entry_id?saveframe_category=$saveframe_category[&format=$entry_format]

Returns all saveframes of the given category for an entry in JSON format by default. If entry_format is specified then return in that format instead.

Only json and nmrstar are currently allowed for entry_format. These formats are the same as described in the entry method.

You may provide the URL parameter saveframe_category=$saveframe_category multiple times to retrieve multiple saveframes.

Example: Querying for the entry information saveframe

Example: Querying for the entry information and citation saveframes

Retrieve one or more saveframes by name (GET)

/entry/$entry_id?saveframe_name=$saveframe_name[&format=$entry_format]

Returns the saveframe with the given name from an entry in JSON format by default. If entry_format is specified then return in that format instead.

Only json and nmrstar are currently allowed for entry_format. These formats are the same as described in the entry method.

You may provide the URL parameter saveframe_name=$saveframe_name multiple times to retrieve multiple saveframes.

Example: Querying for the entry information saveframe by name

Example: Querying for the saveframes with the name entry_information and citation_1
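Repeating a URL parameter, as the saveframe queries above allow, could be done along the following lines. The helper `entry_query_url` is a hypothetical name, and entry 15000 is used purely as an illustration. The same pattern should apply to any parameter that the documentation says may be supplied multiple times.

```python
from urllib.parse import urlencode


def entry_query_url(entry_id, parameter, values, entry_format=None, release="v2"):
    """Build an entry query that repeats one URL parameter once per value."""
    # A list of pairs (rather than a dict) lets the same key appear repeatedly.
    pairs = [(parameter, value) for value in values]
    if entry_format is not None:
        pairs.append(("format", entry_format))
    return f"http://api.bmrb.io/{release}/entry/{entry_id}?{urlencode(pairs)}"


url = entry_query_url("15000", "saveframe_name", ["entry_information", "citation_1"])
```

Here `url` is `http://api.bmrb.io/v2/entry/15000?saveframe_name=entry_information&saveframe_name=citation_1`.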

Retrieve one or more loops (GET)

/entry/$entry_id?loop=$loop_category[&format=$entry_format]

Returns all loops of a given category for a given entry in JSON format by default. If entry_format is specified then return in that format instead.

Only json and nmrstar are currently allowed for entry_format. These formats are the same as described in the entry method.

You may provide the URL parameter loop=$loop_category multiple times to retrieve multiple loops.

Example: Query of the entry author loop

Example: Query of the entry author loop and the sample component loop

Retrieve one or more tags (GET)

/entry/$entry_id?tag=$tag

Returns tags of the specified type(s) for a given entry.

Example: Fetching the entry title

Example: Fetching the entry title and citation title

Fetch the citation information for the entry (GET)

/entry/$entry_id/citation

Returns the citation information for the entry. Citation information is available in three formats. The default format is bibtex. To use one of the other formats, specify one of the following values for the format tag:

Examples:

Fetch information on the NMR experiments (GET)

/entry/$entry_id/experiments

Returns information about the NMR experiments for an entry. The information returned:

Example: Experiments for entry bmse000001

Get tag enumerations (GET)

/enumerations/$tag_name[?term=$search_term]

Returns a list of values suggested for the tag in the values key if there are saved enumerations for the tag. In the type key one of the following values will appear:

Example: List of common NMR-STAR versions

You can narrow the results to those starting with the value you provide in the term parameter in the query string. This will return the results in a form that can be used by JQuery's auto-complete.

Example: List of common NMR-STAR versions starting with 2

Instant search (GET)

/instant?term=$search_term[&database=$database]

This URL powers the BMRB instant search tool. It queries all macromolecule and metabolomics entries based on a variety of commonly searched fields. It does exact searches on certain fields and fuzzy-matches on others depending on what is most appropriate for the field (for example, database matches must be exact but InChI matches may be similar). It returns matches sorted by what results it thinks are the most relevant. It should always begin sending results within 1 second to allow you to use it in interactive applications. A non-exhaustive list of the search fields:

You can use this endpoint to do a "general search" against the entire BMRB archive. Example link for "john markley mouse". It will return results that can be used by JQuery auto-complete with some additional fields provided. This means it returns a list of dictionaries, each of which corresponds to one matching entry. Entries are only listed once even if multiple fields match. The entry dictionaries will always contain the following keys:

If the search matched one of the "additional" search fields (any field other than ID, Title, Author and Citation) the key extra will also exist and point to another dictionary. That dictionary contains the following two keys:

Furthermore, if you perform a query against only the metabolomics database, the following values will also be returned:
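Separately from the field lists, a client for this endpoint might look roughly like the sketch below. Both function names are hypothetical. The only response detail the code relies on is the one stated above: each result is a dictionary, and the key `extra` is present only when an "additional" search field matched.

```python
from urllib.parse import urlencode


def instant_search_url(term, database=None, release="v2"):
    """Build an instant-search URL; spaces in the term are URL-encoded."""
    params = {"term": term}
    if database is not None:
        params["database"] = database
    return f"http://api.bmrb.io/{release}/instant?{urlencode(params)}"


def split_by_extra(results):
    """Separate core-field matches from matches on an additional field."""
    core = [entry for entry in results if "extra" not in entry]
    additional = [entry for entry in results if "extra" in entry]
    return core, additional
```

For the example search above, `instant_search_url("john markley mouse")` gives `http://api.bmrb.io/v2/instant?term=john+markley+mouse`.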

Get assigned chemical shift list (GET)

/search/chemical_shifts[?database=$database][...]

Returns all of the chemical shifts in the BMRB for the specified atom type. You can omit the atom type to fetch all chemical shifts and you can use * as a wild card character. Optionally specify macromolecule or metabolomics for the database argument to search a specific database. macromolecule is the default.

In addition, the following parameters can be provided using the standard notation to limit the set of results to those that match the search parameters. All provided search parameters are combined with a logical AND.

Examples:

Perform a FASTA search (GET)

/search/fasta/$sequence[?type=rna|dna|polymer][&e_val=$expectation_val]

Returns a list of FASTA matches from the BMRB database for the given query string.

Parameters:

Search for matching entries based on a list of shifts (GET)

/search/multiple_shift_search?shift=x.x[&shift=x.x][...][&database=$database]

Returns all entries that contain at least one of the queried shifts, as well as the list of shifts that matched. Results returned as a list of matching entries along with the matching shifts, solvent(s) in which the shifts were observed, number of shifts matched, and total offset of shifts, sorted by number of shifts matched and total offset.

The titles and links to the matched entries are also returned. Note that for a large number of peaks, you should use 's' rather than 'shift' in order to free up extra characters in the URL.

Parameters:

Example: Search for peaks 2.075, 3.11, and 39.31
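The query for this example could be assembled as in the sketch below. The helper name `shift_search_url` and its `short` flag are hypothetical. The flag simply switches between the `shift` and `s` parameter names mentioned above.

```python
from urllib.parse import urlencode


def shift_search_url(shifts, database=None, short=False, release="v2"):
    """Build a multiple-shift search URL with one parameter per shift."""
    key = "s" if short else "shift"  # 's' keeps long peak lists shorter
    pairs = [(key, str(shift)) for shift in shifts]
    if database is not None:
        pairs.append(("database", database))
    query = urlencode(pairs)
    return f"http://api.bmrb.io/{release}/search/multiple_shift_search?{query}"


# Shifts are passed as strings so they are sent exactly as typed.
url = shift_search_url(["2.075", "3.11", "39.31"])
```

Here `url` is `http://api.bmrb.io/v2/search/multiple_shift_search?shift=2.075&shift=3.11&shift=39.31`.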

Get entries with tag matching value (GET)

/search/get_id_by_tag_value/$tag_name/$tag_value[?database=$database]

Returns a list of BMRB entry IDs which contain the specified tag_value for the value of at least one instance of tag tag_name. The search is done case-insensitively. You may optionally specify a database if you want to query the metabolomics or chemcomp database rather than the macromolecule one.

Example: All entries which used solid-state NMR

Note that you need the proper tag capitalization for this method. Use the dictionary for reference.

Get all values for a given tag (GET)

/search/get_all_values_for_tag/$tag_name[?database=$database]

Returns a dictionary for the specified tag where the keys are entry IDs and the values are lists of all of the values of the given tag in each entry. This allows you to get all of the values of a given tag in the BMRB archive for a given database.

Example: The citation titles for all entries in the macromolecule database

Example: The compound names for all compounds in the metabolomics database

Note that you need the proper tag capitalization for this method. Use the dictionary for reference.
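Since the result maps each entry ID to a list of values, a common follow-up is to summarize it across entries. The sketch below assumes the response has already been parsed from JSON into a Python dictionary of that shape. The function name `count_tag_values` is hypothetical.

```python
from collections import Counter


def count_tag_values(values_by_entry):
    """Count how many entries contain each distinct value of a tag.

    `values_by_entry` maps entry IDs to lists of tag values, as described
    for this method.
    """
    counts = Counter()
    for values in values_by_entry.values():
        # Use a set so a value repeated within one entry is counted once.
        counts.update(set(values))
    return counts
```

Calling `count_tag_values(parsed).most_common(10)` would then list the ten values that occur in the most entries.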

Get associated PDB IDs for a given BMRB ID (GET)

/search/get_pdb_ids_from_bmrb_id/$bmrb_id

Returns a list of dictionaries, each containing three keys corresponding to PDB IDs associated with the specified BMRB ID, the association between the two, and any notes on the relationship if present.

The keys are pdb_id, match_type, and comment.

The pdb_id field will contain the PDB ID of the match.

The following match types are possible for match_type:

The comment field will only be present if it has a non-null value. It will contain any recorded notes on how the specific PDB ID is related to the queried BMRB ID if present.

Example: PDB IDs associated with BMRB ID 15000
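When consuming this response, keep in mind that `comment` may be missing. A small sketch of handling that follows, assuming the JSON has already been parsed into a list of dictionaries with the keys named above. The function name `describe_pdb_matches` is hypothetical.

```python
def describe_pdb_matches(matches):
    """Format each PDB match as 'PDB_ID (match_type)' plus any comment."""
    lines = []
    for match in matches:
        line = f"{match['pdb_id']} ({match['match_type']})"
        # 'comment' is only present when it has a non-null value,
        # so use .get() rather than indexing.
        comment = match.get("comment")
        if comment:
            line += f": {comment}"
        lines.append(line)
    return lines
```

Using `.get()` also copes with a `comment` that is present but null, as described for the reverse PDB-to-BMRB lookup.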

Get associated BMRB IDs for a given PDB ID (GET)

/search/get_bmrb_ids_from_pdb_id/$pdb_id

Returns a list of dictionaries, each containing three keys corresponding to BMRB IDs associated with the specified PDB ID, the association between the two, and any notes on the relationship if present.

The keys are bmrb_id, match_type, and comment.

The bmrb_id field will contain the BMRB ID of the match.

The following match types are possible for match_type:

The comment field will usually be null, but if not, it will contain any recorded notes on how the specific BMRB ID is related to the queried PDB ID.

Example: BMRB IDs associated with PDB ID 2JM0

Bulk Mappings

Get a bulk BMRB<->PDB ID mapping

/mappings/bmrb/pdb[?format=$format][&match_type=$match_type]

/mappings/pdb/bmrb[?format=$format][&match_type=$match_type]

Returns a mapping of BMRB ID<->PDB ID.

Parameters:

Examples:

Get a bulk BMRB<->UniProt ID mapping

Returns a mapping of BMRB ID<->UniProt ID.

/mappings/bmrb/uniprot[?format=$format][&match_type=$match_type]

/mappings/uniprot/bmrb[?format=$format][&match_type=$match_type]

Returns a mapping of BMRB ID<->UniProt ID or PDB ID.

Parameters:

Examples:

Software

Software summary (GET)

/software/

Returns a summary of all software packages used in BMRB entries.

Example: All software packages used

Software used in an entry (GET)

/entry/$entry_id/software

Returns a list of all software packages used by a given entry. Each item in the list of software will be a list with the following four values in order:

Example for entry 15000

Which entries used a given software package (GET)

/software/package/$software_package/[?database=$database]

Returns a list of all entries that used the specified software package. The search is done case-insensitively and does not require perfect matches. For example, SPARK would match SPARKY and NMRFAM_SPARKY.

You may optionally specify which database to use.

Example: Entries using SPARKY

MolProbity

Get one-line MolProbity results for a PDB ID (GET)

/molprobity/$pdb_id/oneline

Returns the full one-line MolProbity results for the given PDB ID.

Example: PDB 2DOG

Get residue MolProbity results for a PDB ID (GET)

/molprobity/$pdb_id/residue[?r=$residue][&r=$residue][...]

Returns the full MolProbity residue results for the given PDB ID. You may optionally specify a list of residues to only get results for those residues.

Parameters:

Example: PDB 2DOG residues 10-13
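A request like this example could be built as sketched below, assuming that each `r` value is a plain residue number. The document does not spell out the accepted format, so treat that as an assumption. The helper name `molprobity_residue_url` is hypothetical.

```python
from urllib.parse import urlencode


def molprobity_residue_url(pdb_id, residues=(), release="v2"):
    """Build a MolProbity residue query, optionally limited to some residues."""
    url = f"http://api.bmrb.io/{release}/molprobity/{pdb_id}/residue"
    if residues:
        # One 'r' parameter per requested residue.
        url += "?" + urlencode([("r", residue) for residue in residues])
    return url


# Residues 10 through 13 inclusive.
url = molprobity_residue_url("2DOG", range(10, 14))
```

Here `url` is `http://api.bmrb.io/v2/molprobity/2DOG/residue?r=10&r=11&r=12&r=13`. Omitting `residues` yields the URL for the full residue results.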

Protein-oriented endpoints

Get information about BMRB records for one or all UniProt IDs

/protein/uniprot[/$uniprot_id]

Provides detailed information on linked BMRB IDs for the provided UniProt ID, or for all UniProt IDs. Information is available in either json or hupo-psi-json format.

Parameters:

Examples: