metadatacenter / cedar-terminology-server

A wrapper for the BioPortal API that simplifies access to BioPortal ontologies, classes, value sets, and values.

Remove necessity for ontology and value set caching #28

Open marcosmro opened 7 years ago

marcosmro commented 7 years ago

The first time the terminology server is started, it builds a cache of BioPortal ontologies and value sets. Building this cache requires one call to BioPortal to retrieve all ontologies/value sets, plus two calls per ontology/value set to retrieve details such as name, acronym, description, and size.
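
To make the cost concrete, here is a minimal sketch of the call pattern described above. It uses a fake client that only counts requests; the class, the method names, and the split of the two per-resource calls into "details" and "size" are assumptions for illustration, not the actual terminology server code or BioPortal API.

```python
from dataclasses import dataclass


@dataclass
class FakeBioPortal:
    """Hypothetical stand-in for BioPortal that only counts requests."""
    acronyms: list[str]
    calls: int = 0

    def list_resources(self) -> list[str]:
        # One call that returns every ontology/value set.
        self.calls += 1
        return list(self.acronyms)

    def get_details(self, acronym: str) -> dict:
        # First per-resource call (assumed to return name, description, ...).
        self.calls += 1
        return {"acronym": acronym, "name": f"{acronym} name"}

    def get_size(self, acronym: str) -> dict:
        # Second per-resource call (assumed to return size information).
        self.calls += 1
        return {"size": 0}


def build_cache(client: FakeBioPortal) -> dict[str, dict]:
    """Build the startup cache: 1 listing call + 2 calls per resource."""
    cache = {}
    for acronym in client.list_resources():
        entry = client.get_details(acronym)
        entry.update(client.get_size(acronym))
        cache[acronym] = entry
    return cache


client = FakeBioPortal(["A", "B", "C"])
build_cache(client)
print(client.calls)  # 1 + 2 * 3 requests
```

With N ontologies/value sets the startup cost is 1 + 2N requests, which is why a single endpoint returning all the summary fields would make the cache unnecessary.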

We should eliminate the need for the ontology/value set cache in CEDAR by extending BioPortal with at least the following endpoints:

  1. Get all ontologies. Each ontology should include: id (acronym), name, isFlat?, description, number of classes, number of properties.
  2. Get all value sets in BioPortal (i.e., from all value set collections). Each value set should include: id (acronym), name, description, number of values.
  3. Get all value sets for a single value set collection (e.g., CEDARVS). Each value set should include: id (acronym), name, description, number of values.
  4. Ontology search. Search for ontologies by name, including the "suggest" option already available for term search (http://data.bioontology.org/documentation#nav_search).
  5. Value set search. Search for value sets by name, including the "suggest" option. Note that when searching for value sets, CEDAR leverages the value sets cache to know if a particular result returned by BioPortal is a value or a value set. This information should be included in the BioPortal search results.
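
As a rough sketch of what the endpoints above would need to return, the records below mirror the fields listed in items 1 to 3, and the two functions contrast the current cache-based check from item 5 with reading the type from the search result. All class, attribute, and key names (including `resultType`) are hypothetical; BioPortal's actual response format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologySummary:
    # Fields requested for endpoint 1; attribute names are illustrative.
    acronym: str
    name: str
    is_flat: bool
    description: str
    num_classes: int
    num_properties: int


@dataclass(frozen=True)
class ValueSetSummary:
    # Fields requested for endpoints 2 and 3; attribute names are illustrative.
    acronym: str
    name: str
    description: str
    num_values: int


def classify_with_cache(result_id: str, cached_value_set_ids: set[str]) -> str:
    """Current approach: consult the local value set cache."""
    return "value_set" if result_id in cached_value_set_ids else "value"


def classify_from_result(result: dict) -> str:
    """Proposed approach: read a type marker from the search result itself.

    The "resultType" key is an assumption made for this sketch.
    """
    return result["resultType"]
```

If the search results carried such a marker, CEDAR could drop the cache lookup entirely for value set search.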

martinjoconnor commented 7 years ago

It would be very nice to get rid of this caching in CEDAR.

The cache startup also increases the startup time of the Docker installation.

We should chat with @mdorf to get a sense of the effort involved.

martinjoconnor commented 7 years ago

See also 12-Factor review: https://github.com/metadatacenter/cedar-project/issues/490

marcosmro commented 6 years ago

Endpoints 1 and 2 are planned for 1.6.

marcosmro commented 6 years ago

This issue has been decomposed into five tasks:

  1. Get all ontologies: https://github.com/metadatacenter/cedar-project/issues/660
  2. Get all value sets in BioPortal: https://github.com/metadatacenter/cedar-project/issues/661
  3. Get all value sets for a single value set collection: https://github.com/metadatacenter/cedar-project/issues/662
  4. Ontology search: https://github.com/metadatacenter/cedar-project/issues/663
  5. Value set search: https://github.com/metadatacenter/cedar-project/issues/664

marcosmro commented 3 years ago

I'm reopening this issue because the task is not done yet. The BioPortal endpoint that retrieves all the relevant information is implemented (see https://github.com/metadatacenter/cedar-project/issues/660), but we still need to integrate it into CEDAR.