NCATS-Tangerine / translator-knowledge-graph

Prototype of Translator-wide shared central knowledge graph. (Obsolete as of 2019)
MIT License

Translator Knowledge Graph

Practical experience with graph algorithms suggests that "just-in-time" mining of distributed knowledge sources sometimes performs poorly and that central warehousing of knowledge concepts and relationships is desirable. Also, several Translator architecture and reasoner teams currently use graph databases (Neo4j, RDF triple stores such as Wikidata, and custom graph stores such as the miniKanren file store) for accessing, persisting and annotating knowledge subgraphs (concept nodes and edges) retrieved by queries. The Translator team has therefore recently converged on a proposal to develop a Translator-wide, shared, standards-driven knowledge graph.

This project serves as a hub for the collaboration, design and prototyping of such a Translator Knowledge Graph (TKG) resource. In addition to this README, we are using the project repository wiki to document design discussions and to cross-link to related resources.

Getting Started

We will provide links to various Translator efforts supporting the goal of creating a shared Translator knowledge graph platform, but will also attempt here to provide one or more reference implementations to drive the learning process of creating such a resource.

Standards Feeding into the TKG

The design of the TKG can be seen at three levels:

  1. Ontology Standards: common concept types and predicates. The Biolink Model is proposed to coordinate these standards. A CSV of a subset of Translator-specific excerpts of the model is here. Discussions about predicates (coordinated by Matt Brush) are summarized in a worksheet here.
  2. Data Model: the general schema used to represent knowledge graph nodes and edges. At the moment, we are focused on a Neo4j specification of node labels and property tags (see discussion), but this could be generalized to other formats such as RDF triple stores. Once again, the Biolink Model of classes and slots may translate into such a specification.
  3. Input/Output Exchange Formats: for importing and exporting knowledge (sub)graphs to/from the TKG. A draft JSON-LD specification for standardized Reasoning Tool API responses, initiated by members of the Translator community, may be a worthy starting point.
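
To make the three levels concrete, here is a minimal Python sketch of how they might fit together: concept types and predicates drawn from an ontology standard, a simple node/edge data model, and a JSON-LD-style export. This is an illustration only, not the TKG specification. The class names, field names, the `export_subgraph` helper, the `provided_by` property and the `EX:` identifiers are all hypothetical, and the Biolink-style CURIEs are assumed for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str        # CURIE identifying the concept (hypothetical IDs in this sketch)
    category: str  # concept type, assumed here to be a Biolink Model class
    name: str = ""


@dataclass
class Edge:
    subject: str    # id of the source node
    predicate: str  # relationship type, assumed here to be a Biolink Model predicate
    object: str     # id of the target node
    properties: dict = field(default_factory=dict)  # extra edge annotations


def export_subgraph(nodes, edges):
    """Serialize nodes and edges into a JSON-LD-style dictionary (assumed layout)."""
    node_ids = {n.id for n in nodes}
    for e in edges:
        # Reject edges that point at nodes missing from the subgraph.
        if e.subject not in node_ids or e.object not in node_ids:
            raise ValueError(f"edge refers to unknown node: {e}")
    return {
        "@context": {"biolink": "https://w3id.org/biolink/vocab/"},
        "nodes": [{"@id": n.id, "@type": n.category, "name": n.name} for n in nodes],
        "edges": [
            {"subject": e.subject, "predicate": e.predicate, "object": e.object, **e.properties}
            for e in edges
        ],
    }


if __name__ == "__main__":
    nodes = [
        Node("EX:gene1", "biolink:Gene", "example gene"),
        Node("EX:disease1", "biolink:Disease", "example disease"),
    ]
    edges = [Edge("EX:gene1", "biolink:related_to", "EX:disease1")]
    print(json.dumps(export_subgraph(nodes, edges), indent=2))
```

In a Neo4j-backed store, `category` would plausibly map to a node label and the remaining fields to node or relationship properties. The actual labels and property tags are the subject of the design discussion referenced above.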

Exploratory Software Implementations