NCATS-Tangerine / beacon-ontology

Standalone shared library for software integration of Translator concept and predicate ontology into various applications

Mappings of Wikidata Properties to Biolink #2

Open RichardBruskiewich opened 6 years ago

RichardBruskiewich commented 6 years ago

Translator (and Knowledge Beacons) are striving to standardize on the Biolink Model and on a common set of predicates. We need to somehow map beacon-specific predicates onto this emerging list.

As a case study, the "Reference Beacon" is a prototype beacon that wrapped the "legacy" Knowledge.Bio Release 3.0 Neo4j database, which mainly contained content from the NIH Semantic Medline Database plus a small drug-disease data set. Release 3.0 standardized its concepts and predicates around Wikidata entity and property identifiers.

Thus, the /predicates call of this beacon (https://rkb.ncats.io/swagger-ui.html#!/metadata/getPredicatesUsingGET) returns a modest list of Wikidata properties (see below). We need to map these properties onto the Translator Biolink (RO? SIO?) predicates list.

wd:P3356,positive diagnostic predictor
wd:P129,physically interacts with
wd:P279,subclass of
wd:P276,location
wd:P1557,manifestation of
wd:P361,part of
wd:P156,followed by
wd:P1056,product or material produced
wd:P2888,exact match
wd:P2175,medical condition treated
wd:P2283,uses
wd:P1542,has effect
kb:P2176,drug used for treatment
wd:P703,found in taxon
wd:P688,encodes
wd:P684,ortholog
wd:P682,biological process
wd:P681,cell component
wd:P680,molecular function
wd:P3433,biological variant of
wd:P2293,genetic association
wd:P1552,has quality
wd:P128,regulates (molecular biology)
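To make the mapping task concrete, here is a minimal Python sketch of how the predicate list above could be parsed and checked against a candidate mapping table. The function names (`parse_predicates`, `split_by_mapping`) and the two entries in `CANDIDATE_MAPPINGS` are assumptions for illustration only; they are not taken from the beacon code or from any agreed alignment, and the real mappings still need to be worked out.

```python
# Excerpt of the beacon's predicate list, as "CURIE,label" pairs.
RAW_PREDICATES = """\
wd:P279,subclass of
wd:P361,part of
wd:P2175,medical condition treated
wd:P128,regulates (molecular biology)
"""

# Hypothetical mappings, for illustration only. They are NOT an agreed
# alignment; the actual targets still have to be decided.
CANDIDATE_MAPPINGS = {
    "wd:P279": "biolink:subclass_of",
    "wd:P361": "biolink:part_of",
}


def parse_predicates(text):
    """Parse 'CURIE,label' lines into a list of (curie, label) tuples."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so labels may contain commas.
        curie, label = line.split(",", 1)
        pairs.append((curie.strip(), label.strip()))
    return pairs


def split_by_mapping(predicates, mappings):
    """Separate predicates that have a candidate mapping from those that do not."""
    mapped = {curie: mappings[curie] for curie, _ in predicates if curie in mappings}
    unmapped = [(curie, label) for curie, label in predicates if curie not in mappings]
    return mapped, unmapped


if __name__ == "__main__":
    predicates = parse_predicates(RAW_PREDICATES)
    mapped, unmapped = split_by_mapping(predicates, CANDIDATE_MAPPINGS)
    for curie, target in mapped.items():
        print(f"{curie} -> {target}")
    for curie, label in unmapped:
        print(f"{curie} ({label}) has no candidate mapping yet")
```

The unmapped list is the useful output here: it identifies the properties that would need either a new predicate or a manual decision.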

mbrush commented 6 years ago

Does it make sense to add these to the predicate alignment spreadsheet here: https://docs.google.com/spreadsheets/d/1zXitcR1QjHyh6WocukgshSR7IoAVg7MJQG-HNh96Jec/edit#gid=3366698

This would illustrate how they map to the predicates used natively in the reasoner KGs and to the 'standardized' predicates we have defined so far. It would also keep all predicate standardization work in the spreadsheet, which would serve as the single source of truth for KG implementation efforts for now. Once the predicate spec is tested in the KGs and refined as needed, we can implement it in Biolink and obsolete the spreadsheet.

If this makes sense, I can take a pass at integrating this list of Wikidata predicates into the spreadsheet. I suspect that there will be several that don’t map to any of the existing predicates and will require new ones to be created.

RichardBruskiewich commented 6 years ago

Hi Matt,

Sure... sounds reasonable to add these to the master predicate sheet. If you don't mind (since you are the "expert" of the sheet), please do go ahead and take a pass at integrating them.

I think Ben Good originally mapped these onto the Semantic Medline Database and other drug-disease data in our Knowledge.Bio iteration that predated our involvement in Translator, but I guess they all look pretty legitimate to the concerns of Translator.

Our beacon software is already Biolink Model aware, so we look forward to full integration.

mbrush commented 6 years ago

some progress on mapping wd properties to blm . . . https://docs.google.com/spreadsheets/d/1rFEeBv1nyx95TSyi3YxwG8KKcoI1kgPXwRtm7rMaeDw/edit#gid=1473124732