ExposuresProvider / cam-pipeline

Data loading pipeline for CAM database
https://exposuresprovider.github.io/cam-pipeline/
MIT License

Add support for Biolink 3 and qualifiers #82

Open gaurav opened 1 year ago

gaurav commented 1 year ago

Changing cam-pipeline to support Biolink 3 should be relatively straightforward, since we can just use the property mappings that are already included in biolink-local.ttl.

I think our plan for adding support for qualifiers is:

  1. Get cam-pipeline working on its current Biolink version (v2.1.0) and make sure that works correctly. Push this all the way through to ITRB-prod.
  2. Get cam-pipeline working on the latest Biolink version (v3.1.0 as of today), and make sure that works correctly.
  3. A number of queries should stop working at this point, since several relations in the triplestore will no longer map to a Biolink predicate. For example, in Biolink v2.4.8 the predicate "increases expression of" maps to RO:0003003 "increases expression of", but nothing maps to this predicate in Biolink v3.1.0. You can see lists of deprecated properties in the Biolink 3.0 migration guide.
    • Ideally we'd want to do this mapping in the triplestore, but this is no longer a simple predicate-to-predicate mapping: we need to map an RDF predicate to a Biolink predicate + qualified_predicate + object_aspect + object_direction (see this PDF for some examples).
    • A really annoying aspect of this is that, since these mappings are no longer in the Biolink model, we will need to maintain our own set of mappings unless we can convince the Biolink model maintainers to incorporate that information elsewhere in their repository.
  4. Develop some queries with qualifiers in cam-kp-api. That will tell us if we can handle qualifiers completely in cam-kp-api (as part of https://github.com/ExposuresProvider/cam-kp-api/issues/549), or if we need to incorporate them into cam-pipeline as well.
    • Once we have step 3 working above, we should be able to run the mapping code in reverse to map from Biolink predicate + qualified_predicate + object_aspect + object_direction to an RDF predicate.
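
To make steps 3 and 4 a bit more concrete, here is a rough sketch of what a hand-maintained mapping table could look like, and how running it "in reverse" might work. Everything in it is an assumption for illustration: the names `QualifiedMapping`, `RDF_TO_BIOLINK`, `to_biolink` and `to_rdf` are hypothetical, and the single entry is my reading of how "increases expression of" is expressed with Biolink 3 qualifiers, which should be checked against the migration guide before anyone relies on it.

```python
from typing import Dict, NamedTuple, Optional


class QualifiedMapping(NamedTuple):
    """A Biolink predicate plus the qualifiers discussed above (hypothetical type)."""
    predicate: str
    qualified_predicate: Optional[str]
    object_aspect: Optional[str]
    object_direction: Optional[str]


# Hypothetical, hand-maintained table: RDF predicate CURIE -> Biolink 3 form.
# The entry below is illustrative only and is not taken from cam-pipeline.
RDF_TO_BIOLINK: Dict[str, QualifiedMapping] = {
    "RO:0003003": QualifiedMapping(
        predicate="biolink:affects",
        qualified_predicate="biolink:causes",
        object_aspect="expression",
        object_direction="increased",
    ),
}

# Reverse lookup, built by inverting the table. This assumes the table is
# one-to-one; if two RDF predicates shared a mapping, the later one would win.
BIOLINK_TO_RDF: Dict[QualifiedMapping, str] = {
    mapping: rdf for rdf, mapping in RDF_TO_BIOLINK.items()
}


def to_biolink(rdf_predicate: str) -> Optional[QualifiedMapping]:
    """Map an RDF predicate to its qualified Biolink form, or None if unmapped."""
    return RDF_TO_BIOLINK.get(rdf_predicate)


def to_rdf(mapping: QualifiedMapping) -> Optional[str]:
    """Map a qualified Biolink form back to an RDF predicate, or None if unmapped."""
    return BIOLINK_TO_RDF.get(mapping)
```

Keeping the forward table as the only source of truth and deriving the reverse one from it means there is just one list to maintain, at the cost of requiring the mapping to stay one-to-one.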