ExposuresProvider / cam-pipeline

Data loading pipeline for CAM database
https://exposuresprovider.github.io/cam-pipeline/
MIT License

Add tests for improving RO-to-Biolink mappings #96

Open gaurav opened 1 year ago

gaurav commented 1 year ago

Here are two ways of doing this:

  1. We can look for edges whose RO predicate could not be directly mapped to a Biolink predicate. Usually this is because the RO predicate is too specific and the mapping to a more general Biolink predicate is fine, but a list of these predicates could inform future improvements to the Biolink model.
  2. Because cam-pipeline removes redundant triples, a high-level predicate such as biolink:related_to should never appear in the output when a more specific predicate is available for the same edge. So wherever a high-level predicate does appear, it suggests that no better Biolink predicate exists for that RO term.

Both lists can probably be generated as debugging output from the Souffle scripts.
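
As a rough illustration of the two checks, here is a minimal Python sketch that runs outside the Souffle scripts. Everything in it is an assumption made for illustration: the (RO predicate, Biolink predicate) edge tuples, the `DIRECT_MAPPINGS` dict, the `HIGH_LEVEL_PREDICATES` set, and the placeholder identifiers are all hypothetical and do not reflect how cam-pipeline actually stores its data.

```python
from collections import Counter

# Hypothetical table of direct RO-to-Biolink mappings. The identifiers are
# placeholders, not real RO terms or confirmed Biolink mappings.
DIRECT_MAPPINGS = {
    "RO:example_mapped": "biolink:example_specific",
}

# Predicates treated as "too general" for check 2 (an assumption; the real
# list would need to be agreed on).
HIGH_LEVEL_PREDICATES = {"biolink:related_to"}


def unmapped_ro_predicates(edges):
    """Check 1: count edges per RO predicate that has no direct mapping."""
    return Counter(ro for ro, _ in edges if ro not in DIRECT_MAPPINGS)


def high_level_ro_predicates(edges):
    """Check 2: RO predicates that ended up with a high-level Biolink predicate."""
    return sorted({ro for ro, biolink in edges if biolink in HIGH_LEVEL_PREDICATES})


# Each edge is (RO predicate, Biolink predicate left after redundancy removal).
edges = [
    ("RO:example_mapped", "biolink:example_specific"),
    ("RO:example_unmapped", "biolink:related_to"),
    ("RO:example_unmapped", "biolink:related_to"),
]

for ro, count in unmapped_ro_predicates(edges).items():
    print(f"no direct Biolink mapping: {ro} ({count} edges)")

for ro in high_level_ro_predicates(edges):
    print(f"only a high-level predicate survives: {ro}")
```

Counting edges per predicate in check 1 is one possible way to prioritise which RO terms to raise with the Biolink model first.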