ExposuresProvider / cam-pipeline

Data loading pipeline for CAM database
https://exposuresprovider.github.io/cam-pipeline/
MIT License

Add qualifiers #145

Open gaurav opened 1 month ago

gaurav commented 1 month ago

We currently don't produce any qualifiers for CAM-KP. Having reviewed the list of Biolink qualifiers, we decided to focus on the following, in this order:

  1. direction_qualifier + aspect_qualifier + qualified_predicate for indicating increased/decreased abundance/expression.
  2. anatomical_context_qualifier for the location where reactions take place.
  3. species_context_qualifier for the species in which reactions have been observed.
  4. stage_qualifier - we have a few of these in GO-CAMs.
  5. part_qualifier - maybe?
    1. Look in IDs that can't be normalized
    2. We might have part_of relation connecting concepts, e.g. "A occurs in some plasma membrane that is part of a liver cell"
  6. form_or_variant_qualifier - CTD might have this.
  7. causal_mechanism_qualifier - maybe.

Implementing this will make https://github.com/ExposuresProvider/cam-pipeline/issues/82 redundant.
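
For item 1, a minimal sketch of what a qualified edge could look like, assuming we emit TRAPI-style qualifier lists and use the Biolink `object_direction_qualifier` / `object_aspect_qualifier` specializations of the direction and aspect qualifiers. The function name, the `biolink:affects` / `biolink:causes` pairing, and the placeholder CURIEs are illustrative assumptions, not what the pipeline currently does:

```python
# Sketch only: builds a TRAPI-style edge dict carrying Biolink qualifiers.
# make_qualified_edge and the EXAMPLE: CURIEs are hypothetical.

ALLOWED_DIRECTIONS = {"increased", "decreased"}
ALLOWED_ASPECTS = {"abundance", "expression"}


def make_qualified_edge(subject, obj, direction, aspect):
    """Return an edge saying `subject` causes `direction` `aspect` of `obj`."""
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unsupported direction: {direction}")
    if aspect not in ALLOWED_ASPECTS:
        raise ValueError(f"unsupported aspect: {aspect}")
    return {
        "subject": subject,
        "predicate": "biolink:affects",
        "object": obj,
        "qualifiers": [
            {"qualifier_type_id": "biolink:qualified_predicate",
             "qualifier_value": "biolink:causes"},
            {"qualifier_type_id": "biolink:object_direction_qualifier",
             "qualifier_value": direction},
            {"qualifier_type_id": "biolink:object_aspect_qualifier",
             "qualifier_value": aspect},
        ],
    }


edge = make_qualified_edge("EXAMPLE:chemical", "EXAMPLE:gene",
                           "increased", "abundance")
```

Read together, the qualifiers turn the plain "affects" edge into "EXAMPLE:chemical causes increased abundance of EXAMPLE:gene".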

gaurav commented 2 weeks ago

I'm going to start with anatomical_context_qualifier, since it should be relatively straightforward to implement in our pipeline, provided I can find the right query to pick these out. Many SYNGO models (e.g. http://model.geneontology.org/SYNGO_102#inferred), one GO model (http://model.geneontology.org/586fc17a00000528#inferred) and one AOP model (https://noctua.apps.renci.org/model/AOP_363#inferred) reference UBERON:0000966 "retina", so this is probably a good place to start.
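
As a starting point for that query, here is a rough sketch that builds a SPARQL string listing the models that mention a given anatomical term. It assumes each model is stored as its own named graph and that the term shows up as the `rdf:type` of some individual in that graph; both are assumptions to check against the actual data, and the helper names are made up for illustration:

```python
# Sketch only: builds a SPARQL query string; it does not run it.
# curie_to_obo_iri and models_referencing are hypothetical helpers.

OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def curie_to_obo_iri(curie):
    """Expand an OBO CURIE such as 'UBERON:0000966' into its OBO PURL."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_PREFIX}{prefix}_{local_id}"


def models_referencing(curie):
    """Return a query for named graphs with an individual typed as `curie`."""
    iri = curie_to_obo_iri(curie)
    return (
        "SELECT DISTINCT ?model WHERE {\n"
        "  GRAPH ?model {\n"
        f"    ?individual a <{iri}> .\n"
        "  }\n"
        "}"
    )


print(models_referencing("UBERON:0000966"))
```

If this returns the SYNGO, GO and AOP models listed above, the next step would be to follow the relation linking each retina individual to its process and carry that across as anatomical_context_qualifier.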