ExposuresProvider / cam-pipeline

Data loading pipeline for CAM database
https://exposuresprovider.github.io/cam-pipeline/
MIT License

How to handle multiple anatomical context qualifiers #164

Open gaurav opened 4 days ago

gaurav commented 4 days ago

@EvanDietzMorris has complained that having multiple qualifiers of the same type is not valid. Unfortunately, emitting multiple qualifiers of the same type is sometimes the correct thing to do: for example, for model https://amigo.geneontology.org/amigo/model/66c7d41500001120, we emit the triple:

CHEBI:15428 biolink:affects GO:0017146  http://model.geneontology.org/6690711d00000596  infores:go-cam  (biolink:anatomical_context_qualifier=CL:0000540)&&(biolink:anatomical_context_qualifier=GO:0045211)

This is correct because, according to the model, the process linking CHEBI:15428 "glycine" and GO:0017146 "NMDA selective glutamate receptor complex" occurs in the GO:0045211 "postsynaptic membrane" of the CL:0000540 "neuron".

Another thing that can happen (I think) is that we could determine:

A (takes place in X) --[affects]--> B (takes place in Y) --[affects]--> C

And I think that, given the way we've set up reasoning right now, we'll end up inferring:

A --[affects]--> C (takes place in X and Y)

So we might need some way to fix that as well.

Possible solutions:

  1. We figure out how to add this to TRAPI/ORION.
  2. We add some reasoning so that we try to come up with one best answer for every property, e.g. in this case by reasoning that GO:0045211 is a part of CL:0000540, so the CL:0000540 qualifier is redundant.
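
A rough sketch of what option 2 could look like, assuming we have some part-of lookup available. The `PART_OF` table and both function names here are hypothetical, not anything that exists in cam-pipeline, and the single part-of edge is just the assumption from the example above rather than something checked against an ontology:

```python
# Hypothetical part_of edges (term -> terms it is part of).
# A real pipeline would load these from an ontology; this single edge is
# only the assumption made in the example above.
PART_OF = {
    "GO:0045211": {"CL:0000540"},
}


def ancestors(term, part_of):
    """Return every term reachable from `term` by following part_of edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in part_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific(values, part_of):
    """Drop any value that another value in the set is (transitively) part of."""
    values = set(values)
    redundant = set()
    for value in values:
        redundant |= ancestors(value, part_of) & values
    return values - redundant


reduced = most_specific({"CL:0000540", "GO:0045211"}, PART_OF)
```

If two values are unrelated (neither is part of the other), both survive, so this only reduces the number of qualifiers rather than guaranteeing a single one.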

gaurav commented 4 days ago

That's going to make things trickier when we mix different kinds of qualifiers, because then we could have a situation where we know a process has:

So then we'd need to split it up so that we get:

?s ?p ?o ?g subject_aspect_qualifier=abundance&subject_direction_qualifier=decreased&anatomical_context_qualifier=liver
?s ?p ?o ?g subject_aspect_qualifier=abundance&subject_direction_qualifier=decreased&anatomical_context_qualifier=cytoplasm
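
The split above could be done as a Cartesian product over the values of each qualifier. This is only a sketch: `split_qualifiers` and `format_qualifiers` are hypothetical helper names, and representing an edge's qualifiers as a dict of lists is an assumption, not how cam-pipeline stores them today:

```python
from itertools import product


def split_qualifiers(qualifiers):
    """Expand {qualifier: [values]} into one {qualifier: value} dict per combination."""
    keys = list(qualifiers)
    value_lists = [qualifiers[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


def format_qualifiers(single):
    """Render one combination in the key=value&key=value form used above."""
    return "&".join(f"{key}={value}" for key, value in single.items())


# Assumed input shape: every qualifier maps to a list of one or more values.
edge_qualifiers = {
    "subject_aspect_qualifier": ["abundance"],
    "subject_direction_qualifier": ["decreased"],
    "anatomical_context_qualifier": ["liver", "cytoplasm"],
}

for combination in split_qualifiers(edge_qualifiers):
    print("?s ?p ?o ?g", format_qualifiers(combination))
```

One thing to watch with this approach is that the number of emitted edges is the product of the value counts, so an edge with several multi-valued qualifiers multiplies out quickly.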