geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

inferring enabled by from upstream #103

Closed goodb closed 4 years ago

goodb commented 4 years ago

We had the rule: if one of the inputs to a Binding reaction is a protein or a complex, and that input is also an output of an immediately upstream reaction, then change the relation from that reaction to the entity from 'has input' to 'enabled by'.
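The rule as stated can be sketched roughly as follows. This is a minimal illustration, not the pathways2GO implementation: the `Reaction` class, the `entity_kinds` lookup, and the `infer_enablers` function are hypothetical names introduced here, and entities are reduced to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    # Hypothetical, simplified stand-in for a converted reaction node.
    name: str
    rtype: str  # e.g. "Binding", "Molecular Event", "Molecular Function"
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)
    enabled_by: set = field(default_factory=set)


def infer_enablers(reaction, upstream, entity_kinds, applicable_types=("Binding",)):
    """Move qualifying inputs from 'has input' to 'enabled by'.

    An input qualifies if it is a protein or complex AND it is an output
    of one of the immediately upstream reactions. Returns the moved entities.
    """
    if reaction.rtype not in applicable_types:
        return set()
    upstream_outputs = set()
    for r in upstream:
        upstream_outputs |= r.outputs
    moved = {
        e for e in reaction.inputs
        if entity_kinds.get(e) in ("protein", "complex") and e in upstream_outputs
    }
    reaction.inputs -= moved
    reaction.enabled_by |= moved
    return moved
```

The `applicable_types` parameter is only there to make it easy to see what changes when the rule is pointed at reaction types other than Binding.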

Now we have no Binding reactions. All of these are left as Molecular Events or, if they have enablers, Molecular Functions. How does this impact this rule? In the conversions that are live on noctua-dev (as of Sept. 9, 2020), the rule was adapted to apply to any reaction typed as either Molecular Event or Molecular Function. Note that if this rule is applied to a Molecular Event, a secondary rule then converts the Event to a Function based on the presence of the enabler.

Here is an example. The reaction 'glucokinase (GCK1) + glucokinase regulatory protein (GKRP) <=> GCK1:GKRP complex' has an 'enabled by' GCK assertion because GCK was an input and was also the output of the immediately upstream reaction 'glucokinase [nucleoplasm] => glucokinase [cytosol]'. Based on the 'enabled by' assertion, the reaction was changed from a Molecular Event to a Molecular Function. http://noctua-dev.berkeleybop.org/editor/graph/gomodel:R-HSA-170822

[Screenshot: 2020-09-09, 2:41:50 PM]
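To make the example concrete, here is a small sketch that traces the GCK1/GKRP reaction through the adapted rule and the secondary retyping step. The dict layout and the function name `apply_upstream_enabler_rule` are assumptions made for illustration, not the converter's actual data model, and the protein-or-complex check is skipped because both inputs here are proteins.

```python
def apply_upstream_enabler_rule(reaction, upstream_outputs):
    """Adapted rule plus secondary retyping, on a plain dict (hypothetical shape)."""
    if reaction["type"] not in ("Molecular Event", "Molecular Function"):
        return reaction
    # Inputs that were produced by an immediately upstream reaction become enablers.
    enablers = [e for e in reaction["has_input"] if e in upstream_outputs]
    result = dict(reaction)
    result["has_input"] = [e for e in reaction["has_input"] if e not in enablers]
    result["enabled_by"] = list(reaction.get("enabled_by", [])) + enablers
    # Secondary rule: a Molecular Event with an enabler is retyped as a Molecular Function.
    if result["type"] == "Molecular Event" and result["enabled_by"]:
        result["type"] = "Molecular Function"
    return result


binding = {
    "label": "GCK1 + GKRP <=> GCK1:GKRP complex",
    "type": "Molecular Event",
    "has_input": ["GCK1", "GKRP"],
    "enabled_by": [],
}
# Output of 'glucokinase [nucleoplasm] => glucokinase [cytosol]'.
upstream_outputs = {"GCK1"}

converted = apply_upstream_enabler_rule(binding, upstream_outputs)
print(converted["type"], converted["enabled_by"], converted["has_input"])
```

With the rule switched off, the same reaction would simply keep both inputs, have no enabler, and stay a Molecular Event.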

Is this what we want reactions like 'glucokinase (GCK1) + glucokinase regulatory protein (GKRP) <=> GCK1:GKRP complex' to look like when converted?

@cmungall @thomaspd this example shows a lot of the consequences of the change to Molecular Event and the elimination of the binding inferences.

goodb commented 4 years ago

@ukemi this was your rule originally. What is the call for the manuscript version of the conversion? Keep it, or take it out because we took out the Binding inferences? Current status is that it is in place. Let's make a call and move on so we can freeze the data and finish the paper.

There is a comment thread about it in the supplementary data file. https://docs.google.com/document/d/16-pFFG5MuYY3vDPXeacSBcAv5KB8WxQCE2S_9Ha4K-E/edit?ts=5f528713#

ukemi commented 4 years ago

If we took out the binding inferences, then I think we remove this rule. It was put into place to invoke a directionality to binding reactions.

ukemi commented 4 years ago

So these will become molecular events because they have no enabler now? Just making sure I understand what's happening.

goodb commented 4 years ago

@ukemi yes, that would be the consequence.

deustp01 commented 4 years ago

CAUTION - possible tangent. And this apparent loss of information is OK because it is coupled, at least in our minds, to a plan to figure out how to extract the enabler information that is now hidden in binding reactions (even if just in the minds of curators as they create these reactions and figure out how to order them) and turn it into proper reaction attributes. Those attributes would support association of a reaction with a specific molecular function term, as well as a justification for the placement of the reaction in the activity flow of its process. (I.e., we can order it now because the curator says so, but if enablers were reliably identified, they would provide an additional basis for ordering and support an assertion more specific than "causally upstream of".)

ukemi commented 4 years ago

Not really a tangent. I think this works only for some cases. In some cases the enabler comes from an upstream reaction, but in others, like a metabolic pathway, it is just present in the cell in the right place at the right time, and the correct substrates are generated by the upstream event. I think it would have worked for the binding inferences.

ukemi commented 4 years ago

But more to the point, yes I think based on the current approach this loss of information is ok.

goodb commented 4 years ago

To be complete here, if we drop the inference of enablers from the upstream rule, the consequence for the above reaction will be that it will now have two inputs (GCK, GCKR), no enabler, and be typed as a Molecular Event. We will also see 'directly positively regulates' change to 'causally upstream of'. This will diminish the number of 'complete' activities in the counts for the paper.

If no objections, I will drop the rule and close the ticket.

goodb commented 4 years ago

@ukemi FYI, taking this rule out has a pretty large impact on the causal relation counts because we lose the inference of positive regulation. With the rule in place, we end up with about 17% of the relations as 'causally upstream of' and 37% as 'directly positively regulates'. Without the rule, that shifts to 46% 'causally upstream of' and 7% 'directly positively regulates'.

In the case above, the relation actually shifts to 'provides input for' which had a lower priority than the positive regulation rule that is now inactive. 'provides input for' also increases in proportion but only slightly (44% to 46%). (note I'm using percentages because this batch also picked up a lot of new causal relationships based on adding in immediately upstream reactions from other pathways so the counts aren't directly comparable https://github.com/geneontology/pathways2GO/issues/104 )
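The priority ordering described here can be pictured with a rough sketch. The conditions below are assumptions made for illustration (in particular, that positive regulation is inferred when an upstream output is the downstream enabler); the actual pathways2GO rules are not spelled out in this thread, and `choose_causal_relation` is a hypothetical name.

```python
def choose_causal_relation(upstream_outputs, downstream_inputs, downstream_enablers):
    """Pick one causal relation between an upstream and a downstream reaction.

    Assumed priority, highest first:
      1. 'directly positively regulates' if an upstream output enables the downstream reaction
      2. 'provides input for' if an upstream output is a downstream input
      3. 'causally upstream of' as the fallback
    """
    if upstream_outputs & downstream_enablers:
        return "directly positively regulates"
    if upstream_outputs & downstream_inputs:
        return "provides input for"
    return "causally upstream of"


# With the enabler rule in place: GCK1 was moved to 'enabled by'.
with_rule = choose_causal_relation({"GCK1"}, {"GKRP"}, {"GCK1"})

# Without the rule: GCK1 stays an input and there is no enabler.
without_rule = choose_causal_relation({"GCK1"}, {"GCK1", "GKRP"}, set())
```

Under these assumptions, dropping the enabler inference means the first branch no longer fires for reactions like this one, so the lower-priority 'provides input for' branch takes over.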

Here is what the example above looks like now.

[Screenshot: 2020-09-15, 9:55:41 AM]

goodb commented 4 years ago

shame to lose all of those inferences of the direction of causality but this is it unless we go back on a number of other decisions. closing.