geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Refactor binding reactions so enablers become inputs #67

Closed ukemi closed 4 years ago

ukemi commented 5 years ago

As part of the GO-CAM specs, binding reactions will not be enabled-by either of the participants, instead all of the 'things' that bind will get has_input relationships to the binding term. We need to modify the rules in the import to comply with this. There are many regulatory binding reactions in glycolysis that can be used as a test.
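A minimal sketch of the change requested here, assuming a simplified in-memory node rather than the importer's actual data model. The `Activity` class and the `enablers_to_inputs` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """Hypothetical, simplified stand-in for a GO-CAM activity node."""
    term: str
    enabled_by: list[str] = field(default_factory=list)
    has_input: list[str] = field(default_factory=list)


def enablers_to_inputs(activity: Activity) -> Activity:
    """For a binding activity, return a new node whose enablers are moved to has_input.

    Non-binding activities are returned unchanged.
    """
    if activity.term != "binding":
        return activity
    merged = list(activity.has_input)
    for participant in activity.enabled_by:
        if participant not in merged:  # avoid listing the same participant twice
            merged.append(participant)
    return Activity(term=activity.term, enabled_by=[], has_input=merged)
```

For example, a binding node enabled by a PFK tetramer with ATP as its input would end up with no enabler and both participants as inputs.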

goodb commented 5 years ago

@ukemi I think most, and perhaps all, of the cases where we have a 'binding' reaction enabled_by something arise from the rule developed to handle regulatory molecules. See https://github.com/geneontology/pathways2GO/issues/56 and slide 25. All of the examples in glycolysis come from this.

It would be simple enough to remove the addition of the enabler to the binding reaction that is a consequent of the rule. Alternatively, we could roll back to another representation, e.g. molecule involved_in_regulation_of molecular_function. It strikes me as odd to have a binding function with only one input, as is the case most of the time here. What is that input binding to?

Please advise.

ukemi commented 5 years ago

Shouldn't the binding have two inputs? The first is the molecule that was initially the input, ATP, the second would be what is now the enabler, phosphofructokinase.

goodb commented 5 years ago

If that is how the biochemistry works, I can certainly make that happen in the translation. Below is just a quick reminder of where this is coming from initially.

Screen Shot 2019-08-12 at 9 29 58 AM

Your suggestion is to make the GO-CAM representation for the 'positively regulated by ADP' statement be:

type: Binding
input: PFK tetramer
input: ADP

Yes? If you are happy with that, I will make the change.
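As a rough illustration of that proposed representation, the sketch below builds the binding node from a regulation statement. The function name, the dictionary keys, and the relation labels are assumptions made for this example, not the importer's actual API or the exact relation names used in GO-CAM.

```python
def regulatory_binding(regulated_entity: str, regulator: str, positive: bool) -> dict:
    """Build a binding node that has both molecules as inputs.

    The relation labels are placeholders for whichever regulation
    relation links the binding node to the regulated reaction.
    """
    relation = "positively_regulates" if positive else "negatively_regulates"
    return {
        "type": "Binding",
        "has_input": [regulated_entity, regulator],
        "relation_to_reaction": relation,
    }


# The 'positively regulated by ADP' statement from the PFK example.
node = regulatory_binding("PFK tetramer", "ADP", positive=True)
```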

ukemi commented 5 years ago

I think that's the general pattern that we have decided to use for all binding in GO-CAM. @vanaukenk , @pgaudet ?

vanaukenk commented 5 years ago

Yes, I think that's what we said we'd like to do in GO-CAM models going forward.

Note that we'll need to translate this to conform to current annotation practice, i.e. gene products enable activities, for any GPAD outputs.

goodb commented 5 years ago

Out of curiosity here, do we really know that there is a physical binding relationship between these molecules in all cases where Reactome indicates a regulation relationship?

ukemi commented 5 years ago

According to @deustp01 it was safe for us to assume this.

goodb commented 5 years ago

Okay, will make the change.

deustp01 commented 5 years ago

Still sounds OK from here. Just thought of one messy edge case: we have a subclass, gene expression regulation, in which an entity is asserted to regulate the expression of a gene (transcription and, optionally, translation of the resulting mRNA). The logic here has the regulator binding the input gene when in reality it forms part of a multiprotein complex that binds to regulatory genomic DNA sequences adjacent to the open reading frame of the gene. People who work in these areas would find this imprecision offensive. I think it's OK, but @ukemi if this is going too far, then set that subclass aside for further discussion.

ukemi commented 4 years ago

The GO-CAM spec has reversed itself. We now want enablers for binding reactions again.

goodb commented 4 years ago

@ukemi sorry just coming back to this. We currently have a rule in place that infers enablers for protein binding reactions (where the input is the output of the previous reaction). Confirming that the idea is to extend this inference to all kinds of binding reactions?

ukemi commented 4 years ago

Hi @goodb. I assume you mean the enabler is the output of the previous reaction? Yes, but only if the output fits the shex as an enabler of a molecular function. For example, if a reaction has an output of a carbohydrate and the carbohydrate is used in the subsequent binding reaction, then the carbohydrate should not be the enabler of the binding reaction. This will cause a shex failure. We also need to stick with our current rule for generating regulation binding reactions like PFK binding ATP to negatively regulate its activity in glycolysis. Clear as mud? If so, we can take a look together.
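A minimal sketch of the enabler inference described above, assuming entities are plain names with a type label. `VALID_ENABLER_TYPES` and `infer_enabler` are hypothetical; the real constraint on what may enable a molecular function is defined by the GO-CAM ShEx schema, not by this set.

```python
# Assumed set of entity types allowed to enable a molecular function.
# The authoritative rule is the ShEx schema; this is only a placeholder.
VALID_ENABLER_TYPES = {"protein", "protein complex"}


def infer_enabler(previous_outputs, binding_inputs):
    """Pick an enabler for a binding reaction, or return None.

    previous_outputs: dict mapping entity name -> entity type for the
        outputs of the upstream reaction.
    binding_inputs: list of entity names that are inputs to the binding
        reaction.
    An input qualifies only if it was an upstream output and its type is
    a valid enabler type, so a carbohydrate output is never promoted.
    """
    for name in binding_inputs:
        if previous_outputs.get(name) in VALID_ENABLER_TYPES:
            return name
    return None
```

Returning `None` leaves the binding reaction without an inferred enabler, which corresponds to the case where the upstream output is not a valid enabler.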

We are still under discussion as to how we handle the cases where the first reaction has an output of an 'invalid' enabler and the second reaction is a binding. I have a straw-man proposal that I was going to run by @vanaukenk today for receptors where this is the case. We probably won't get to take a look before the call.

ukemi commented 4 years ago

Hello again. I noticed in the first comment of this ticket that I said we wanted to change the way we handled the regulatory binding functions too and not have enablers. This is incorrect. Basically we want to go back to what we had originally decided at the beginning of the project and allow enablers in binding reactions.

goodb commented 4 years ago

@ukemi I think this is solved. See screenshot below from http://noctua-dev.berkeleybop.org/editor/graph/gomodel:R-HSA-170822. The GCK was an input and now is an enabler because of the rule.

Please re-open if I am mistaken.

Screen Shot 2020-04-19 at 9 42 24 AM