geneontology / go-shapes

Schema for Gene Ontology Causal Activity Models defined using RDF Shapes

Handling Molecular Event - discussion #232

Open goodb opened 3 years ago

goodb commented 3 years ago

As a result of https://github.com/geneontology/pathways2GO/issues/98 , I have introduced a new class into the REACTO ontology called 'Molecular Event'. It is a subclass of BFO Process and could, in principle, be a superclass of GO Molecular Function. This is used to describe biological activities that may or may not have enabling gene product entities. If they do have enablers, then these would be reclassified as GO Molecular Functions. If not, then a new class of spontaneous processes may be needed. This class may wind its way into the GO.

We need to decide whether to change the ShEx schema in response to its addition. Fundamentally, should we treat molecular events differently from molecular functions, and if so, how? If we do nothing, the only immediate consequence would be that many models imported from Reactome would fail schema validation. That could be a desired outcome, but if it is not, we need to change the schema. (e.g. constraints on the MF shape that 'provides input for' or 'regulates' can only link to another MF would break when the downstream node is typed as a ME).

@cmungall @thomaspd @ukemi @vanaukenk I'd love to have a quick answer here so I can run the validator on the Reactome models and finish that paper.

goodb commented 3 years ago

After discussion with @cmungall the decision, for now, is to allow the use of Molecular Events in all the places where Molecular Function is now specified. This will be handled in the schema using union statements. In the future, the Molecular Event class may be used with the GO as part of an upper level ontology in between GO and BFO. When that happens, we can update the shex schema to take advantage of the structure there (e.g. pointing to ME and its children, which will contain MF).

ukemi commented 3 years ago

Hi @goodb Do you mean Molecular Function is not specified?

deustp01 commented 3 years ago

"a new class of spontaneous processes may be needed"

If the Reactome sample of events is representative, truly spontaneous activities / reactions are very rare, but there are examples (1PYR-5COOH spontaneously hydrolyses to L-GluSS). So this class will hold permanent instances as well as a much larger number that lack enablers now, either because experimental data are lacking (chunks of eicosanoid metabolism) or because we don't yet know how to identify the enabler in a way that GO can accommodate (dissociation events).

deustp01 commented 3 years ago

"the only immediate consequence would be that many models imported from Reactome would fail schema validation. That could be a desired outcome, but if it is not, we need to change the schema. (e.g. constraints on the MF shape that 'provides input for' or 'regulates' can only link to another MF would break when the downstream node is typed as a ME)."

This directly relates to the discussion in the pathways2GO manuscript of the usability of incomplete GO-CAM models. From Ben's description, it seems like information is lost if a complete activity unit can't be functionally linked to an incomplete downstream one. If this understanding is correct then it looks like it could be an argument for changing the schema? (But if these "provides-input" and "regulates" links can be preserved even if the model fails validation, then maybe not. I'm out of my depth here.)

goodb commented 3 years ago

@deustp01 yes, information is lost if we take any nodes out of the network. We've discussed inferring connections across the gaps that would be created by taking those nodes out, but this is a bridge I see no need to cross. If those nodes had no meaning or use, they wouldn't be in Reactome.

@ukemi no, Molecular Function is still a specified shape and is still used in the same way in the shape definitions. The proposed change (see https://github.com/geneontology/go-shapes/commit/852c2548a9533b373987770bb09f8e1f0728e017 ) is to add a Molecular Event shape (currently with no constraints apart from being an instance of the ME class) and, wherever MF is used in a constraint, to also allow ME.
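
As a rough illustration of what this means for validation, here is a plain-Python sketch. It is not ShEx and not the actual go-shapes schema: the class placeholders `MF` and `ME`, the `validate` helper, and the toy model are all hypothetical, made up only to show the difference between "target must be MF" and "target must be MF or ME".

```python
# Illustrative sketch only: plain Python, not ShEx and not the go-shapes schema.
# The class names below are placeholders, not real ontology IRIs.
MF = "MolecularFunction"  # stands in for the GO Molecular Function class
ME = "MolecularEvent"     # stands in for the REACTO Molecular Event class

# Relations whose downstream node type is constrained in this sketch.
CAUSAL_RELATIONS = {"provides input for", "regulates"}


def validate(node_types, edges, allowed_targets):
    """Return the causal edges whose target type is not allowed.

    node_types: dict mapping node id -> class placeholder
    edges: list of (source, relation, target) tuples
    allowed_targets: set of class placeholders permitted downstream
    """
    violations = []
    for source, relation, target in edges:
        if relation not in CAUSAL_RELATIONS:
            continue  # only the causal relations are constrained here
        if node_types[target] not in allowed_targets:
            violations.append((source, relation, target))
    return violations


# Hypothetical toy model: an MF feeding into an ME (no known enabler),
# which in turn regulates another MF.
node_types = {"a1": MF, "a2": ME, "a3": MF}
edges = [
    ("a1", "provides input for", "a2"),
    ("a2", "regulates", "a3"),
]

# Before the change: only MF is allowed downstream, so the edge into a2 fails.
before = validate(node_types, edges, {MF})

# After the change: the constraint is a union of MF and ME, so nothing fails.
after = validate(node_types, edges, {MF, ME})

print(before)
print(after)
```

In this toy model the MF-only constraint flags the edge into the Molecular Event node, while the union constraint accepts both edges, which is the behaviour the schema change is meant to give the Reactome-derived models.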

goodb commented 3 years ago

Déjà vu from March 2020 - https://github.com/geneontology/go-shapes/issues/214 - assuming this change is accepted, that issue can be closed too.