geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Translate complexes for SGD pathways #154

Closed: dustine32 closed this issue 1 year ago

dustine32 commented 2 years ago

Currently for non-Reactome pathways, reaction controller entities of type Complex are simply translated as a GO:0032991 "protein-containing complex" individual: https://github.com/geneontology/pathways2GO/blob/97fe498080ddba4339c233ffffe109c15431209c/exchange/src/main/java/org/geneontology/gocam/exchange/BioPaxtoGO.java#L1525-L1526

For SGD pathways, we should be able to pull out the component proteins and attach them to this GO:0032991 individual with has_part edges. Proposed example (from SO4ASSIM-PWY): [image]
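As a rough illustration of the proposed shape (not the actual BioPaxtoGO code), here is a minimal sketch that emits the statements for one complex as plain string triples. The class name `ComplexTranslationSketch`, the `Triple` holder, the `-part-N` individual naming, and the example identifiers are all hypothetical; only GO:0032991 and the has_part relation (BFO:0000051) come from the ontologies themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: models the proposed complex translation as plain triples.
// It does not use the OWL API or any pathways2GO classes.
final class ComplexTranslationSketch {
    static final String PROTEIN_CONTAINING_COMPLEX = "GO:0032991";
    static final String HAS_PART = "BFO:0000051"; // 'has part'

    // Minimal subject/predicate/object holder (hypothetical helper).
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object;
        }
    }

    // Types the complex individual as GO:0032991, then creates one individual
    // per component protein and links it to the complex with a has_part edge.
    static List<Triple> translateComplex(String complexIndividual, List<String> componentIds) {
        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple(complexIndividual, "rdf:type", PROTEIN_CONTAINING_COMPLEX));
        for (int i = 0; i < componentIds.size(); i++) {
            // Assumed naming scheme for component individuals.
            String partIndividual = complexIndividual + "-part-" + i;
            triples.add(new Triple(partIndividual, "rdf:type", componentIds.get(i)));
            triples.add(new Triple(complexIndividual, HAS_PART, partIndividual));
        }
        return triples;
    }
}
```

Calling `translateComplex("ex:complex1", List.of("SGD:S000000001", "SGD:S000000002"))` with those placeholder IDs would yield one typing statement for the complex plus a typing statement and a has_part edge per component.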

An extra step is required to fetch the correct SGD ID for each component protein given its ID in the BioPAX. For example, the BioPAX ID for MET5 is "MONOMER3O-22", which isn't in our SGD lookup file. We'll need to supply another lookup file and, as @suzialeksander discovered, these mappings can be pulled from PANTHER. I will create this "MONOMER##-## -> SGD:ID" lookup file from current PANTHER data.
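For discussion, a minimal sketch of how such a lookup might be loaded and queried, assuming a two-column, tab-separated file (BioPAX monomer ID, then SGD ID) with optional `#` comment lines. The file layout, the `MonomerLookup` class, and its method names are assumptions for illustration, and the SGD ID in the usage note is a placeholder, not the real ID for MET5.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a "MONOMER##-## -> SGD:ID" lookup.
// Assumes each line is "<BioPAX monomer ID>\t<SGD ID>".
final class MonomerLookup {
    private final Map<String, String> monomerToSgd = new HashMap<>();

    // Builds the lookup from already-read lines, e.g. from Files.readAllLines(path).
    static MonomerLookup fromLines(List<String> lines) {
        MonomerLookup lookup = new MonomerLookup();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] cols = line.split("\t");
            if (cols.length < 2) {
                continue; // skip malformed rows rather than failing the whole load
            }
            lookup.monomerToSgd.put(cols[0].trim(), cols[1].trim());
        }
        return lookup;
    }

    // Returns the SGD ID for a BioPAX monomer ID, or empty if it is not mapped.
    Optional<String> sgdIdFor(String biopaxId) {
        return Optional.ofNullable(monomerToSgd.get(biopaxId));
    }
}
```

With a line such as `MONOMER3O-22<TAB>SGD:S000000000` (placeholder ID), `sgdIdFor("MONOMER3O-22")` returns the mapped value, and an unmapped monomer returns an empty `Optional` so the caller can decide whether to fall back to the bare GO:0032991 individual.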

Tagging @thomaspd to ensure these requirements look right.

nataled commented 2 years ago

A suggestion here is to use 'has component' (RO:0002180) instead of 'has part'. This will make the description consistent with the standard used by the Protein Ontology, and will allow for specification of stoichiometry when needed.

dustine32 commented 2 years ago

Good suggestion @nataled! Checking with @thomaspd to see if 'has component' should replace 'has part' here.

One existing blocker is that 'has component' is not encoded in our ShEx spec, which we use to control allowable relations in GO-CAM. We can look at adding it if we need to. Tagging @vanaukenk on this part in case there's already a good reason to exclude it.

dustine32 commented 1 year ago

Closing since the fix has been merged for a while and is running in the latest batch of YeastPathways models.