geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Reactome: are these modifications redundant? #151

Closed nataled closed 2 years ago

nataled commented 2 years ago

There are several ways I've seen GPI-anchored proteins (GPI-APs) represented in Reactome. I'm still trying to figure out a few of them, but the following cases stood out to me, and might have a simple fix that could represent one step toward standardization.

The GPI anchor for these is represented by MOD:00818 "glycosylphosphatidylinositolated residue":

R-HSA-6808768
R-HSA-6807756

The GPI anchor for these is represented by both MOD:00818 "glycosylphosphatidylinositolated residue" and CHEBI:24410 "glycosylphosphatidylinositol":

R-HSA-5362430
R-HSA-6808772
R-HSA-5689453
R-HSA-5689808
R-HSA-5689804
R-HSA-6808775
R-HSA-204088
R-HSA-6808777
R-HSA-5689457
R-HSA-204085
R-HSA-6808779
R-HSA-5689801

To my mind, the second set seems to have a somewhat redundant specification. That being said, all the other GPI variants have both a MOD and CHEBI (in general I understand why this is done, I think). I don't know which is preferred (talking about the above only), but they clearly should all be represented the same way.
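To make the inconsistency concrete, here is a minimal sketch that groups the entities listed above by the set of terms used for the GPI anchor. The `annotations` mapping is hand-transcribed from the two lists in this comment, and `group_by_representation` is a hypothetical helper, not part of pathways2GO or any Reactome API.

```python
from collections import defaultdict

MOD_GPI = "MOD:00818"      # glycosylphosphatidylinositolated residue
CHEBI_GPI = "CHEBI:24410"  # glycosylphosphatidylinositol

# Hand-transcribed from the two lists in this comment (not fetched from Reactome).
mod_only = ["R-HSA-6808768", "R-HSA-6807756"]
mod_plus_chebi = [
    "R-HSA-5362430", "R-HSA-6808772", "R-HSA-5689453", "R-HSA-5689808",
    "R-HSA-5689804", "R-HSA-6808775", "R-HSA-204088", "R-HSA-6808777",
    "R-HSA-5689457", "R-HSA-204085", "R-HSA-6808779", "R-HSA-5689801",
]

annotations = {entity: {MOD_GPI} for entity in mod_only}
annotations.update({entity: {MOD_GPI, CHEBI_GPI} for entity in mod_plus_chebi})


def group_by_representation(annotations):
    """Map each distinct set of terms to the entities annotated with it."""
    groups = defaultdict(list)
    for entity_id, terms in annotations.items():
        groups[frozenset(terms)].append(entity_id)
    return dict(groups)


groups = group_by_representation(annotations)
# More than one group means the GPI anchor is not represented uniformly.
is_uniform = len(groups) == 1

for terms, entities in groups.items():
    print(sorted(terms), len(entities))
```

With the lists above this yields two groups, which is exactly the mismatch I'm describing.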

deustp01 commented 2 years ago

This is a long story. The short answer is yes, but we don't see how to avoid the redundancy / duplication. If you see a way, including an argument that our quest for "complete" information isn't really useful here, that is something to discuss!

The history, as best I remember it: when we originally set out to annotate the process by which proteins synthesized on membrane-bound ribosomes are glycosylated as they move through the ER and Golgi apparatus on their way to localization in the plasma membrane or secretion into extracellular space, we discovered that there were no psiMOD terms for the many diverse oligosaccharide entities needed to create modified residue instances for the proteins involved. Creating all of them wasn't really in scope for psiMOD, especially given limited resources.

We worked out a compromise: psiMOD had, or was able to create, terms for all the monosaccharide groups directly attached to an amino acid side chain, and ChEBI was willing to create chemical group instances for each of the complete oligosaccharide groups we needed, including all monosaccharide residues and correct branching structures.

For the first step in glycosylation, we just use the psiMOD instance to create an EWAS with a conventional modifiedResidue annotation. For the second and later steps, we create a groupModifiedResidue instance whose name describes the entire oligosaccharide but which has two attributes: the psiMOD term for the first saccharide directly attached to the protein, and the ChEBI term for the whole oligosaccharide including that first residue. The first residue therefore gets double-counted.
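As a rough illustration of the two annotation shapes described above, here is a minimal sketch. The class and field names are assumptions made for this example and do not reproduce the actual Reactome schema; the two identifiers are the GPI terms quoted earlier in this issue, used only as sample values.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ModifiedResidue:
    """First-step modification: a single psiMOD term (hypothetical shape)."""
    psimod_term: str


@dataclass(frozen=True)
class GroupModifiedResidue:
    """Later-step modification: psiMOD term for the attached residue plus
    a ChEBI term for the whole group (hypothetical shape)."""
    name: str
    psimod_term: str
    chebi_term: str


def external_terms(mod: Union[ModifiedResidue, GroupModifiedResidue]) -> List[str]:
    """List every external ontology term an annotation refers to."""
    if isinstance(mod, GroupModifiedResidue):
        # The residue named by psimod_term is also part of the ChEBI group,
        # so it is described twice: the double-counting mentioned above.
        return [mod.psimod_term, mod.chebi_term]
    return [mod.psimod_term]


simple = ModifiedResidue(psimod_term="MOD:00818")
group = GroupModifiedResidue(
    name="glycosylphosphatidylinositol group",  # illustrative name only
    psimod_term="MOD:00818",
    chebi_term="CHEBI:24410",
)

print(external_terms(simple))
print(external_terms(group))
```

The point of the sketch is only that the group form always carries two references where the simple form carries one, which is where the apparent redundancy comes from.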

It's a hack, but we didn't see a workaround that both preserves psiMOD as the definitive reference resource for covalent modifications of amino acid side chains in proteins and lets us distinguish among glyco-modifications that are beyond the scope of psiMOD.

Here is an example - GalNAc-MUC1(24-1255) [Golgi lumen] is glycosylated further to yield T-antigen (MUC1) [Golgi lumen]. Note that the modifiedResidue attribute of the first EWAS has become a groupModifiedResidue attribute in the second. Further, if you drill into the groupModifiedResidue instance, you see that it is composed of a psiMOD instance plus a ChEBI instance. (The URLs point to the instance browser view of our internal website and are useful for seeing the full frame-and-slot view of instances.)

nataled commented 2 years ago

I think the suggestions given in #153 will solve this issue. In short, the solution will likely be 'none of the above' (with respect to which version of these MOD:00818 cases to align to). Re-open if you like, but I think this can be closed since final action will depend on the discussion there, and the fix will be implemented with that ticket.

deustp01 commented 2 years ago

Agreed - sort out #153 and that will resolve the issues here.