geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Representation questions (more to add later; didn't want to lose the notes) #228

Open nataled opened 1 year ago

nataled commented 1 year ago

A few cases have raised questions as to what the representation means:

R-HSA-5682069 "4xPALM-C-p-2S-ABCA1 [plasma membrane]"

Of interest are the phosphorylated serines. Most cases represent phosphoserine using simply MOD:00046 "O-phospho-L-serine", but this one uses that MOD plus CHEBI:35780 "phosphate ion". To me this looks like a long-hand way of saying the same thing as the MOD by itself. Please confirm, or let me know if it means something different.

Relevant links:
https://www.reactome.org/content/detail/R-HSA-5682069
https://reactome.org/content/schema/instance/browser/5682069
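If the long-hand reading is confirmed, the converter could collapse the two-term form to the MOD term alone. A minimal sketch of that idea, assuming the modification arrives as a collection of term IDs; `normalize_modification` is a hypothetical helper, not an existing pathways2GO function, and the equivalence it encodes is the open question above, not a settled rule:

```python
# Term IDs quoted in this issue.
PHOSPHOSERINE = "MOD:00046"    # O-phospho-L-serine
PHOSPHATE_ION = "CHEBI:35780"  # phosphate ion


def normalize_modification(terms):
    """Collapse the long-hand phosphoserine form to the MOD term alone.

    ASSUMPTION (pending confirmation in this issue): MOD:00046 plus
    CHEBI:35780 means the same thing as MOD:00046 by itself.
    Any other combination of terms is returned unchanged.
    """
    terms = frozenset(terms)
    if terms == {PHOSPHOSERINE, PHOSPHATE_ION}:
        return frozenset({PHOSPHOSERINE})
    return terms
```

Keeping the rule in one function would make it easy to drop if the two forms turn out to mean different things.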

R-HSA-9641110 "Ub-Misfolded CETN1 [cytosol]" is represented as MOD:01148 + R-HSA-450143 "K63-Ub [cytosol]".

R-HSA-9641126 "PolyUb-Misfolded CETN1 [cytosol]" is represented as MOD:01148 + R-HSA-450152 "K63polyUb [cytosol]".

On the surface (based on names), it looks like one could represent a single ubiquitin while the other represents polyubiquitin. There are two arguments against that interpretation:

(1) Names are quite inconsistent here. EWASes invoking R-HSA-450152 can be prefixed as K63polyUb-, K63Ub-, Ub- (the same prefix as all the R-HSA-450143 cases), PolyUb-, and even just p-ub-.

(2) One would only ever need to specify K63 if there is at least one other ubiquitin chain connected to the first (at that lysine). Thus, basically by definition, these should all be polyubiquitin.

If the intention is that the R-HSA-450143 cases should be monoubiquitinated, R-HSA-3200014 would be more appropriate. If the intention is K63-polyubiquitinated, then R-HSA-450152 is more appropriate. Finally, if no judgment is made regarding the number of chains, R-HSA-113595 would be the appropriate choice. Please advise as to the correct interpretation.
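To make the proposal concrete, the three suggested choices can be written as a lookup keyed on the referenced Reactome entity, with R-HSA-450143 left flagged until a decision is made. This is only a sketch of the readings proposed in this issue; the names `UBIQUITIN_READING`, `AMBIGUOUS`, and `ubiquitin_reading` are hypothetical, and none of the readings has been confirmed by Reactome:

```python
# Readings as proposed in this issue (not confirmed).
UBIQUITIN_READING = {
    "R-HSA-3200014": "monoubiquitinated",
    "R-HSA-450152": "K63-polyubiquitinated",
    "R-HSA-113595": "ubiquitinated, number of chains unspecified",
}

# "K63-Ub [cytosol]": mono vs. poly is the open question here.
AMBIGUOUS = {"R-HSA-450143"}


def ubiquitin_reading(reference_id: str) -> str:
    """Return the proposed reading for a ubiquitin reference entity."""
    if reference_id in AMBIGUOUS:
        return "ambiguous: needs curator decision"
    return UBIQUITIN_READING.get(reference_id, "unknown reference")
```

Keying on the referenced entity rather than the EWAS name prefix sidesteps the naming inconsistency described in point (1).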