geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Reactome: EWASs with wrong modified residues #220

Open deustp01 opened 1 year ago

deustp01 commented 1 year ago

Darren Natale has generated a list of impossible(?) modified residue instances, e.g., phosphorylation of a leucine side chain. Examine and resolve so PRO IDs can be assigned to Reactome EWASs and this component of the REACTO ontology can be retired.
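For readers unfamiliar with this kind of check, here is a minimal sketch of how an "impossible" modified residue could be flagged. It is not Darren's actual script. The `ALLOWED_RESIDUES` table, the function name, and the toy sequence are all hypothetical, and the residue rules are deliberately incomplete.

```python
# Hypothetical, incomplete rule table: which one-letter residues a given
# modification type may plausibly occupy. Illustrative only, not taken
# from PSI-MOD, PRO, or Reactome.
ALLOWED_RESIDUES = {
    "phosphorylation": {"S", "T", "Y", "H", "D"},
    "lysine acetylation": {"K"},
}


def check_modified_residue(sequence, position, modification):
    """Return None if the assertion is plausible, else a message describing the problem.

    `position` is 1-based, as in UniProt feature coordinates.
    """
    if not 1 <= position <= len(sequence):
        return f"position {position} is outside the sequence (length {len(sequence)})"
    allowed = ALLOWED_RESIDUES.get(modification)
    if allowed is None:
        return f"no residue rule for modification '{modification}'"
    residue = sequence[position - 1]
    if residue not in allowed:
        return f"{modification} asserted at {residue}{position}, which is not an allowed residue"
    return None


# Toy sequence (made up): residue 2 is serine, residue 3 is leucine.
toy_sequence = "MSLTY"
print(check_modified_residue(toy_sequence, 2, "phosphorylation"))  # plausible
print(check_modified_residue(toy_sequence, 3, "phosphorylation"))  # flagged: leucine
```

A flagged instance can mean either a curation mistake or a reference sequence that has changed since the instance was created, so each hit still needs manual review.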

deustp01 commented 1 year ago

Notes are in this Google Doc

deustp01 commented 1 year ago

Progress: most issues resolved, tracked in the Google Doc linked in the Dec 11, 2022 comment.

Meanwhile, an extension of this issue: the drift in UniProt and the Reactome curation mistakes that gave rise to the mismatched post-translationally modified residues addressed here should also give rise to mismatched genetically modified residue instances. These are the instances created to support annotation of reactions in which a protein functions abnormally because its sequence differs from the canonical UniProt one due to a germline or somatic mutation. @nataled would it be possible to apply the same strategy that detected the impossible chemical modifications to identify the cases where we've gone wrong, e.g., where the assertion of Ala123Pro ABC1 is impossible because, according to UniProt, ABC1 residue 123 is not alanine?

This is mostly outside the current scope of pathways2GO because we're only generating GO-CAM models that require PRO IDs for normal reactions, but it's an obvious future extension and it would be good to get this part of Reactome aligned with PRO now to support it.
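To make the Ala123Pro question concrete, a rough sketch of the comparison might look like the following. It assumes variants are written as three-letter code, 1-based position, three-letter code, and that the reference sequence has already been retrieved as a plain string. The function name, regular expression, and toy sequence are hypothetical, not part of any existing Reactome or PRO pipeline.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumed variant format, e.g. "Ala123Pro" (hypothetical convention).
VARIANT_RE = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def check_variant(variant, sequence):
    """Return None if the variant's original residue matches the reference
    sequence at that position, else a message describing the mismatch."""
    match = VARIANT_RE.match(variant)
    if match is None:
        return f"{variant}: not in the expected Xxx123Yyy form"
    ref3, pos_text, alt3 = match.groups()
    if ref3 not in THREE_TO_ONE or alt3 not in THREE_TO_ONE:
        return f"{variant}: unrecognized amino acid code"
    position = int(pos_text)
    if not 1 <= position <= len(sequence):
        return f"{variant}: position {position} is outside the sequence"
    actual = sequence[position - 1]
    if actual != THREE_TO_ONE[ref3]:
        return f"{variant}: reference sequence has {actual}{position}, not {ref3}"
    return None


# Toy reference sequence (made up): residue 2 is lysine, residue 3 is alanine.
toy_sequence = "MKA"
print(check_variant("Ala3Pro", toy_sequence))  # consistent with the reference
print(check_variant("Ala2Pro", toy_sequence))  # flagged: residue 2 is K, not Ala
```

As with the chemical modifications, a mismatch only says the assertion and the current reference disagree. Whether the cause is sequence drift or a curation error has to be decided case by case.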

nataled commented 1 year ago

This is already in the pipeline. Though not needed for Pathways2GO, I still process the sequence variants because doing so comes at no extra cost: my scripts don't know whether the MOD identifiers indicated are for PTMs or variants.