geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Terms with RHEA in definition xref not in general xref #28937

Open pgaudet opened 1 month ago

pgaudet commented 1 month ago

| GOID | Label & def | RHEA |
| -- | -- | -- |
| GO:0003835 | beta-galactoside alpha-2,6-sialyltransferase activity Catalysis of the reaction: CMP-N-acetylneuraminate + beta-D-galactoside = N-acetyl-alpha-neuraminyl-(2->6)-beta-D-galactosyl derivative + CMP + H+. [EC:2.4.3.1, | RHEA:11836 |
| GO:0004035 | alkaline phosphatase activity Catalysis of the reaction: a phosphate monoester + H2O = an alcohol + phosphate, with an alkaline pH optimum. [EC:3.1.3.1, | RHEA:15017 |
| GO:0008988 | rRNA (adenine-N6-)-methyltransferase activity Catalysis of the reaction: S-adenosyl-L-methionine + rRNA = S-adenosyl-L-homocysteine + rRNA containing N6-methyladenine. [ | RHEA:58728 |
| GO:0051269 | alpha-ketoester reductase (NADPH) activity Catalysis of the reaction: alpha-ketoester + H+ + NADPH = (R)-hydroxy ester + NADP+. [PMID:15564669, | RHEA:80767 |
| GO:0052654 | L-leucine-2-oxoglutarate transaminase activity Catalysis of the reaction: 2-oxoglutarate + L-leucine = 4-methyl-2-oxopentanoate + L-glutamatic acid. [ | RHEA:18321 |
| GO:0141200 | UTP thiamine diphosphokinase activity Catalysis of the reaction: UTP + thiamine = UMP + thiamine diphosphate. [PMID:38547260, | RHEA:79423 |
| GO:0141207 | peptide lactyltransferase (ATP-dependent) activity Catalysis of the reaction: lactate + ATP + L-lysyl-[protein] = N(6)-lactoyl-L-lysyl-[protein]+ AMP + diphosphate. Can also act on free lactate. [PMID:38512451, PMID:38653238, | RHEA:80271 |
| GO:0141208 | protein lysine delactylase activity Catalysis of the reaction: H2O + N6-lactoyl-L-lysyl-[protein] + NAD = L-lysyl-[protein] + nicotinamide +2''-O-lactoyl-ADP-D-ribose, removing a lactoyl group attached to a lysine residue in a protein. [PMID:38512451, | RHEA:80287 |
| GO:0141215 | N-acetyltaurine hydrolase activity Catalysis of the reaction: N-acetyltaurine + H2O = acetate + taurine. [PMID:39112712, | RHEA:81107 |
| GO:0160090 | internal mRNA (guanine-N7-)-methyltransferase activity Catalysis of the reaction: a guanosine in mRNA + S-adenosyl-L-methionine = an N(7)-methylguanosine in mRNA + S-adenosyl-L-homocysteine. [PMID:31031084, PMID:37379838, | RHEA:60508 |

pgaudet commented 1 month ago

Hi @raymond91125, can you take this ticket?

These terms have a RHEA ID in the definition xref that is not in the general cross-references. The task is to check that each mapping looks right and, if appropriate, add it to the general xrefs as skos:exactMatch.

Thanks, Pascale

raymond91125 commented 1 month ago

Sure. But the table has a few 'false positives' if I understand what you asked for correctly. @pgaudet

Below are action items:

raymond91125 commented 3 weeks ago

I tried using a Python script (oaklib) to exhaustively search for RHEA IDs that appear in a definition xref but not in the general xrefs. The script is incomplete, but it did whittle the list down to 22 candidates. I checked them manually and found 3 valid cases that need attention.
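For readers who want to reproduce this kind of search, here is a minimal sketch. It is not the oaklib script mentioned above. It uses only the standard library and assumes the ontology is available as OBO text in which definition xrefs sit in the trailing `[...]` of the `def:` line and general xrefs are `xref:` lines. The function name `rhea_gaps`, the `EX:` term IDs, and the RHEA numbers in the sample are made up for illustration.

```python
import re

RHEA_RE = re.compile(r"RHEA:\d+")
# The quoted definition text is skipped; the trailing [...] holds the definition xrefs.
DEF_RE = re.compile(r'^def:\s*"(?:[^"\\]|\\.)*"\s*\[(.*)\]')


def rhea_gaps(obo_text):
    """Map term ID -> RHEA IDs cited in the def xrefs but absent from xref: lines."""
    gaps = {}
    for stanza in re.split(r"\n\s*\n", obo_text):
        lines = [ln.strip() for ln in stanza.strip().splitlines()]
        if not lines or lines[0] != "[Term]":
            continue
        term_id = None
        def_rhea, xref_rhea = set(), set()
        for ln in lines[1:]:
            if ln.startswith("id:"):
                term_id = ln[3:].strip()
            elif ln.startswith("def:"):
                m = DEF_RE.match(ln)
                if m:
                    def_rhea.update(RHEA_RE.findall(m.group(1)))
            elif ln.startswith("xref:"):
                xref_rhea.update(RHEA_RE.findall(ln))
        missing = def_rhea - xref_rhea
        if term_id and missing:
            gaps[term_id] = sorted(missing)
    return gaps


# Hypothetical stanzas: the IDs and RHEA numbers are placeholders, not real GO data.
SAMPLE = '''[Term]
id: EX:0000001
name: example hydrolase activity
def: "Catalysis of the reaction: A + H2O = B + C." [PMID:00000000, RHEA:11111]

[Term]
id: EX:0000002
name: example transferase activity
def: "Catalysis of the reaction: D + L-lysyl-[protein] = E." [RHEA:22222]
xref: RHEA:22222 {source="skos:exactMatch"}
'''

print(rhea_gaps(SAMPLE))
```

A text scan like this only produces candidates. Each hit still needs the manual check described in this thread before a mapping is added.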

pgaudet commented 3 weeks ago

Thanks for this, @raymond91125. For GO:0090555, did you see the comment in the parent term, flippase activity?

> Nomenclature note. Flippases and floppases are ATP-dependent transbilayer lipid translocators. According to an extensively used, though not universal, nomenclature, they catalyze lipid transfer towards the inward monolayer (flippases) or towards the outward monolayer (floppases). Scramblases are ATP-independent, non-selective, inducing non-specific transbilayer movements across the membrane. The direction of the translocation should be taken into account for annotation (from the cytosolic to the exoplasmic leaflet of a membrane).

So: since RHEA does not assign which side is in or out, these mappings cannot be exact with respect to the directionality of flippases and floppases. Therefore, I suggest we add RHEA:66132 as a broad match to ATPase-coupled intramembrane lipid transporter activity and all its children.

Does that sound OK to you?

Thanks, Pascale

pgaudet commented 3 weeks ago

@raymond91125

You write:

> I tried using a Python script (oaklib) to exhaustively search for RHEA IDs that appear in a definition xref but not in the general xrefs. The script is incomplete, but it did whittle the list down to 22 candidates. I checked them manually and found 3 valid cases that need attention.

What about the other 19?

raymond91125 commented 3 weeks ago

The other 19 are false positives that my incomplete script could not eliminate, so I had to rule them out manually. The reasons for the incompleteness are silly and technical:

  1. I used the GO 2024-09-08 release. Some terms have already been fixed since then.
  2. I couldn't figure out how to get all the general xrefs with oaklib.
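
The first limitation can be handled as a post-filter: take the candidates found in the older release and drop any RHEA ID that the current file already carries as a general xref. The sketch below assumes both inputs are plain dictionaries keyed by term ID. The function name `still_missing` and all IDs in the example are hypothetical.

```python
def still_missing(candidates, current_xrefs):
    """Drop candidate RHEA IDs that the current release already lists as general xrefs."""
    remaining = {}
    for term_id, rhea_ids in candidates.items():
        # Terms absent from current_xrefs are treated as having no general xrefs.
        left = sorted(set(rhea_ids) - set(current_xrefs.get(term_id, ())))
        if left:
            remaining[term_id] = left
    return remaining


# Placeholder data: candidates from an older release vs. xrefs in the current file.
old_candidates = {
    "EX:0000001": ["RHEA:11111"],
    "EX:0000002": ["RHEA:22222", "RHEA:33333"],
}
current = {
    "EX:0000001": ["EC:0.0.0.0", "RHEA:11111"],  # fixed since the older release
    "EX:0000002": ["RHEA:22222"],                # one mapping still absent
}

print(still_missing(old_candidates, current))
```

With the placeholder data, only the mapping that is still absent from the current file survives the filter.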

raymond91125 commented 3 weeks ago

> So: since RHEA does not assign which side is in or out, these mappings cannot be exact with respect to the directionality of flippases and floppases. Therefore, I suggest we add RHEA:66132 as a broad match to ATPase-coupled intramembrane lipid transporter activity and all its children.

RHEA:66132 is specifically about one type of lipid (phosphoethanolamine), so I think it is not general enough. According to PMID:20043909, flippases are P-type ATPases (I think therefore EC:7.6.2.1 xref GO:0140326), whereas floppases are ABC-type ATPases (possibly EC:7.6.2.-).

Yesterday I missed the important difference between RHEA:66132 and RHEA:36439: the side where ATP hydrolysis takes place. Therefore, I think RHEA:36440 was incorrectly applied to GO:0090555 phosphatidylethanolamine flippase activity. This is confirmed by UniProt (https://www.uniprot.org/uniprotkb/P39524/entry), which annotates DRS2, the gene studied in the definition reference of GO:0090555, with RHEA:66132.