geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

inference of MF in GO-CAM based on logical definitions from Rhea #20096

Open balhoff opened 3 years ago

balhoff commented 3 years ago

We should evaluate the generated Rhea-based logical definition pattern for catalytic activities, specifically how well it supports automatic inference of class membership for GO-CAM MF nodes. E.g.:

maleate hydratase activity == 
(catalytic activity 
and has input some ((R)-malate(2-) and has stoichiometry value “1”)
and has output some (maleate(2-) and has stoichiometry value “1”)
and has output some (water and has stoichiometry value “1”))
or
(catalytic activity 
and has output some ((R)-malate(2-) and has stoichiometry value “1”)
and has input some (maleate(2-) and has stoichiometry value “1”)
and has input some (water and has stoichiometry value “1”))

For example, the 'has stoichiometry' clause would currently never match for a GO-CAM node.
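
To illustrate the problem, here is a minimal Python sketch. It is not how an OWL reasoner works internally; it only mimics the pattern with plain data structures. The `Participant` class, the `node_matches` function, and the example node are all hypothetical, and the assumption is that a GO-CAM node asserts participants without any stoichiometry value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Participant:
    """A chemical filler; stoichiometry is None when the model does not record it."""
    chemical: str
    stoichiometry: Optional[int] = None


def filler_matches(required, asserted, check_stoichiometry):
    """True if the asserted participant satisfies the required filler."""
    if required.chemical != asserted.chemical:
        return False
    # A required stoichiometry value can only be satisfied by an asserted one.
    return (not check_stoichiometry) or required.stoichiometry == asserted.stoichiometry


def node_matches(definition, node, check_stoichiometry=True):
    """True if the node satisfies at least one disjunct of the definition."""
    return any(
        all(
            any(filler_matches(req, got, check_stoichiometry) for got in node[role])
            for role in ("has_input", "has_output")
            for req in disjunct[role]
        )
        for disjunct in definition
    )


malate = Participant("(R)-malate(2-)", 1)
maleate = Participant("maleate(2-)", 1)
water = Participant("water", 1)

# Both directions of the reaction, mirroring the two disjuncts above.
maleate_hydratase = [
    {"has_input": [malate], "has_output": [maleate, water]},
    {"has_input": [maleate, water], "has_output": [malate]},
]

# Hypothetical GO-CAM MF node: participants asserted, no stoichiometry.
node = {
    "has_input": [Participant("maleate(2-)"), Participant("water")],
    "has_output": [Participant("(R)-malate(2-)")],
}

print(node_matches(maleate_hydratase, node))                             # False
print(node_matches(maleate_hydratase, node, check_stoichiometry=False))  # True
```

In this toy model the node is only classified once the stoichiometry check is dropped, which is the behaviour we would need to evaluate against the real reasoner.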

ukemi commented 3 years ago

If you ignore stoichiometry and consider only participants, how many equivalent classes are inferred when you reason across all of the direction-agnostic Rhea reactions? Any?
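
A rough way to estimate this outside a reasoner, sketched in Python under the assumption that each reaction can be reduced to two sets of participant names. The reaction IDs and the `signature` and `equivalent_groups` helpers are hypothetical and only for illustration; a real check would run an OWL reasoner over the generated definitions.

```python
from collections import defaultdict


def signature(left, right):
    """Direction-agnostic, stoichiometry-free key for a reaction."""
    return frozenset({frozenset(left), frozenset(right)})


def equivalent_groups(reactions):
    """Group reaction IDs whose participant signatures coincide."""
    groups = defaultdict(list)
    for rxn_id, (left, right) in reactions.items():
        groups[signature(left, right)].append(rxn_id)
    # Only groups with more than one member indicate an inferred equivalence.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Hypothetical IDs and participants, for illustration only.
reactions = {
    "RXN:1": (["(R)-malate(2-)"], ["maleate(2-)", "water"]),
    "RXN:2": (["water", "maleate(2-)"], ["(R)-malate(2-)"]),  # reverse of RXN:1
    "RXN:3": (["A"], ["B", "water"]),
}

print(equivalent_groups(reactions))  # [['RXN:1', 'RXN:2']]
```

Any reactions that differ only in stoichiometry would also collapse into one group here, so the count from such a script would be an upper bound on what the full definitions imply.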

amorgat commented 3 years ago

Hi Jim

We wonder whether you have considered generating entirely new GO terms automatically from Rhea annotations in UniProt (and their logical definitions).

When we started using Rhea in UniProt we had about 5300 reactions - it's up to 8000 now and growing fast.

Many of the reactions we add have no GO terms associated at all but eventually these should appear.

To speed that up, why not map Rhea to GO by using Rhea to generate the GO terms (as well as their logical definitions)?

You could mine UniProt each release for new reactions - we can help you there if that is of interest.

All the best, @alanbridge and Anne

cmungall commented 3 years ago

I think in practice a GO-CAM curator would never post-compose a reaction.

@amorgat: yes, this is definitely part of the plan. We want to automate this. The parts where we would need a GO curator are:

  1. giving a name
  2. manual classification

IMHO for 1, while users would prefer an EC-style name, if we give a name that is the Rhea reaction string "A + B = C + D" they won't actually care. It's more important that we get the annotations.
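
A minimal sketch of the naming idea in Python, assuming each side of a reaction is available as a list of `(coefficient, name)` pairs. The `side_label` and `reaction_label` helpers are hypothetical and not part of any existing pipeline.

```python
def side_label(side):
    """Render one side of a reaction; coefficients of 1 are omitted."""
    return " + ".join(
        name if coeff == 1 else f"{coeff} {name}" for coeff, name in side
    )


def reaction_label(left, right):
    """Build an 'A + B = C + D' style label from the two sides."""
    return f"{side_label(left)} = {side_label(right)}"


print(reaction_label([(1, "(R)-malate(2-)")], [(1, "maleate(2-)"), (1, "water")]))
# (R)-malate(2-) = maleate(2-) + water
```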

For 2, for some proportion, Rhea will provide a parent, and if we have equivalenced the parent it just slots in. For others:

This should probably go in its own ticket

pgaudet commented 3 years ago

@balhoff Should this be assigned to you?