Closed by bmeldal 7 months ago
Initial thoughts from breakout group:
GO-CAM models of different types of complexes and associated functions of their respective members are here:
http://noctua.berkeleybop.org/editor/graph/gomodel:59c8885900000281
These are proposed models for different types of use cases, but we will likely want to model more before coming to any decisions.
Note that the rules for GPAD annotation outputs have not been formulated yet, so what is currently in the annotation preview is not final.
related to https://github.com/geneontology/go-annotation/issues/1661
Collecting use cases here: https://docs.google.com/document/d/1ZtAcjIyIQ_ycbuMHyvLA-KIJQtGenh82lxS-MKC6a_A/edit#
After discussing this for a little while yesterday (30/11/17), we decided it would be good to get user input on what they need/want.
I've tried to summarise the situation regarding the way we extract annotations from complexes and from individual GPs:
Along the black lines, the top relationships are those used currently, the ones added in red pen are the ones we are discussing.
Points:
Please review the picture and post your MF class related comments here.
Please post CC class related comments in #1639
https://github.com/geneontology/go-ontology/issues/14847 an example of where contributes_to was discussed as a possible relationship
Call from 13/2:
We need to have examples of each type of complex:
1: 1 catalytic subunit + other subunits --> Val wouldn't annotate to non-catalytic subunits
2: complex that requires all members to have catalytic activity --> "contributes_to", but this was tricky in its usage. https://github.com/geneontology/go-annotation/issues/1650
@ValWood @RLovering @ukemi @sylvainpoux @vanaukenk You've all had comments on the subject, please add them if not already captured.
Collecting use cases here: https://docs.google.com/document/d/1ZtAcjIyIQ_ycbuMHyvLA-KIJQtGenh82lxS-MKC6a_A/edit# (that's the link Kimberly shared at the call yesterday)
Draft survey: https://docs.google.com/document/d/1P_VLM9g13kj9lu3CRAgotAI3cUmWkS1yWaVg95u2Vbk/edit?usp=sharing
Thanks.
Minutes from WG call on 22/2/18:
Present: @judyblake , Li (sorry, don't have GH ID), @hdrabkin , @sandraorchard , @ValWood , @NancyCampbell , @tberardini , @RLovering , @deustp01 , @vanaukenk , @nataled , @ggeorghiou , Pascale, @edwong57 , @bmeldal (I hope I haven't forgotten anyone! - in no order!)
Slides: 2018_02_22_inferring_GOannotations_from_complex_to_GPs.pptx [link updated on 27/2/18, had the wrong file!]
Complex Names:
Harold: Complex names can be very long and cumbersome, which makes them hard to search for. Can we find short forms?
Sandra: There are short labels in the DB but we don't display them.
Birgit: If a shorter form exists, it's in the synonyms, which are in the search index. You can also search with gene symbols (which make up the systematic name).
Use of contributes_to: see also https://github.com/geneontology/go-annotation/issues/1650 (ticket for the contributes_to guidelines proposal). We spent most of the call on this!
Regulatory subunits: any GP (protein or otherwise) that has not been identified as carrying out the enzymatic activity of the complex but is consistently found as a complex member. They may or may not be essential (we don't distinguish essential subunits in the CP, as most experiments don't go into that detail consistently).
only if >1 protein is required for the function. If function shown experimentally for the protein in isolation --> direct annotation to MF
Harold: Maybe need regulates for the regulatory subunits as a new term?
only where catalytic subunit not identified. Then add qualifier to all proteins of the complex.
Nancy's example of Telomerase: direct annotation to TERT enzyme subunit and contributes_to for telomeric RNA component
Ruth: What about homodimers? Annotate directly. The discussion highlighted the issue that we can never know whether the function is carried out by the monomer or the homodimer (or even a homomultimer) if the protein self-assembles in solution. AI: Birgit to add PDGF examples.
Summary: Different groups use slightly different guidelines (and it may even vary within groups) either annotating all regulatory subunits of a complex with contributes_to or only in cases where the catalytic subunit has not been identified. Solution: Draw up new annotation guidelines (https://github.com/geneontology/go-annotation/issues/1650) and revisit all annotations. Birgit: to provide a list of GPs that have NOT enzyme as biological role in complexes in CP as a guide (list won't be comprehensive as DB has not got full coverage yet!).
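The two conventions summarised above can be contrasted in a minimal Python sketch. The data model (`Member`, the role strings, the function names and the member identifiers) is hypothetical and only illustrates the difference between the conventions; it does not describe any group's actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    gp_id: str  # gene product identifier (placeholder values in the usage note)
    role: str   # biological role, e.g. "enzyme" or "unspecified role"


def convention_a(members):
    """Enzyme subunits get a direct MF annotation;
    every other subunit gets the contributes_to qualifier."""
    return {
        m.gp_id: ("direct" if m.role == "enzyme" else "contributes_to")
        for m in members
    }


def convention_b(members):
    """contributes_to is used only when no catalytic subunit has been
    identified; in that case every member of the complex gets it.
    Otherwise only the enzyme subunits are annotated (directly)."""
    if any(m.role == "enzyme" for m in members):
        return {m.gp_id: "direct" for m in members if m.role == "enzyme"}
    return {m.gp_id: "contributes_to" for m in members}
```

With Nancy's telomerase example, `convention_a` would give TERT a direct annotation and the telomeric RNA component contributes_to, whereas `convention_b` would annotate only TERT.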
Rationale for WHY I want to do this:
Summary from GOC meeting in Cambridge (Oct 2017): (AE suggestion was Kimberly's)
Val: Annotations on Gene Pages link MFs to complexes using occurs_in.
Birgit: Are the MF and CC annotations connected? If not, we have a list of functions and a list of complexes but no link. Can we have an example (screenshot), please, @ValWood?
Ruth (initial gut feeling): export for catalytic subunits but not regulatory subunits.
Pascale/Kimberly: make the distinction between catalytic and regulatory subunits.
Ruth: is there a clear line defining what is a catalytic subunit?
Birgit: Who would use these annotations??? What do they really need???
Judy: may know some power users that may be able to make use of these complex annotations.
NO SOLUTION YET!!! Options:
Going forward:
@judyblake (@hdrabkin /@deustp01 ) to pass on details of users to Birgit, Birgit to get some feedback before next call (8/3/18).
Birgit
(I've probably forgotten something or someone so please add your comments. I'll be updating the contributes_to guideline ticket later.)
Ok, forgot one thing:
"X binding":
So far we have discussed what to do with catalytic activity but we also have MF annotations to "binding". We don't use "protein/complex" binding, that sort of data is captured by IntAct and exported from there, but any other type of "binding", e.g.:
Caveat: We don't know which subunit binds the target, unfortunately, we haven't captured that yet (but I just got an idea how I could do it so I could go back and add it in if we want it!). [Note to self: either by using the reference column with pipe or adding with/from as new field to our editor - which would be helpful anyway for creating our files.]
"Homework" for everyone:
Think about binding terms that the user might want/need. @RLovering can you think about this in the context of GREEKC, please?
Not discussed on the call:
but I'll run the issues by the users as well when I speak to them.
Example for homodimers: https://www.ebi.ac.uk/complexportal/complex/EBI-2881436
Platelet-derived growth factor AA complex
Homodimer of P04085 PDGF subunit A
Only exists as a dimer and functions as a ligand for PDGF receptors.
I haven't looked for the experiments for the activity, as the complex evidence came from a crystal, but from memory PDGF ligands are well described as obligate dimers.
PDGF ligands come in 5 flavours: AA, AB, BB, CC, DD.
And, the receptors (alpha-alpha, beta-beta or alpha-beta) don't dimerise until the ligand complex binds, forming an obligate heterotetrameric receptor-ligand complex!
So, the activities rely both on the dimeric ligand and the tetrameric receptor-ligand.
Food for thought how you would annotate that!
Added to google doc as well.
Google sheet for list of GPs as participants of catalytic complexes and their biol role annotations in the CP: https://docs.google.com/spreadsheets/d/1-9PdAJ8BvrjhPWLx5pB9N0rS6hDNgYifO4_3CmbplEo/edit?usp=sharing
Summary from call on 15/3/18:
Present: Lauren-Philip, Pascale, Peter, Edith, Kimberly, Harold, Val, Tanya, Ruth, Sandra, Birgit
Lauren-Philip has a protein-protein interaction background and introduced his interest in complexes as drug targets.
We looked through the Google sheet that contains all GPs from complexes annotated to "catalytic activity" and split by their biological roles: enzyme, enzyme regulator and unspecified role.
Points to consider:
Decision 1: Infer MF only to GPs annotated with biological role=enzyme!
Thought: can we capture the "regulator activity" by walking through the ontology?
E.g. CPX-1001(EBI-13638510) Calcineurin-Calmodulin complex, gamma-R1 variant has GO:0033192 calmodulin-dependent protein phosphatase activity. Its parent GO:0004723 calcium-dependent protein serine/threonine phosphatase activity has a FUNCTION regulates child GO:0008597 calcium-dependent protein serine/threonine phosphatase regulator activity which would be applicable for the complex's 2 regulatory subunits, Calmodulin and Calcineurin.
Decision 2: Try and infer MF to GPs annotated with biological role=enzyme regulator to the class of regulator activity
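Decisions 1 and 2 could be sketched roughly as follows. This is only an illustration: the two lookup tables are a hand-written stand-in for the ontology and hold just the three terms from the calcineurin example, and the function names and member names are hypothetical. A real implementation would query the ontology itself.

```python
# Hypothetical, hand-written ontology fragment taken from the example above.
IS_A = {"GO:0033192": "GO:0004723"}          # child MF -> parent MF
REGULATED_BY = {"GO:0004723": "GO:0008597"}  # MF -> its regulator activity term


def find_regulator_term(mf_term):
    """Walk up the is_a chain until a term with a regulator activity is found."""
    seen = set()
    term = mf_term
    while term is not None and term not in seen:
        seen.add(term)
        if term in REGULATED_BY:
            return REGULATED_BY[term]
        term = IS_A.get(term)
    return None


def infer_mf(complex_mf, members):
    """members: dict mapping gene product name -> biological role."""
    regulator_term = find_regulator_term(complex_mf)
    inferred = {}
    for gp, role in members.items():
        if role == "enzyme":
            inferred[gp] = complex_mf      # Decision 1
        elif role == "enzyme regulator" and regulator_term is not None:
            inferred[gp] = regulator_term  # Decision 2
        # members with an unspecified role get no inferred MF
    return inferred
```

Called with `"GO:0033192"` and a complex whose members carry the roles enzyme and enzyme regulator, the enzyme would receive GO:0033192 and the regulators GO:0008597; members with an unspecified role are skipped.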
Decision 3: At the moment we can't infer these, as we didn't annotate the X binding property to the specific complex member but only to the complex. Ruth: Could MODs annotate the missing binding evidence from papers that CP curators find as part of their curation? - add to GOC mtg agenda for Complex topic
Thank you all for your valuable input!!! I have something to work on now (well, Noe and Tony :) )
Update: Rather than sending papers with missing GP annotation evidence to MODs we could add those directly through P2GO --> AI: Complex curators to be trained in P2GO!
Update from NYU GOC mtg:
What do we do if there are 2 or more enzyme subunits and 2 or more function annotations on one complex? A script doesn't know which subunit has which function. --> Manually annotate in P2GO?
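One way to keep a script from guessing is to infer automatically only in the unambiguous cases and to flag everything else for manual annotation. The sketch below is an assumption about how that check might look, not an agreed rule; the function name, the return values and the identifiers in the usage note are hypothetical.

```python
def plan_inference(enzyme_subunits, mf_terms):
    """Decide whether MF terms can be assigned to enzyme subunits by script.

    Returns ("auto", pairs) when the assignment is unambiguous, i.e. there
    is at most one enzyme subunit or at most one MF term, and
    ("manual", []) when there are 2 or more of both, because a script
    cannot tell which subunit carries which function.
    """
    if len(enzyme_subunits) >= 2 and len(mf_terms) >= 2:
        return ("manual", [])
    pairs = [(gp, term) for gp in enzyme_subunits for term in mf_terms]
    return ("auto", pairs)
```

For example, `plan_inference(["E1", "E2"], ["MF_1", "MF_2"])` returns `("manual", [])`, and such complexes could then be queued for annotation in P2GO.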
This translation is now done. GO MFs are NOT inferred from IntAct/Complex Portal annotations, only BPs.
Follow up from GOC meeting in Cambridge 2017:
The working group agreed that the CC (but see #1639 with regards to the required qualifiers!) and BP annotations would be equally appropriate for the complex subunits as they are for the complex so we'll extract them in the new Complex Portal GDAP.
However, extrapolating to MF is tricky:
We decided that we could infer the activity of the active subunit if it has been annotated as 'enzyme' in the CP. However, could we also extrapolate something about the inactive complex members? In the past people used contributes_to, but it has been proposed that this relationship should be made obsolete or at least used with great care (#1650 - proposal guidelines). Can we say anything else about the MF of an inactive complex member?
Please post your use cases and proposals here.