Open gocentral opened 10 years ago
Will search collective editors memories on next call.
Original comment by: tberardini
Harold - do you remember?
Original comment by: tberardini
We did NOT agree to implement this officially as far as I can remember
I still don't see what the problem is with homodimerization activity as a type of protein binding where the property of the protein is that it binds itself.
Original comment by: hdrabkin
OK, we (PomBase) thought there was some action for these. We have them in our collection of terms not to use in annotation.
They do seem to be a problem though, because this is describing more than a binding activity (how does it differ from the parent 'protein self binding'?)
The def "Interacting selectively and non-covalently with an identical protein to form a homodimer" seems to be more about the subunit composition than the activity.
So the current guidance is to keep using these terms?
v
Original comment by: ValWood
Maybe it was the process terms which were a problem?
http://wiki.geneontology.org/index.php/Annotation_Conf._Call_June_11,_2013
I think the discussion was that this practice resulted in the proteins which form oligomers also getting annotated to the process 'oligomerization',
which is not really correct.
Original comment by: ValWood
Val: protein self-binding doesn't imply a stop at dimer: "Interacting selectively and non-covalently with a domain within the same polypeptide."
Homodimer specifies a limit.
Also remember that these terms are designating an activity by the protein all by itself. No chaperones, etc.
Original comment by: hdrabkin
Hi
It is very useful to be able to state that a protein is a homo- or heterodimer. Many proteins only function when in the homo- or heterodimer state, e.g. nuclear receptors, RXR and LXRs. In addition, many receptors that homodimerise bind to homodimerised ligands. Furthermore, some genes encode a variety of isoforms, and therefore the homodimerisation or heterodimerisation state of these protein complexes is not straightforward.
For many protein complexes it is often just as important to be able to capture that a protein binds to itself (or a similar protein) as it is to capture that a protein binds an unrelated protein. If we remove these homo/heterodimerization activity terms then we are implying that some protein interactions are more important than others.
Ruth
Original comment by: RLovering
OK, so what about the processes "protein homotrimerization" etc.? How do these differ from "protein complex assembly"?
Original comment by: ValWood
Hi,
Why can't those be types of complexes? The fact that a protein is a multimer is neither a process nor a function. Can we create CC terms 'dimer', 'homodimer', etc., and use those instead? "protein homodimerization activity" is not very informative as a function.
Thanks,
Pascale
Original comment by: pgaudet
Hi Pascale
Please read my comments that I had included in the SF item
It is very useful to be able to state that a protein is a homo- or heterodimer. Many proteins only function when in the homo- or heterodimer state, e.g. nuclear receptors, RXR and LXRs. In addition, many receptors that homodimerise bind to homodimerised ligands. Furthermore, some genes encode a variety of isoforms, and therefore the homodimerisation or heterodimerisation state of these protein complexes is not straightforward.
For many protein complexes it is often just as important to be able to capture that a protein binds to itself (or a similar protein) as it is to capture that a protein binds an unrelated protein. If we remove these homo/heterodimerization activity terms then we are implying that some protein interactions are more important than others.
Ruth
From: Pascale Gaudet pgaudet@users.sf.net
Reply-To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Date: Friday, 6 June 2014 12:04
To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Subject: [geneontology:ontology-requests] #10909 dimerization MF terms
[ontology-requests:#10909] http://sourceforge.net/p/geneontology/ontology-requests/10909/ dimerization MF terms
Status: open Group: None Created: Wed Jun 04, 2014 10:30 AM UTC by Valerie Wood Last Updated: Fri Jun 06, 2014 10:28 AM UTC Owner: Harold J. Drabkin
Wasn't there a plan to obsolete these MF terms? (They are basically terms describing subunit composition defined as molecular functions.)
GO:0042803 protein homodimerization activity
GO:0046982 protein heterodimerization activity
GO:0046983 protein dimerization activity
GO:0051260 protein homooligomerization
I'm sure this was actioned at one of the consortium meetings...
Original comment by: RLovering
Hi Ruth,
I meant to say, instead of using
Do you mean to say that doesn't capture what you need with respect to the multimeric status of the active form of the protein ?
Pascale
Original comment by: pgaudet
Hi Pascale
Would you rather annotate to MF receptor binding, or to CC receptor complex
Ruth
From: Pascale Gaudet pgaudet@users.sf.net
Reply-To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Date: Friday, 6 June 2014 12:32
To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Subject: [geneontology:ontology-requests] #10909 dimerization MF terms
Original comment by: RLovering
Hi Ruth,
I was talking about dimerization as an MF versus dimer as a CC. I don't see why you couldn't have two annotations:
And that's not very different from how you would do it with an MF:
Are you talking about the same thing ?
Thanks,
Pascale
Original comment by: pgaudet
Hi Pascale
Can we just agree to differ on this one? The way binding is progressing, I'm not sure that these questions will be relevant in 6 months, and I have a lot to do.
To conclude, I just don't get why saying a protein binds a ligand is more important than saying it binds itself.
So I would suggest: if you don't want to capture homo- and heterodimerisation, why capture ligand binding?
The idea of MF is to try to suggest a functional role a protein has in a biological process and also a functional role a protein has within a cellular component. Consequently, for some proteins that role is to bind another protein, which just happens to be a homo- or heterodimer interaction. The other option is that we would annotate
protein A protein binding protein A-1 rather than protein A heterodimer activity protein A-1
Sorry to not agree with you
Ruth
From: Pascale Gaudet pgaudet@users.sf.net
Reply-To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Date: Friday, 6 June 2014 13:53
To: "[geneontology:ontology-requests]" 10909@ontology-requests.geneontology.p.re.sf.net
Subject: [geneontology:ontology-requests] #10909 dimerization MF terms
Original comment by: RLovering
Homotrimerization is a specific type of complex assembly. Note that there could be other proteins, etc., involved in the process, so it is not an inherent property of the protein itself (i.e., an activity).
Original comment by: hdrabkin
Yes, there could be other proteins involved, but that is not how it has ever been used as far as I can tell (especially by any of the numerous IEA mappings). Is there any value in recording that this is a different process? (How would you differentiate the process, apart from by subunit composition?) It is more valuable to have the specific complex which is being assembled. I am unconvinced that it has been used in this way.
If this is the case, then to be consistent, all of the x subunit complex assembly terms should move under the corresponding protein homooligomerization term.
Original comment by: ValWood
i.e is it useful to know that a protein is involved in "protein heterodimerization" if you don't know of what?
Most of the proteins annotated to these terms are clearly proteins which are themselves heterodimers etc....which seems a pretty good reason to obsolete them.... (and recommend reannotation to the MF term, or to protein assembly of the specific complex)
Original comment by: ValWood
Val: But only if they are homo-oligomers; not all protein complexes are homo anything.
Maybe we should ditch the process terms (so we only have complex assembly)?
Original comment by: hdrabkin
That would make sense to me. I thought we had come to that conclusion before. Maybe one to raise at the next GO meeting?
Val
Original comment by: ValWood
The original post, however was addressing these guys GO:0042803 protein homodimerization activity GO:0046982 protein heterodimerization activity GO:0046983 protein dimerization activity GO:0051260 protein homooligomerization
I would still opt to keep the homodimerization activity since it refers to the property of one protein.
Original comment by: hdrabkin
it was a mixture... also included GO:0051260 protein homooligomerization I was querying all the function terms and the process terms which we have automated mappings to.
I think the case for getting rid of the processes is clearer and we should proceed with that if possible. I'm not convinced that the function terms are in scope as MF curation. However, I agree it is useful to collect subunit composition data, and so the justification for keeping these is stronger, if resources do not have an alternative mechanism to record this information.
v
Original comment by: ValWood
Val, I would also be happy to have "homodimer" in CC, especially when I have to make PRO ids for various homodimers as complexes, it would help to be able to make an "is_a" to a GO id for that
Original comment by: hdrabkin
I didn't know this one was going to be a can of worms. Will raise at the next meeting to see what the best solution is.
Val
Original comment by: ValWood
Hi Val
If you could get a resolution on this issue at the GOC that would be great. I agree that having a BP term for protein complex assembly (and removing the BP oligomerization terms) may enable more consistent annotation to this term, with the proteins facilitating the assembly annotated to this term. The MF terms could then perhaps be expanded to include oligomerization activity etc., bearing in mind that terms such as heterodimerization activity are of limited use without an ID in the with field. Not sure about the CC terms!
Best
Ruth
Original comment by: RLovering
I'll put it on the agenda later today...I have a list ;)
Original comment by: ValWood
Paola has already added with a link to http://sourceforge.net/p/geneontology/ontology-requests/11087/
Original comment by: ValWood
We need to make a decision on these terms. GO-CAM models suggest they are not useful.
Hi my understanding of these terms was contradicted in the recent GOC meeting. But looking at the comments listed above I do feel that my interpretation of how to apply these annotations was correct.
The MF terms for dimerization have been applied to a protein subunit to indicate that that subunit binds to identical or nonidentical subunits (e.g. GO:0042803 homodimerization, definition: "Interacting selectively and non-covalently with an identical protein to form a homodimer."). When the homodimer term is applied, the WITH field would include the same protein ID as the protein ID annotated. See the comments above by Harold/Val/Pascale. These MF terms should not be applied to scaffold proteins (for example) that facilitate the dimerization of subunits. Or maybe I misinterpreted the comments in the meeting that these terms were not applied to the proteins dimerising. I commented previously (June 2014) about how useful these terms are, and with the application of GO to describe druggable targets the homodimerization information is likely to be useful.
Whereas the BP terms, such as protein complex assembly, are applied to proteins that bring proteins together and therefore have a role in assembly of oligomers. These terms are often incorrectly applied to proteins that are the targets of the process (i.e. the proteins that make up the oligomer) rather than to scaffold proteins (for example).
I still think this implies that it is more important to capture that a protein binds a different protein than it is to capture that it binds an identical or similar protein. As I have mentioned on many occasions, we now have a pipeline that is exporting GO PPI data to PSICQUIC which enables the PPI data we capture to be included in network analysis. If you remove the MF dimer terms then I would hope that these interactions will be captured using the parent protein binding term rather than all of these annotations being deleted. If new CC terms are created then I guess the annotations could be revised automatically from homodimerization activity to homodimer (etc) with the same evidence codes? But this then leaves the question about all the other MF binding annotations. The curator will need to decide if they are going to make an MF annotation to capture protein interaction data, or a CC annotation to capture this data. So there will be a reduction in the consistency of binding data annotation. Although I also appreciate that this system does not support the annotation of complexes made up of multiple identical proteins, there again the term identical protein binding can be used for these (or not if this term is removed too).
I hope that this is not the start of a move to remove all protein interaction data from GO.
Please make a clear statement about whether you are referring to BP or MF terms, and what replacement terms will be suggested, (if any) and what will happen to the existing annotations.
Thanks
Ruth
Hi
I have had a chat to Sandra and we both agree that homodimer is useful information for drug development. However, there is a problem with the application of this term because we have been encouraged to use this term for trimers, as the trimer terms were not created and so we were encouraged to assume that 'it must have formed a homodimer before it formed a trimer'. But this means that homodimer will have been applied to trimers, so a more accurate interpretation of these annotations would be to say 'identical protein binding'. So perhaps the homodimer terms could be moved up to identical protein binding. I still think heterodimer is useful, but at the same time these should probably be revised to more useful complex statements.
Ruth
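For what it's worth, "moving up" the homodimer annotations could be a purely mechanical edit to the annotation file. A minimal sketch, assuming tab-separated GAF lines with the GO ID in column 5; `move_up` is a hypothetical helper, not an existing GO script, and the example row uses a made-up protein ID and reference:

```python
HOMODIMERIZATION = "GO:0042803"   # protein homodimerization activity
IDENTICAL_BINDING = "GO:0042802"  # identical protein binding


def move_up(gaf_line, old=HOMODIMERIZATION, new=IDENTICAL_BINDING):
    """Swap the GO ID (GAF column 5) from `old` to `new`.

    Header lines (starting with '!') and rows annotated to any other
    term are returned unchanged, so evidence codes, references and the
    with/from column are never touched.
    """
    if gaf_line.startswith("!"):
        return gaf_line
    cols = gaf_line.split("\t")
    if len(cols) > 4 and cols[4] == old:
        cols[4] = new
    return "\t".join(cols)


# Made-up example row (the protein ID and PMID are placeholders).
example = "\t".join([
    "UniProtKB", "P00001", "GENE1", "enables", "GO:0042803",
    "PMID:0000001", "IPI", "UniProtKB:P00001", "F",
])
print(move_up(example))
```

Only the term changes; whether the evidence still supports the broader term is a curation question that the script cannot answer.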
Hi Ruth,
How about using GO:0005515 protein binding with the same protein, instead of protein homodimerization activity? (Same for dimerization activity; we could just specify the partner.)
Thanks, Pascale
Hi
I am fine with MF: homodimerization being CC homodimer.
However, can we keep MF: GO:0042802 identical protein binding
Ruth
Pascale's suggestion to just annotate to "protein binding" and put the same protein in the enabled_by slot and has_input slot makes sense, and would also work in GO-CAM. All pairwise protein-protein interactions would be treated the same way, then, whether they are between two different proteins, or two molecules of the same protein.
If there's interest from users in retrieving the set of all proteins that bind another protein of the same type, we can do that as a SPARQL query. I don't think the set of "all proteins that form homomultimers" is very useful for enrichment use cases, but if we change our minds later we can just add back the class and populate it with the SPARQL query.
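Until such a SPARQL query exists, roughly the same set can be pulled from a GAF file. This is only a sketch of the idea and a stand-in for the SPARQL query, not the query itself: it assumes the annotated protein is DB:ID from columns 1 and 2, the term is in column 5, and the partner is in the with/from column (column 8). `self_binders` is a hypothetical name and the sample rows are made up:

```python
import re

PROTEIN_BINDING = "GO:0005515"


def self_binders(gaf_lines, go_ids=(PROTEIN_BINDING,)):
    """Return sorted IDs of proteins annotated to one of `go_ids`
    with their own ID listed in the with/from column."""
    found = set()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip GAF header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[4] not in go_ids:
            continue
        if "NOT" in cols[3].split("|"):
            continue  # ignore negated annotations
        subject = f"{cols[0]}:{cols[1]}"
        # with/from entries may be separated by '|' or ','
        partners = {p for p in re.split(r"[|,]", cols[7]) if p}
        if subject in partners:
            found.add(subject)
    return sorted(found)


# Made-up rows: P00001 binds itself, P00002 binds P00001.
sample = [
    "!gaf-version: 2.2",
    "\t".join(["UniProtKB", "P00001", "GENE1", "enables", "GO:0005515",
               "PMID:0000001", "IPI", "UniProtKB:P00001", "F"]),
    "\t".join(["UniProtKB", "P00002", "GENE2", "enables", "GO:0005515",
               "PMID:0000002", "IPI", "UniProtKB:P00001", "F"]),
]
print(self_binders(sample))
```

Passing `go_ids=("GO:0005515", "GO:0042802")` would also pick up rows annotated to identical protein binding.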
sounds like you have a plan, the only group that seems to have found a use for these annotations published in 2009 https://pubmed.ncbi.nlm.nih.gov/19640831/ hopefully bioinformatics research has moved on from this level of analysis.
I guess I would still prefer to have the term identical protein binding rather than just protein binding as at least this makes a statement that can be propagated to orthologous proteins. An annotation to protein binding with the same protein in the has_input slot will not get propagated as the has_input ID would have to change for each different species.
Ruth
I don't see a problem keeping 'identical protein binding'. I guess we'd need a check on the 'with' to make sure people don't use 'protein binding'; the point being, if we don't use the term consistently, it's less valuable.
I guess I would still prefer to have the term identical protein binding rather than just protein binding as at least this makes a statement that can be propagated to orthologous proteins.
but we don't propagate 'protein binding' do we?
IntAct only outputs "protein binding" irrespective of whether it's to another protein, an identical protein or itself. We make that distinction in IntAct but the GAF script ignores it.
PS: the binding partner is in the with/from field.
I still think a query is the best way to receive this information. Especially since most annotation groups will not specify, a query would be comprehensive for existing annotation, but using the term would only give a partial dataset.
This is the sort of information (how to retrieve self binding proteins) that could be in the FAQ, which was once started, but abandoned? This could also be pointed to in answer to the increasingly frequent twitter storms. People don't come to GO for help, they pan GO on twitter. https://twitter.com/KathrynCrouch81/status/1358716940429254656
Ah, the FAQ is still active and quite extensive: http://geneontology.org/docs/faq/ It would be useful if outreach could point the twitterati to GO answers in the FAQ as and when questions arise.
That would have the advantage of bringing lots of people to the info.
Hi Val
for human there are nearly 20,000 annotations based on ISS, IBA or IEA evidence for child terms of protein binding, https://www.ebi.ac.uk/QuickGO/annotations?goUsage=descendants&goUsageRelationships=is_a,part_of,occurs_in&goId=GO:0005515&evidenceCode=ECO:0000250,ECO:0000247,ECO:0000266,ECO:0000318,ECO:0000319,ECO:0000501&evidenceCodeUsage=descendants&taxonId=9606&taxonUsage=descendants
Of these, over 500 proteins are associated with the homodimer term and 716 proteins are associated with identical protein binding.
For human proteins with manual evidence annotations (over 250,000) https://www.ebi.ac.uk/QuickGO/annotations?goUsage=descendants&goUsageRelationships=is_a,part_of,occurs_in&goId=GO:0005515&taxonId=9606&taxonUsage=descendants&evidenceCode=ECO:0000352&evidenceCodeUsage=descendants
The identical protein binding term is associated with 1,503 proteins.
I realise this has no value in enrichment analysis, I just wonder if this has some value for other research projects - eg for drug development it would be useful to know that the target dimerises or multimerises. Although I also appreciate that there are other ways of identifying proteins with this capacity. Just seems like a lot of data to dump.
Ruth
About documentation/FAQ, we have some text from our Jan 2018 NAR paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210579/) that could be a place to start for text about direct annotations to protein binding: Protein binding annotations are only useful if they include the specific protein binding partner. With the addition of the IntAct database (10) as a GO annotation provider, the number of specific protein binding annotations has increased dramatically (Table 2, first column). Only high-confidence annotations are incorporated into GO from IntAct. Combined with annotations from hypothesis-driven, small-scale experiments that have been contributed to GO from multiple different annotation providers, IntAct annotations help make the GO knowledgebase a useful resource for high-confidence protein interaction network data. To create protein interaction networks, users need to utilize the ‘with’ field (column 8) of the GO Association Files (GAF), which contains the identifier of the interacting partner.
Ideally, we'll have the binding partner in the has_input extension in the near future.
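For the FAQ, the "use column 8" advice could be illustrated with something like the following. It is a sketch only, assuming tab-separated GAF lines with the annotated protein as DB:ID from columns 1 and 2 and one or more partner IDs in column 8; `interaction_edges` is a hypothetical name and the sample IDs are made up:

```python
import re


def interaction_edges(gaf_lines, go_id="GO:0005515"):
    """Build an undirected edge list from direct annotations to `go_id`,
    pairing each annotated protein with every ID in its with/from column."""
    edges = set()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip GAF header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[4] != go_id:
            continue
        if "NOT" in cols[3].split("|"):
            continue  # ignore negated annotations
        subject = f"{cols[0]}:{cols[1]}"
        for partner in re.split(r"[|,]", cols[7]):
            if partner:
                # sort the pair so A-B and B-A collapse into one edge
                edges.add(tuple(sorted((subject, partner))))
    return sorted(edges)


# Made-up rows: P00001 and P00002 bind each other; P00001 also binds itself.
sample = [
    "\t".join(["UniProtKB", "P00001", "GENE1", "enables", "GO:0005515",
               "PMID:0000001", "IPI", "UniProtKB:P00002", "F"]),
    "\t".join(["UniProtKB", "P00002", "GENE2", "enables", "GO:0005515",
               "PMID:0000001", "IPI", "UniProtKB:P00001", "F"]),
    "\t".join(["UniProtKB", "P00001", "GENE1", "enables", "GO:0005515",
               "PMID:0000002", "IPI", "UniProtKB:P00001", "F"]),
]
for a, b in interaction_edges(sample):
    print(a, b)
```

Self-interactions show up as edges whose two ends are the same ID, so homomers stay visible in the network even when the annotation is to plain protein binding.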
Wasn't there a plan to obsolete these MF terms? (They are basically terms describing subunit composition defined as molecular functions.)
GO:0042803 protein homodimerization activity
GO:0046982 protein heterodimerization activity
GO:0046983 protein dimerization activity
GO:0051260 protein homooligomerization
I'm sure this was actioned at one of the consortium meetings...
Reported by: ValWood
Original Ticket: geneontology/ontology-requests/10909