geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

single gene product protein binding #9845

Closed gocentral closed 9 years ago

gocentral commented 12 years ago

I think there was a plan to remove all of the gene product specific protein binding terms? I spotted a few more:

(I am keen to get rid of these to pre-empt requests for lots of gene product specific protein binding terms when we launch community curation)

MDM2 binding http://www.uniprot.org/uniprot/Q00987 Has isoforms but a single locus, so a single ID can be used

MDM4 binding http://www.uniprot.org/uniprot/O15151 Has isoforms but a single locus, so a single ID can be used

KU70 binding http://www.uniprot.org/uniprot/P12956 single locus

RPTP-like protein binding. Definition: Interacting selectively and non-covalently with proteins with similar structure/function to receptor protein tyrosine phosphatases. This one has a definition which isn't clear (how similar a structure? how similar a function?) and only has a couple of experimental annotations, the rest being IEA. Can this one go?

SMC protein binding. Definition: Interacting selectively and non-covalently with any protein from the structural maintenance of chromosomes (SMC) family, a group of chromosomal ATPases with a role in mitotic chromosome organization. This term is odd, because if you have enough info to know it is SMC, you should be able to say which family member (they aren't that similar). This one only has a single annotation.

cystic fibrosis transmembrane conductance regulator binding http://www.uniprot.org/uniprot/P13569 single locus

Thanks

Val,

Reported by: ValWood

Original Ticket: geneontology/ontology-requests/9638

gocentral commented 12 years ago

Also, could the terms which remain under here have a comment that they should only be used if the specific binding partner cannot be distinguished? Otherwise "protein binding" should be used, with the identifier of the interacting gene product in the "with" field.

I suspect there are more which are gene product specific, but the definitions are so vague it isn't possible to know whether they refer to a specific protein, a protein with locus-specific isoforms, or a protein family of quite heterogeneous proteins. For some of these grouping terms, almost certainly if a curator has the info that a family member is being bound, it must be possible to specify the member, because they are not so similar that they would be recognised by a single antibody. val

Original comment by: ValWood

gocentral commented 12 years ago

Hi Val,

Yes, there is indeed an on-going plan to remove gene product-specific children of protein binding. The original proposal is outlined here:

http://wiki.geneontology.org/index.php/Protein_Binding_clean_up

This was written by Jane, so I'll check with her about the action plan for the MDM2, MDM4 and KU70 terms (as you can see, those terms were highlighted in the proposal but action was still needed; feel free to add suggestions here if you wish).

The proposal also indicated that this is a work in progress and that the first bunch of terms were chosen simply because of few/zero annotations, but the list is by no means complete. Also, we are going to discuss the protein binding issue soon at an editors' meeting, possibly tomorrow. I've linked this SF ticket to the agenda.

As for the RPTP-like term, the proposal says: "Terms that have a relationship with a receptor activity have been excluded from the list below, as they are required for full description of a signalling pathway in GO." I'll check with Becky if she has any comments.

As for the SMC term: "Attention has been paid to ensure that when these terms are obsoleted, a parent term is available that can capture enough information on the protein group the interactor belongs to. If a suitable, information-rich protein binding parent is not available, one has been suggested." So the plan might have been to call this e.g. "SMC family protein binding", but again I'll check with Jane.

The cystic fibrosis term I think falls in the category of terms that still need to be looked at during a second pass. Will update this ticket with more info.

Thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 12 years ago

Ok, thanks for the update. It would be good to get the obviously out-of-scope ones obsoleted sooner rather than later: that means less work for curators later if they inadvertently use them and then need to fix things. The obsoletion notices for these obvious ones would also serve as a reminder that this is in progress and help people who missed this discussion.

Thanks

val

Original comment by: ValWood

gocentral commented 12 years ago

In the meantime, we could add definition comments stating that the terms are pending obsoletion, see e.g. the def. comment to GO:0043526 neuroprotection

Original comment by: paolaroncaglia

gocentral commented 12 years ago

I think that would be helpful if there is a reason why they can't go yet.. val

Original comment by: ValWood

gocentral commented 12 years ago

Hi Val, a quick update on this: we touched on the protein binding issue at our editors' call yesterday, but didn't have enough time to discuss a following pass at cleaning up the children terms. I'll discuss it with Jane and Becky at our (EBI) meeting on Tuesday, as they've both been quite busy with other meetings this week. Thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 12 years ago

Hi Val - wrt the annotation comments for the protein binding terms, two things.

First, we think comments like this are probably best maintained outside of the ontology itself, otherwise we have to remember to manually add the comment to new terms, it's hard to change the guidance etc. For other comments that apply to multiple terms, QuickGO have implemented a system where they link out to specific documentation just for certain terms, e.g.

http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0001071

I don't know if that's something you can implement in your browser? To label the terms, they just say 'all descendants of X, Y and Z' or sometimes use a reg exp, e.g. all terms with 'viral*'.

Second, we're not sure how we'll figure out right now which terms should have the comment, because there are some legitimate terms for binding to specific gene products (e.g. where they are a part of a biological process) and there are some we just haven't got to yet.
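The labelling scheme described above (attach a note to all descendants of chosen terms, or to terms whose name matches a pattern) could be sketched roughly as follows. This is only an illustration, not how QuickGO does it: the toy ontology, the function names and the reading of 'viral*' as a shell-style wildcard are all assumptions.

```python
from fnmatch import fnmatchcase

# Toy ontology (an assumption for illustration): term ID -> (name, parent IDs).
# Real GO terms have more parents than shown here.
TERMS = {
    "GO:0005488": ("binding", []),
    "GO:0005515": ("protein binding", ["GO:0005488"]),
    "GO:0043221": ("SMC family protein binding", ["GO:0005515"]),
    "GO:0097371": ("MDM2/MDM4 family protein binding", ["GO:0005515"]),
    "GO:0016032": ("viral process", []),
}


def descendants(root, terms):
    """Return the IDs of all terms with `root` as a direct or indirect parent."""
    found = set()
    changed = True
    while changed:  # repeat until no new descendant is discovered
        changed = False
        for term_id, (_, parents) in terms.items():
            if term_id in found:
                continue
            if any(p == root or p in found for p in parents):
                found.add(term_id)
                changed = True
    return found


def terms_needing_note(terms, roots=(), name_patterns=()):
    """Select terms that are descendants of any root or match any name pattern."""
    selected = set()
    for root in roots:
        selected |= descendants(root, terms)
    for term_id, (name, _) in terms.items():
        if any(fnmatchcase(name, pattern) for pattern in name_patterns):
            selected.add(term_id)
    return sorted(selected)


labelled = terms_needing_note(TERMS, roots=["GO:0005515"], name_patterns=["viral*"])
```

Keeping the rule (roots and patterns) outside the ontology means new child terms pick up the note automatically, which is the maintenance benefit mentioned above.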

Original comment by: jl242

gocentral commented 12 years ago

And - I'll deal with obsoleting/renaming the terms in Val's request above as soon as I can. Paola

Original comment by: paolaroncaglia

gocentral commented 12 years ago

Hi Val,

Before I forget. Unfortunately we're not going to be able to do an extensive second pass at cleaning up children of protein binding any time soon. We won't add new dodgy ones though. If you spot glaringly inappropriate binding terms, please report them in a SF ticket.

Thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 12 years ago

As part of these edits, I have:

Added new term: GO:0097371 MDM2/MDM4 family protein binding
is_a GO:0005515 protein binding
Def: Interacting selectively and non-covalently with any isoform of the MDM2/MDM4 protein family, comprising negative regulators of p53.
Dbxrefs: GOC:vw, InterPro: IPR016495

Renamed GO:0043221 SMC protein binding as 'SMC family protein binding'
Dbxrefs: GOC:vw, InterPro: IPR024704
Exact synonym: Structural maintenance of chromosomes family protein binding

More soon Paola

Original comment by: paolaroncaglia

gocentral commented 12 years ago

Obsoletion email sent today.

Original comment by: paolaroncaglia

gocentral commented 12 years ago

Hi Paola, Jane

Yes we plan to have "PomBase specific" comments, and even to filter terms we don't use so I guess we will have to do it this way.

Re the new terms you are proposing, do you mind not attributing the defs to me? I don't think we require the terms. For SMC family, if you have an experiment to say something is SMC binding, you should be able to say which gene product (they aren't all that similar).

Likewise for MDM2/4 (I don't even know if these are related?). My point was that MDM2 is a single locus (with a number of isoforms), but the UniProt entry would cover any one of them, so the with field could be used instead of the specific term.

I think the new terms are as bad (unnecessary) ;)

val

Original comment by: ValWood

gocentral commented 12 years ago

Hi Val - I've removed your initials from that term :-)

We just want to provide a bit of structure for this node beyond just 'protein binding'. This is not so much for annotating cases where you don't know the individual protein bound (although this will be the case for very closely related proteins, e.g. actins) but for things like enrichments and slims, and for annotating groups who don't use c16 and who would otherwise only have an annotation to 'protein binding'. We've gone for binding to protein families from InterPro for now because it seemed like the best option. For this term, MDM2/4 family binding, MDM2/4 are structurally and functionally related (duplication event), but in fact this family only comprises these two proteins, so I agree it's not ideal!

Jane

Original comment by: jl242

gocentral commented 12 years ago

Actually, ignore what I said about c16 - the protein id goes in the WITH column for 'protein binding' of course!
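For illustration, a 'protein binding' annotation with the partner in the WITH column might be assembled like this. It is a minimal sketch assuming the 17-column GAF 2.x layout; the annotated gene product, reference, taxon and date are made-up placeholders, the helper name is hypothetical, and only the MDM2 accession comes from this ticket.

```python
# Column names follow the GAF 2.x layout as I understand it (assumption).
GAF_COLUMNS = [
    "DB", "DB Object ID", "DB Object Symbol", "Qualifier", "GO ID",
    "DB:Reference", "Evidence Code", "With/From", "Aspect",
    "DB Object Name", "DB Object Synonym", "DB Object Type", "Taxon",
    "Date", "Assigned By", "Annotation Extension", "Gene Product Form ID",
]


def protein_binding_row(annotated, partner_id):
    """Build one tab-separated row annotating `annotated` to GO:0005515.

    The binding partner goes in With/From (column 8); the Annotation
    Extension column (column 16) is left empty.
    """
    values = dict(annotated)
    values.update({
        "GO ID": "GO:0005515",      # protein binding
        "Evidence Code": "IPI",     # inferred from physical interaction
        "With/From": partner_id,
        "Aspect": "F",              # molecular function
    })
    return "\t".join(values.get(column, "") for column in GAF_COLUMNS)


# Placeholder gene product and reference, not real data.
example_row = protein_binding_row(
    {
        "DB": "ExampleDB",
        "DB Object ID": "X00000",
        "DB Object Symbol": "geneX",
        "DB:Reference": "PMID:0000000",
        "DB Object Type": "protein",
        "Taxon": "taxon:0000",
        "Date": "20130101",
        "Assigned By": "ExampleDB",
    },
    partner_id="UniProtKB:Q00987",  # MDM2, as listed at the top of this ticket
)
```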

Original comment by: jl242

gocentral commented 12 years ago

Obsoleted:

GO:0070215 MDM2 binding
GO:0070216 MDM4 binding
GO:0017170 KU70 binding
GO:0042153 RPTP-like protein binding
GO:0042980 cystic fibrosis transmembrane conductance regulator binding
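A curation group wanting to catch existing annotations to these terms could run a check along these lines. This is a rough sketch: the term IDs and UniProt accessions are the ones given in this ticket, but the function name and the suggested replacement (protein binding plus the partner in the WITH field, as proposed earlier in the thread) are assumptions, not an official mapping.

```python
# Obsoleted term ID -> (name, partner accession from this ticket, or None).
OBSOLETED = {
    "GO:0070215": ("MDM2 binding", "UniProtKB:Q00987"),
    "GO:0070216": ("MDM4 binding", "UniProtKB:O15151"),
    "GO:0017170": ("KU70 binding", "UniProtKB:P12956"),
    "GO:0042153": ("RPTP-like protein binding", None),
    "GO:0042980": (
        "cystic fibrosis transmembrane conductance regulator binding",
        "UniProtKB:P13569",
    ),
}


def suggest_fix(go_id):
    """Return a hint for an obsoleted term, or None if the term is not listed."""
    if go_id not in OBSOLETED:
        return None
    name, partner = OBSOLETED[go_id]
    if partner is None:
        # No single gene product was named for this term, so no automatic hint.
        return f"{go_id} ({name}) is obsolete; review manually"
    return (
        f"{go_id} ({name}) is obsolete; consider GO:0005515 protein binding "
        f"with {partner} in the WITH field"
    )


hints = [suggest_fix(term) for term in ("GO:0070215", "GO:0042153", "GO:0005515")]
```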

Thanks Paola

Original comment by: paolaroncaglia
