geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

protein binding -- enzyme binding -- kinase binding family relations #14601

Open BarbaraCzub opened 7 years ago

BarbaraCzub commented 7 years ago

There appear to be at least two 'kinase binding' terms in the ontology that should be children of a more specific parent than the one they currently have (screenshot below).

[Screenshot: 2017-11-20 at 16:35:00]

cc @RLovering @paolaroncaglia @pgaudet @vanaukenk

pgaudet commented 7 years ago

I agree!

RLovering commented 7 years ago

Shouldn't the revision of the ontology be inferred computationally? I thought that was the point of some of David OS's work.

A decision is required for complexes that have catalytic activity, as not all of the subunits have this role. PKC also has regulatory subunits, but PKC binding is placed in the ontology as a sibling to PKB binding (which is a single protein). It could be argued that the whole protein complex is the enzyme, and that PKC binding is therefore correctly placed as a child of enzyme binding. Or it could be argued that complexes with a catalytic part, whose other subunits are regulatory and not enzymes, cannot be child terms of enzyme binding.

Also, if there are no more protein specific binding terms being created perhaps what Barbara has suggested is the easiest way forwards.

@bmeldal

BarbaraCzub commented 7 years ago

Thanks for your comment @RLovering

You are right that some relations do get inferred computationally, but not all. A GO term needs at least one logical definition for inference to occur. Otherwise, relations are created manually by editors.
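As an illustration of the point about logical definitions, here is a minimal toy sketch in Python. It is not how Protégé or an OWL reasoner actually works; the small entity hierarchy, the "has_input" style definitions and the function names are all assumptions made up for this example. A term with a logical definition gets its parent inferred from the hierarchy of the entity it binds, while a term without one only has whatever an editor asserted.

```python
# Toy illustration only: a real OWL reasoner is far more general than this.
# Hypothetical hierarchy of the entities being bound (child -> parent).
ENTITY_PARENT = {
    "protein kinase B": "protein kinase",
    "protein kinase": "kinase",
    "kinase": "enzyme",
    "enzyme": "protein",
}

# Assumed logical definitions: binding term -> the entity it takes as input.
LOGICAL_DEF = {
    "protein binding": "protein",
    "enzyme binding": "enzyme",
    "kinase binding": "kinase",
    "protein kinase binding": "protein kinase",
    "protein kinase B binding": "protein kinase B",
}

# Manually asserted parents for terms that have no logical definition.
ASSERTED_PARENT = {
    "protein kinase A binding": "protein binding",
}


def inferred_parent(term):
    """Return the most specific defined binding term that subsumes `term`, or None."""
    entity = LOGICAL_DEF.get(term)
    if entity is None:
        return None  # no logical definition, so nothing can be inferred
    term_for_entity = {e: t for t, e in LOGICAL_DEF.items()}
    parent_entity = ENTITY_PARENT.get(entity)
    # Walk up the entity hierarchy until a binding term is defined for an ancestor.
    while parent_entity is not None:
        if parent_entity in term_for_entity:
            return term_for_entity[parent_entity]
        parent_entity = ENTITY_PARENT.get(parent_entity)
    return None


def parent_of(term):
    """Prefer the inferred parent; fall back to the manual assertion."""
    return inferred_parent(term) or ASSERTED_PARENT.get(term)
```

In this toy setup, `parent_of("protein kinase B binding")` is derived from the definitions, whereas `parent_of("protein kinase A binding")` can only return the manually asserted parent.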

I have just had a look in Protégé, and it appears that 'protein kinase B (PKB) binding' has one or more logical definitions, whereas the 'protein kinase A/C (PKA/PKC) binding' terms have none.

See screenshot below: the 3 lines inside the circle next to 'PKB binding' indicate logical definition(s). 'PKA/PKC binding' terms lack these (the circles have no lines inside), so these two terms were placed in the ontology manually.

[Screenshot: 2017-11-23 at 11:00:06]

However, before I suggest that logical definitions should be added for 'PKA/PKC binding' terms, and the currently asserted relations removed, the single protein vs. protein complex aspect needs to be addressed.

How should we organise enzyme (binding) terms where the enzyme is a protein complex rather than a single gene product?

Here you mentioned that 'PKA binding' was placed correctly under 'protein binding' (rather than 'enzyme/kinase binding') because PKA comprises non-catalytic subunits. So does PKC, but it was nonetheless placed under 'kinase binding'.

Personally, I am in favour of the PKC placement, because even though the protein complex consists of catalytic as well as non-catalytic regulatory subunits, overall this protein complex is still an enzyme (and would probably not be active without the non-catalytic subunits). Also, when searching for an appropriate GO term as a curator, I would expect to find both 'PKA binding' and 'PKC binding' under 'protein kinase binding'.

@pgaudet @vanaukenk Could we perhaps discuss this during a call?

bmeldal commented 7 years ago

Not sure I can answer it, as we never annotate complexes with protein X binding terms, unless it's a whole family, e.g. GO:0001664 G-protein coupled receptor binding. In that case, it's really complex X-type family binding. Protein X binding can't distinguish whether the GP that's being annotated is a member of a complex with Protein X or whether it simply makes a molecular interaction, however transient it might be.

BarbaraCzub commented 7 years ago

Thanks @bmeldal for commenting. I am not sure about the other two, but the term 'protein kinase A (PKA)' does indeed refer to a family of protein complexes (e.g. 1, 2).

So in this context the term 'PKA binding' should technically be a child of 'GO:0032403 protein complex binding' (and not [single] 'GO:0005515 protein binding'). But at the same time I believe that it should also be a child of 'GO:0019900 kinase binding' (which is a child of 'GO:0005515 protein binding').
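To make the consequence of this two-parent placement concrete, here is a small Python sketch. It encodes only the is_a relations stated in the comment above; the dictionary layout and the `ancestors` helper are illustrative assumptions, not part of any GO tooling, and no GO ID is given for 'protein kinase A binding' because the thread does not state one.

```python
# is_a edges as proposed in the comment above (term -> set of direct parents).
IS_A = {
    "protein kinase A binding": {
        "GO:0032403 protein complex binding",
        "GO:0019900 kinase binding",
    },
    "GO:0019900 kinase binding": {"GO:0005515 protein binding"},
}


def ancestors(term):
    """Collect every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

With these edges, `ancestors("protein kinase A binding")` contains 'protein complex binding' and also, through 'kinase binding', 'protein binding', so an annotation to the term would propagate to both branches.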

RLovering commented 7 years ago

How about a new term: GO:0032403 protein complex binding

new term: kinase holoenzyme binding

protein kinase C holoenzyme binding (possibly add holoenzyme to these 3 term names, or just have as an alias?)
phosphatidylinositol 3-kinase holoenzyme binding
protein kinase A holoenzyme binding

is_a protein kinase A regulatory subunit binding
is_a protein kinase A catalytic subunit binding

protein kinase binding

is_a protein kinase A catalytic subunit binding

Although would this mean adding a new CC term?: 'kinase holoenzyme complex'

Although this might be quite useful for grouping all the kinase complexes?

Ruth

bmeldal commented 7 years ago

new term: kinase holoenzyme binding

I prefer new term: “kinase complex”, as “holoenzyme” makes me think the children have overlapping participants or at least are members of the same family, which they wouldn’t be in this case (though they are for e.g. GO:0000307 cyclin-dependent protein kinase holoenzyme complex). What you are looking for is the CC for GO:0016301 kinase activity.

RLovering commented 7 years ago

ok with me, but holoenzyme is used for all of the kinase complexes listed above ;)

so new terms: MF kinase complex binding, and CC kinase complex?

Ruth

pgaudet commented 7 years ago

Hi @RLovering, I am not sure what your solution brings that the existing structure is lacking. We have PKA binding, and then PKA catalytic subunit binding; you want a grouping term for all proteins that bind complexes that have at least one kinase?

It seems to me this should be a query on the annotations (using the MF annotations); I wouldn't complicate the ontology for that.

Thanks, Pascale

RLovering commented 7 years ago

Hi Pascale

Rather than having lots of individual kinases listed under GO:0032403 protein complex binding, I thought it might be better to have a kinase complex term, so that this domain is more of an ontology than a flat list of complexes bound. There are already more than 40 child terms for this term. It is difficult searching down lists like this.

I hadn't really thought about how big the complex might be! I was just thinking of complexes such as PKA, PKC and PI3K, where it is not possible to know if the protein binds to the regulatory or catalytic subunits.

Then again, this whole aspect could explode. For example, I was just thinking it would be nice to have a receptor complex binding term to group terms like insulin receptor binding, then I realised that this term has both protein binding and protein complex binding parent terms. So I guess this needs to be revised; I think a molecule is either a protein or a protein complex.

Ruth

bmeldal commented 7 years ago

2 comments/thoughts:

I vaguely remember having a conversation with @paolaroncaglia about "protein X-type family binding" terms and the fear of exploding the ontology, so I never requested such terms (I use them if they already exist). I'm sure there are some top-level enzyme families that could be grouped, such as the kinases.

I'm sure there are many situations where the experiment shows protein X-type family binding but you won't know if the GP is binding only one other protein or a complex. Maybe the terms "protein binding" and "complex binding" should be merged and the defs contain "binding to a protein or protein complex containing an X-type family member"?

Just a thought - I've probably missed something crucial!

paolaroncaglia commented 7 years ago

@bmeldal I’m reading this very quickly. My recollection is that "protein X-type family binding" terms are not considered very informative because within a family of proteins there could be several different domains that the ligand binds to. I think we had banned such terms from e.g. TermGenie freeform requests, but I don’t remember where we stored documentation on this (can’t find it easily on the wiki) so can’t check for correctness. Other editors may recall better and confirm.

BarbaraCzub commented 5 years ago

Hello, @RLovering and I have just been discussing this and we would like to suggest that 'protein kinase A binding' should have the parent 'protein-containing complex binding'. At the same time, it should not have the child 'protein kinase A catalytic subunit binding'. But perhaps there could be a part_of relation between them? @bmeldal is this something that could perhaps be discussed during the call on Thursday?

bmeldal commented 5 years ago

Yes, I'm linking it to https://github.com/geneontology/go-ontology/issues/16833

bmeldal commented 5 years ago

From the WG call on 17/1/19:

--> break relationship between protein binding and complex binding - annotations should be either to protein or complex
--> check ontology for other relationships like these

BarbaraCzub commented 5 years ago

Based on 17 Jan Complexes Working Group discussions, I will:

cc @RLovering @bmeldal

ValWood commented 5 years ago

I really wish we didn't have gene product or family specific binding terms, it's impossible to use these consistently...

So now if I understand correctly "protein kinase B binding" will be renamed to a complex term, but "protein kinase C" and "protein kinase A" will stay under binding. This is all very difficult to follow and annotate consistently.

Why not just gene A "protein binding" gene B?

I know, I ask this all the time, but are there really so many cases where you don't know the specific gene products that are truly useful when evaluated against how much work it would be to make this right?

Even if I worked at it full time I could not possibly consistently annotate all of our existing annotations to "protein binding" that would warrant "protein kinase binding". Multiply this by all of the other type-specific binding terms and it's an NP-complete problem!

bmeldal commented 5 years ago

Confirm again whether to break the relationship between 'protein binding' and 'complex binding', or whether this should perhaps be part_of and not is_a.

The only relationship we could have is "protein-containing complex binding has_part protein binding" but I'm not sure that is always true as a complex could be a protein binding to a small molecule or nucleic acid.

ValWood commented 5 years ago

Most have something in the "with" field.

Only 148 protein kinase binding annotations do not have something in the "with" field.
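A count like this can be reproduced from an annotation file. The following is a minimal sketch, assuming tab-separated GAF-format lines in which column 5 holds the GO ID and column 8 the With/From value, and assuming GO:0019901 is the ID for 'protein kinase binding'. The function name and the sample rows are made up for illustration (the rows are not real annotations and are truncated to nine columns), and the sketch only counts direct annotations, not annotations to descendant terms.

```python
def count_missing_with(gaf_lines, go_id):
    """Return (total, missing): direct annotations to `go_id`, and how many
    of them have an empty With/From column (column 8 in GAF)."""
    total = 0
    missing = 0
    for line in gaf_lines:
        if line.startswith("!"):  # GAF header and comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[4] != go_id:
            continue
        total += 1
        if not cols[7].strip():
            missing += 1
    return total, missing


# Made-up rows for illustration only (hypothetical identifiers).
SAMPLE = [
    "!gaf-version: 2.2",
    "DB\tID1\tgeneA\tenables\tGO:0019901\tPMID:0\tIPI\tDB:ID2\tF",
    "DB\tID3\tgeneB\tenables\tGO:0019901\tPMID:0\tIDA\t\tF",
    "DB\tID4\tgeneC\tenables\tGO:0005515\tPMID:0\tIPI\tDB:ID5\tF",
]

total, missing = count_missing_with(SAMPLE, "GO:0019901")
```

On a real file the same function could be fed an open file handle, since it only iterates over lines.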

I really don't get why we need ALL these terms (unless they represent an activity).

At least, I don't understand the criteria for the inclusion of "protein kinase binding"

RLovering commented 5 years ago

Hi Val

I don't think this is the right ticket for the discussion about whether or not to keep protein family binding terms. I like them. But I am not going to discuss them here. This is a decision for the consortium to make. Please note that you have written all the wrong letters in your email above: protein kinase A is a complex; protein kinase B and C are not complexes. The discussion yesterday did however recognise that consistency is needed and that 'proteins' that are complexes should be listed under protein complex binding, not protein binding.

Thanks Birgit for summarising. Barbara will submit the agreed revisions.

Ruth

bmeldal commented 5 years ago

Thank you, Ruth.

@ValWood Can you make a new ticket for the general discussion about using gene-specific pre-composed terms and link it to https://github.com/geneontology/go-ontology/issues/16833 (which discusses x complex binding only)?

ValWood commented 5 years ago

There is already at least one ticket for gene product and gene family specific protein binding terms: https://github.com/geneontology/go-ontology/issues/16186

I am pretty sure that it was decided long ago at a consortium meeting that these terms would be phased out. It is so long ago I don't even remember which meeting....but I really thought that was the long term plan...

ukemi commented 5 years ago

Washington DC (from the notes): David H on eliminating direct annotations to ‘protein binding’. Decision: protein family binding terms will be removed from GO. Annotations will be to ‘protein binding’ and the ‘with’ column (or, preferably, col. 16 using the has_input relation) will be required. If desired, an external protein classification can be used to create virtual protein family binding terms for grouping gene products for, e.g., enrichment analysis.
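The "virtual protein family binding terms" idea in these notes could look roughly like the following Python sketch. It assumes each 'protein binding' annotation is reduced to a (gene product, binding partner) pair taken from the 'with' column, and that some external classification maps partners to families. The function name, the identifiers and the family labels are all hypothetical.

```python
from collections import defaultdict


def virtual_family_binding(annotations, family_of):
    """Group annotated gene products under virtual '<family> binding' labels.

    annotations: iterable of (gene_product, partner) pairs, where `partner`
                 comes from the 'with' column of a 'protein binding' annotation.
    family_of:   mapping from partner ID to a family name, taken from an
                 external protein classification (hypothetical here).
    """
    groups = defaultdict(set)
    for gene_product, partner in annotations:
        family = family_of.get(partner)
        if family is not None:  # partners without a family are skipped
            groups[f"{family} binding"].add(gene_product)
    return dict(groups)


# Hypothetical inputs for illustration only.
ANNOTATIONS = [("geneA", "P1"), ("geneB", "P2"), ("geneC", "P3"), ("geneA", "P2")]
FAMILY_OF = {"P1": "protein kinase", "P2": "protein kinase", "P3": "phosphatase"}

groups = virtual_family_binding(ANNOTATIONS, FAMILY_OF)
```

The resulting groups exist only at query time, so nothing has to be added to the ontology itself to support, e.g., enrichment analysis.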

RLovering commented 5 years ago

Oops, amazing how easy it is to forget all these decisions; good job there are minutes ;) Will we be required to make all these edits ourselves? Ruth

ValWood commented 5 years ago

Wow so I didn't imagine it ;)

bmeldal commented 5 years ago

I’m happy to disallow any x family/protein/complex binding terms but why was it agreed (and when) to use the x complex binding terms as discussed here #16833?

srengel commented 5 years ago

i have same question as @bmeldal . goes back to my growing discomfort with the complicated sets of rules to remember and the diverging strategies we are putting in place. we are weaving a messy web here that makes annotation compliance more and more difficult.

ValWood commented 5 years ago

I'm more comfortable with precomposed complex-specific binding terms than with (i) gene product-specific terms (which were always historically discouraged) or (ii) "gene family-specific binding" terms, which are not a natural grouping to describe functions or processes because they are heterogeneous sets.

I would be totally happy to post-compose these with extensions instead. However, I'd really like to do them all the same way and obsolete the existing ones if we decide to do this. It's probably a good plan to post-compose as there will be so many. What was the rationale for deciding to keep them on the complex call?

bmeldal commented 5 years ago

We didn't manage to get to this ticket, Val. Just discussed the use of very specific x complex binding terms from https://github.com/geneontology/go-ontology/issues/16833

bmeldal commented 5 years ago

see https://github.com/geneontology/go-ontology/issues/16833 for 24/1/19 call results.

bmeldal commented 5 years ago

Following the GOC mtg in Cambridge on 11/4/19, I think this ticket is no longer complex-related as we decided to remove "x complex binding" terms (complex binding rule implementation ticket).

Removing complex-related labels.