geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

GO:0140030 modification-dependent protein binding #21572

Closed hattrill closed 3 years ago

hattrill commented 3 years ago

I think that it would be good to review the use of these terms:

GO:0072572 poly-ADP-D-ribose binding
GO:0072571 mono-ADP-D-ribose binding
GO:0072570 ADP-D-ribose binding


I think that the majority would apply to the binding of proteins PTM'd by ADP-D-ribose(s) rather than the free unit. PMID:20088964

Examples are Ub ligases that target ADP-ribosylated proteins, e.g. http://www.ebi.ac.uk/interpro/entry/InterPro/IPR033509/ http://www.pantree.org/node/annotationNode.jsp?id=PTN001002511

hattrill commented 3 years ago

directly relates to #12787

ValWood commented 3 years ago

Should we even have terms for 'modification-dependent protein binding'? We have 2 ways to capture this: one with a term, and one with a modified form. Would it be better if we always did it one way?

hattrill commented 3 years ago

As these are binding a motif + mod, it's a bit more expansive than a one:one relationship that is captured by mod protein binding + PRO id 1:1. I think having higher order classes, such as 'glycosylation-dependent protein binding' is also great when there is a good chance that you don't know the exact nature of the mod or even where it is.

ValWood commented 3 years ago

> As these are binding a motif + mod, it's a bit more expansive than a one:one relationship that is captured by mod protein binding + PRO id 1:1.

I didn't get this part

> I think having higher order classes, such as 'glycosylation-dependent protein binding' is also great when there is a good chance that you don't know the exact nature of the mod or even where it is.

Yes that's a good point. But I think (although am not 100% sure) that you can have a generic modified form from PRO.

hattrill commented 3 years ago

What I mean is that the ADP-ribosylation motif may reside in any number of different proteins (https://en.wikipedia.org/wiki/ADP-ribosylation). Querying based on this term would have use cases.

We've been here and discussed this before (#12787), and these grouping terms were deemed useful. It looks to me as if the ADP-D-ribose binding terms were missed in this review.

raymond91125 commented 3 years ago

Existing annotations (experimental evidence)

| Term | GP | Evidence (PMID) | Binding to (ribose / ribosylated protein) |
| --- | --- | --- | --- |
| GO:0072570 | ZFIN:trpm2 | IDA (PMID:31844070) | ADP-ribose |
| GO:0072570 | UniProt:PARP9 | IMP (PMID:26479788) | poly ADP-ribose |
| GO:0072570 | UniProt:PARP9 | IMP (PMID:28525742) | poly ADP-ribose |
| GO:0072571 | UniProt:trpm2 | IDA (PMID:30250252) | mono ADP-ribose |
| GO:0072571 | UniProt:TRPM2 | IDA (PMID:30467180) | mono ADP-ribose |
| GO:0072572 | FlyBase:CG1218 | IDA (PMID:20088964) | poly ADP-ribose |
| GO:0072572 | UniProt:RNF146 | IDA (PMID:21478859) | poly ADP-ribose and ribosylated protein |
| GO:0072572 | dictyBase:DDB0192167 | IDA (PMID:27587838) | poly ADP-ribose |
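For anyone wanting to re-check the listing, here is a minimal Python sketch that tallies the annotations above per term and picks out the rows where a ribosylated protein was reported as a binding partner. The tuple layout and variable names are my own; the rows are transcribed from the listing as given.

```python
from collections import Counter

# Rows transcribed from the listing above:
# (term, gene product, evidence code, reference, bound entity)
ANNOTATIONS = [
    ("GO:0072570", "ZFIN:trpm2", "IDA", "PMID:31844070", "ADP-ribose"),
    ("GO:0072570", "UniProt:PARP9", "IMP", "PMID:26479788", "poly ADP-ribose"),
    ("GO:0072570", "UniProt:PARP9", "IMP", "PMID:28525742", "poly ADP-ribose"),
    ("GO:0072571", "UniProt:trpm2", "IDA", "PMID:30250252", "mono ADP-ribose"),
    ("GO:0072571", "UniProt:TRPM2", "IDA", "PMID:30467180", "mono ADP-ribose"),
    ("GO:0072572", "FlyBase:CG1218", "IDA", "PMID:20088964", "poly ADP-ribose"),
    ("GO:0072572", "UniProt:RNF146", "IDA", "PMID:21478859",
     "poly ADP-ribose and ribosylated protein"),
    ("GO:0072572", "dictyBase:DDB0192167", "IDA", "PMID:27587838", "poly ADP-ribose"),
]

# Number of listed annotations per GO term.
per_term = Counter(term for term, *_ in ANNOTATIONS)

# Gene products for which a ribosylated protein was also reported as bound.
binds_ribosylated_protein = [
    gp for _, gp, _, _, bound in ANNOTATIONS if "ribosylated protein" in bound
]

for term, count in sorted(per_term.items()):
    print(term, count)
print(binds_ribosylated_protein)
```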

So far, the usage seems consistent with the definitions of these 3 terms. Binding of a GP to ADP-ribosylated proteins probably occurs primarily through direct binding of the ADP-ribose moiety, which is a rather large molecule. In some cases the poly ADP-ribose is thought to act as an adaptor or scaffold for multiple proteins. A good review is PMID:26673700 "Readers of poly(ADP-ribose): designed to be fit for purpose".

I think, strictly speaking, what PMID:20088964 showed was that a poly-ADP-ribose (pADPr)-binding zinc-finger (PBZ) domain protein binds to 'free' ribose, based on NMR studies, though the biological function of the fly protein CG1218 may be ribosylated protein binding.

It makes sense to try to link ribosylated protein binding to ribose binding. We could

  1. Add ribosylated protein binding in the definition of ADP-D-ribose binding. OR
  2. Add a new term.

GO:0140030 modification-dependent protein binding
...ISA (new term) ADP-D-ribose modification-dependent protein binding
DEF "Interacting selectively and non-covalently with a protein upon ADP-ribosylation of the target protein."

And if all "ADP-D-ribose modification-dependent protein binding" is through binding of ADP-ribose,

GO:0072570 ADP-D-ribose binding
...ISA (new term) ADP-D-ribose modification-dependent protein binding
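To make the querying consequence of the second link concrete, here is a small Python sketch of the two placements as a toy is_a graph. The ID `NEW:adpr_mdpb` is a hypothetical placeholder (no ID has been assigned to the new term here), and the graph contains only the terms named in this proposal, not the real ontology.

```python
# Hypothetical placeholder for the proposed new term
# "ADP-D-ribose modification-dependent protein binding".
NEW = "NEW:adpr_mdpb"

# child -> parents, with only the first link (under GO:0140030)
IS_A_SINGLE = {NEW: ["GO:0140030"]}

# child -> parents, with the additional link to GO:0072570 ADP-D-ribose binding
IS_A_BOTH = {NEW: ["GO:0140030", "GO:0072570"]}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# A query on GO:0072570 only retrieves annotations to the new term
# when the second link exists.
print("GO:0072570" in ancestors(NEW, IS_A_SINGLE))
print("GO:0072570" in ancestors(NEW, IS_A_BOTH))
```

With the single link, annotations to the new term are grouped only under modification-dependent protein binding; with both links they are also returned by a query on ADP-D-ribose binding.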

@hattrill Do you know of any case where ribosylated protein binding may not be also ADP-ribose binding?

pgaudet commented 3 years ago

Probably these represent binding within proteins?

GO:0072572 -> 3 EXP
GO:0072571 -> 2 EXP
GO:0072570 -> 5 EXP

@raymond91125 would you have a look at the annotations to see if that's the case? If so we could standardize the term labels and definitions, and move the terms to the proper location in the ontology.

Thanks, Pascale

hattrill commented 3 years ago

Sorry @raymond91125 this fell off my list to reply to:

> @hattrill Do you know of any case where ribosylated protein binding may not be also ADP-ribose binding?

As far as I can see, it is ADP-ribose only. Poly-ADP-ribosylated proteins are targeted by a variety of proteins that might recognise consecutive ADP-ribosyl units or larger strings, often with some contribution from the underlying consensus motif of the target protein.

Mono-ADP-ribosylated proteins are often recognised by distinct sets of mono-ADP-ribose "readers"; it seems to be the ADP-ribose unit that is bound in the context of the mono-form, rather than just a terminal ADP-ribose of a chain.

The ADP-ribose of the polymer/monomer may be modified in some way; O-acetyl-ADP-ribose is the one I have encountered. So the definition should probably have an "ADP-ribose or derivative" clause to encompass every flavour available.

Readers of poly(ADP-ribose): designed to be fit for purpose https://academic.oup.com/nar/article/44/3/993/2502682

raymond91125 commented 3 years ago

Actually all 10 annotations' references showed binding experiments to ADP-ribose, the free molecule. In the case of the TRPM2 proteins, the ADP-ribose is used as a gating regulator of a channel function. In the case of RNF146, it binds poly(ADP-ribose) in vitro and binds to PARsylated proteins in cells.


raymond91125 commented 3 years ago

It seems we have a similar situation with SUMO binding vs. SUMO-dependent protein binding. That is, SUMO-dependent protein binding is through the SUMO peptide moiety, and yet these two terms are not connected in the ontology.

hattrill commented 3 years ago

These should be annotated in terms of in vivo function. So for http://europepmc.org/article/MED/27587838, although the in vitro binding is to poly(ADP-ribose), target is the modified protein not free molecule.

For https://www.sciencedirect.com/science/article/pii/S0021925819631236 TRPM2 is sensing free ADP-ribose, so genuinely free.

raymond91125 commented 3 years ago

Good point. Following the examples of other PTMs, we should certainly have an MF term along the lines of "ADP-D-ribose modification-dependent protein binding". What I am concerned about is whether we should have a relationship between this new term and the term GO:0072570 ADP-D-ribose binding. From my understanding of the field, "ADP-D-ribose modification-dependent protein binding" is always via binding of the ribose moiety. Thus, to make annotation and querying easier, I propose that in addition to adding the new term to the modification-dependent protein binding branch, we also link it to ADP-D-ribose binding as an ISA child. But I am also aware that sumoylated protein binding and ubiquitinated protein binding are not directly connected to SUMO binding and ubiquitin binding, respectively. So I am seeking some guidance.

ValWood commented 3 years ago

We should consider if we are going to replicate modification dependent protein binding in GO because

  1. Seems unsustainable once we move to phosphorylation etc,

  2. Is incomplete because specific position is important too for modifications that occur at multiple residues. For example, hht1 (https://www.pombase.org/gene/SPAC1834.04): binds bir1, active form hht1/InitMet-/Phos:(T3); binds swi6, active form hht1/InitMet-/Me:(K9). (OK, we have a precedent for histone readers, but what about all other proteins?)

  3. Does not cover annotations where only the non-modified form is bound/binds, e.g. cdc12 (https://www.pombase.org/gene/SPAC1F5.04c): active form cdc12/UnPhos:(T20,T21,S64,T95,T151,S463) binds cdc15.

It seems more informative and consistent to request the modified protein version of a specific protein and use that in column 17. Am I missing something? Is the idea that we always use modified residue binding + column 17 modified form?

hattrill commented 3 years ago

I would say that the solution lies somewhere in between: a limited number of modified protein binding terms (there are only so many mods), as the recognition of a protein+mod is the molecular function of the binding portion of the protein, i.e. a reader/transducer of signaling marks.

If there is enough info or a need to capture position-specific info, then the appropriate PRO ID could be requested and used as an extension to X-dependent protein binding.

If absolutely everything needs a specific PRO ID, it puts a lot of bioinformatics pressure on the users of this data and presents an activation barrier to potential useful searches e.g. glycosylation-dependent protein binding AND extracellular.
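As an illustration of the kind of search meant here, a minimal Python sketch follows. The gene products and their annotations are invented toy data, and plain labels stand in for GO IDs; "extracellular region" is my assumption for what "extracellular" would map to.

```python
# Hypothetical toy annotations: gene product -> set of term labels.
ANNOTATIONS = {
    "geneA": {"glycosylation-dependent protein binding", "extracellular region"},
    "geneB": {"glycosylation-dependent protein binding", "nucleus"},
    "geneC": {"protein binding", "extracellular region"},
}


def query(annotations, *required):
    """Return gene products annotated with every term in `required`."""
    wanted = set(required)
    return sorted(gp for gp, terms in annotations.items() if wanted <= terms)


# "glycosylation-dependent protein binding AND extracellular"
hits = query(
    ANNOTATIONS,
    "glycosylation-dependent protein binding",
    "extracellular region",
)
print(hits)
```

With a grouping MF term the query is a plain set intersection; if the modification were captured only through per-protein PRO IDs, the user would first have to resolve which PRO entries carry a glycosylation before filtering.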

Committing to using a limited number of MF mod-dependent terms would be very useful and I think, wrt Noctua, it prevents a potential explosion of PRO ID requests to describe a basic signaling pathway.

ValWood commented 3 years ago

Sounds sensible. I haven't been using these terms for phosphorylation-dependent binding; maybe I should rectify that.

raymond91125 commented 3 years ago

Take a conservative step by adding the new term "ADP-D-ribose modification-dependent protein binding" but not connecting it to "ADP-D-ribose binding".

pgaudet commented 3 years ago

Hi @raymond91125

According to @hattrill

> target is the modified protein not free molecule.

So I thought we were going to either change the term labels or obsolete the existing terms?

You can probably merge the PR for the new term, but does that close the issue?

raymond91125 commented 3 years ago

@pgaudet I'm not sure if we should obsolete the ribose binding terms because:

  1. There are proteins that bind the free molecule ("In the case of the TRPM2 proteins, the ADP-ribose is used as a gating regulator of a channel function.")
  2. Poly-ADP-ribose may be used as an adaptor to bring multiple PAR-binding proteins physically together. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6702056/)
  3. Of the 10 experimental annotations, I don't believe any states that in vivo binding of the free ribose molecule is excluded. It would seem reasonable to assume that, given the in vitro binding, in vivo binding is possible if not likely, whether the ribose is free or attached to a polypeptide.

@hattrill Please correct me if I'm wrong. Also do you think more should be done about this ticket?

Thanks.

hattrill commented 3 years ago

So, I think we should distinguish the free form from the form covalently attached to protein. Annotation should be in terms of the in vivo target molecule rather than the in vitro experiment (you can get anything to stick in protein crystals at the right conc.).

See lazy mock up: Screenshot 2021-07-01 at 10 12 19

With the free unit, is it only TRPM2 that binds free ADP-ribose? If so, perhaps we should treat this the same as other channel-gating terms such as "intracellularly ATP-gated ion channel activity": Screenshot 2021-07-01 at 10 15 54

pgaudet commented 3 years ago

Thanks for the summary @hattrill

So I think the action points are

Is this right?

Thanks, Pascale

hattrill commented 3 years ago

Hi @pgaudet

Because of the annotations made with these terms, I would be inclined to:

  1. Rename: GO:0072572 poly-ADP-D-ribose binding -> poly-ADP-D-ribose modification-dependent protein binding (judging from the proteins annotated with this term, this should be fine, as it reflects the biological target)

  2. New Term: mono-ADP-D-ribose modification-dependent protein binding (as we don't have mARPs as far as I can see from non-automated annotation). Papers such as PMID:23473667; PMID:34023495 (review)

  3. Make new parent term 'ADP-D-ribose modification-dependent protein binding' is_a GO:0140030 'modification-dependent protein binding' and move GO:0072572 and new term 'mono-ADP-D-ribose modification-dependent protein binding' under this term.

  4. Merge: GO:0072570 ADP-D-ribose binding and GO:0072571 mono-ADP-D-ribose binding -> GO:0072570 ADP-D-ribose binding. Specify in the definition that this is the free molecule. All manual annotations, apart from PARP9, are Trpm2's. Although this channel is the only example of free ADP-ribose binding in vivo, I wouldn't be surprised if there are other similar sensors.

  5. Reannotation for PARP9 - this is an enzyme (NAD+ ADP-ribosyltransferase activity).

  6. And would support something like: 'ADP-ribose/NAD/pyrimidine nucleotide-gated Ca2+ channel activity' for Trpm2's

Hope that makes sense.
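Points 1 to 4 can be sketched as edits to a toy term table in Python. This is only an illustration of the proposal in this comment: the `NEW:` IDs are hypothetical placeholders, the starting is_a links between the three ribose-binding terms are an assumption, and points 5 and 6 (reannotation and the channel term) are not modelled.

```python
# Toy starting state; is_a links among the ribose-binding terms are assumed.
terms = {
    "GO:0140030": {"label": "modification-dependent protein binding", "is_a": []},
    "GO:0072570": {"label": "ADP-D-ribose binding", "is_a": []},
    "GO:0072571": {"label": "mono-ADP-D-ribose binding", "is_a": ["GO:0072570"]},
    "GO:0072572": {"label": "poly-ADP-D-ribose binding", "is_a": ["GO:0072570"]},
}

NEW_PARENT = "NEW:adpr-dependent"      # hypothetical placeholder ID
NEW_MONO = "NEW:mono-adpr-dependent"   # hypothetical placeholder ID

# 1. Rename GO:0072572.
terms["GO:0072572"]["label"] = (
    "poly-ADP-D-ribose modification-dependent protein binding"
)

# 3. New parent term under GO:0140030.
terms[NEW_PARENT] = {
    "label": "ADP-D-ribose modification-dependent protein binding",
    "is_a": ["GO:0140030"],
}

# 2. New mono term, placed under the new parent.
terms[NEW_MONO] = {
    "label": "mono-ADP-D-ribose modification-dependent protein binding",
    "is_a": [NEW_PARENT],
}

# 3 (continued). Move GO:0072572 under the new parent.
terms["GO:0072572"]["is_a"] = [NEW_PARENT]

# 4. Merge GO:0072571 into GO:0072570, keeping the old ID as an alternate.
terms.pop("GO:0072571")
terms["GO:0072570"]["alt_id"] = ["GO:0072571"]


def children(parent):
    """Direct is_a children of `parent` in the toy table."""
    return sorted(t for t, rec in terms.items() if parent in rec["is_a"])


print(children(NEW_PARENT))
print(children("GO:0072570"))
```

After these edits the free-molecule term GO:0072570 has no children in the toy table, and both PTM-dependent terms sit under the new parent in the modification-dependent protein binding branch.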

pgaudet commented 3 years ago

Yes, thanks for the detailed action points.

raymond91125 commented 3 years ago

PARP9 has already been annotated with NAD+ ADP-ribosyltransferase activity. But it is also regulated by binding to PAR (PMID:28525742).

raymond91125 commented 3 years ago

Added three new terms. Leaving 'free ribose' binding terms separated from 'ribose PTM-dependent protein' binding terms for now, as this scheme follows that of the ubiquitin-like binding terms. Further actions involving merging or obsoleting terms may be considered. Thanks for all the inputs!

pgaudet commented 3 years ago

Can this be closed?

pgaudet commented 3 years ago

See PMID: 30097863 PMID: 20652610

raymond91125 commented 3 years ago

PARG can produce protein-free poly-ADP-ribose: PMID:8125093, "Endoglycosidic cleavage of branched polymers by poly(ADP-ribose) glycohydrolase".