geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Bona fide molecular activities that are currently children of 'binding' only #16742

Closed pgaudet closed 3 years ago

pgaudet commented 5 years ago

To look at with @thomaspd

ValWood commented 5 years ago

Also GO:0060090 molecular adaptor activity, and probably a few things under "protein binding" like SNAP receptor activity and protein membrane anchor

ValWood commented 5 years ago

Re: GO:0097617 annealing activity; GO:0000496 base pairing; GO:0008301 DNA binding, bending -> create new parent: nucleic acid remodeling activity?

I don't think this is quite right. These are really only geometric changes (at least the bending in this context). I don't think it's "remodelling", which refers to chromatin.

ValWood commented 5 years ago

GO:0061777 DNA clamp activity is an activity not just binding (it "clamps" DNA molecules).... https://en.wikipedia.org/wiki/DNA_clamp

ValWood commented 5 years ago

GO:0003689 DNA clamp loader activity; GO:0061860 DNA clamp unloader activity -> move under 'protein folding chaperone'? PMID 16082778 states "AAA+ ATPases mediate chaperone-like protein remodeling."

These aren't protein folding chaperones. They do change the protein conformation during loading. They might need their own grouping term. I don't know much about the clamp loaders but we also have cohesin loaders/unloaders.

https://github.com/geneontology/go-ontology/issues/12530

ValWood commented 5 years ago

"sensor activity" would be a very useful term

pgaudet commented 5 years ago

@ValWood WRT DNA clamp: The wiki page states 'A DNA clamp, is a protein fold' (...) A protein fold is a structure, not a function. Don't you think?

pgaudet commented 5 years ago

thanks Midori :)

deustp01 commented 5 years ago

Trying to parse Wikipedia usage, the phrase "protein fold" only occurs in the summary introduction and is neither used nor explained anywhere else in the article, which goes on at length about the relationship between the structure of the clamp, its assembly and disassembly, and the relationship of these to its function.

pgaudet commented 5 years ago

@deustp01 the rest of the first sentence is 'a protein fold that serves as a processivity-promoting factor' - we already have a term for that: 'GO:0030337 DNA polymerase processivity factor activity'

Assembly and disassembly are BPs, not MFs. (but maybe that was your point?)

PD: nothing that subtle, only that the steps of binding and conformational change are intertwined with steps of assembly.

ValWood commented 5 years ago

The activity isn't "protein fold" though (that's not a great part of the description). It binds to dsDNA in a very specific way, so that it can migrate along the molecule. It increases processivity, I think partly by keeping newly synthesised strands together?

pgaudet commented 5 years ago

Can you not use 'GO:0030337 DNA polymerase processivity factor activity'? Is the clamping a different function?

pgaudet commented 5 years ago

or the mechanism ?

ValWood commented 5 years ago

Processivity is one activity, but it also has a clamp activity (holding DNA together). In the same way that cohesin has multiple activities, one being its very special type of DNA binding activity (DNA entrapment, to loosely hold 2 DNA molecules together, but not to clamp them).

mah11 commented 5 years ago

> WRT DNA clamp: The wiki page states 'A DNA clamp, is a protein fold' (...) A protein fold is a structure, not a function. Don't you think?

That protein fold is a feature of proteins that have DNA clamp activity; it's not a description of what the clamp activity does. A clamp encircles DNA, and binds to a DNA polymerase to keep the polymerase associated with the DNA (so binding things is part, but not all, of a clamp's activity).

If the question is about 'clamp loader' and 'clamp unloader' activities, I think the Wikipedia page doesn't really address them, but my understanding is that clamp loading and unloading don't entail "folding"-scale changes.

pgaudet commented 5 years ago

I have less of a problem with the loading and unloading (other than finding them a proper place in the ontology); it's the clamp itself I find quite odd.

ValWood commented 5 years ago

I think we need these functions to model the processes properly.

We went to great lengths to get these 2 types of DNA binding activity differentiated: https://github.com/geneontology/go-ontology/issues/12529

They are quite critical for modelling the processes of cohesin and replication, cohesin particularly, where "entrapment" is the major activity.

pgaudet commented 5 years ago

OK... can you suggest any parent other than binding ? (Otherwise they can also just stay where they are - I was trying to see what terms should also be considered true molecular activities)

mah11 commented 5 years ago

> Processivity is one activity, but it also has a clamp activity (holding DNA together).

I disagree. I'm with Pascale - if there's any difference between "DNA clamp activity" and "DNA polymerase processivity factor activity" it would be some part of the processivity-increasing action that isn't covered by binding DNA. And that may be a finickier distinction than we find useful.

> In the same way that cohesin has multiple activities, one being its very special type of DNA binding activity (DNA entrapment, to loosely hold 2 DNA molecules together, but not to clamp them).

This isn't like the cohesin situation - a sliding clamp encircles one DNA duplex (which is "held together" by the usual hydrogen bonds between strands). The association with DNA is a bit special in that the sliding clamp, well, slides along DNA rather than staying put in one location.

If you do want to distinguish "DNA clamp activity" from "DNA polymerase processivity factor activity", then I think the way to do it is to define "DNA clamp activity" as the type of DNA binding that encircles one DNA duplex and allows sliding, and then define the processivity factor activity as the combination of clamp-style binding and protein binding that prevents the polymerase from dissociating from DNA, thereby increasing polymerase processivity. That implies the link "DNA polymerase processivity factor activity" has_part "DNA clamp activity". (I wouldn't mind merging, though; in the replication field people will be satisfied with the "DNA polymerase processivity factor activity" term.)
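To make the suggested modelling concrete, here is a minimal sketch in Python of the two terms linked by has_part. The `Term` class, its fields and the `all_parts` helper are hypothetical and only for illustration; the definition strings paraphrase the wording suggested above and are not the actual GO definitions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical, simplified stand-in for an ontology term."""
    go_id: str
    label: str
    definition: str
    has_part: list[Term] = field(default_factory=list)

    def all_parts(self) -> set[str]:
        """Return the IDs of every term reachable through has_part links."""
        found: set[str] = set()
        stack = list(self.has_part)
        while stack:
            part = stack.pop()
            if part.go_id not in found:
                found.add(part.go_id)
                stack.extend(part.has_part)
        return found


# Paraphrased from the proposal above; not the official definitions.
clamp = Term(
    "GO:0061777",
    "DNA clamp activity",
    "DNA binding that encircles one DNA duplex and allows sliding.",
)
processivity = Term(
    "GO:0030337",
    "DNA polymerase processivity factor activity",
    "Clamp-style DNA binding plus protein binding that keeps the "
    "polymerase from dissociating from DNA.",
    has_part=[clamp],
)
```

Under this sketch, an annotation to the processivity term implies the clamp-style binding, while an annotation to the clamp term alone says nothing about processivity.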

mah11 commented 5 years ago

p.s. in case it wasn't clear above, a DNA sliding clamp doesn't hold DNA together, it holds a polymerase on to DNA (and I'm not sure clamps encircle DNA "tightly" - they may not be as loose as cohesin but they're not too tight to move along DNA)

ValWood commented 5 years ago

Frank Uhlmann told me that cohesin was much looser than clamps.

Cohesin also needs to slide along the DNA, but the ring is much bigger and looser and encircles 2 DNA molecules (when fully loaded), not one.

ValWood commented 5 years ago

Maybe we don't need the DNA clamp activity if "processivity factor" can be used. But we definitely need the "DNA entrapment activity" for cohesin, and I can't think of a suitable non-DNA binding parent for this either...

pgaudet commented 5 years ago

DNA clamp activity has 2 annotations, hus1 and rad9. Are they processivity factors? They don't have the annotations.

mah11 commented 5 years ago

Hmm, good question; I'd kind of forgotten about the "checkpoint" clamp (aka 9-1-1 complex, from Rad1, Hus1, Rad9). From 10 minutes' worth of checking, it seems to play a role (or roles) in DNA repair that isn't as well characterized as the effect of replication clamps on DNA polymerase processivity. I'm getting this mainly from PMID:25925573, which I just found now.

So maybe it helps make the case for keeping separate terms, and GO:0030337 has_part GO:0061777 ... I'm not sure because I don't know repair anywhere near as well as replication.

ValWood commented 5 years ago

No, this is a PCNA-related "checkpoint clamp complex"

ValWood commented 5 years ago

I don't think we know that the specific type of binding is actually a bona fide activity in the same way that cohesin's is, so maybe these terms aren't required...

Most of the stuff I see is just related to "recruitment"

ValWood commented 5 years ago

I'd be happy for "clamp" to be obsoleted, since for PCNA "processivity factor" is fine and for the other clamps we don't know enough.

I don't mind as long as we keep the GO:0061776 topological DNA entrapment activity for cohesin, so we need a parent for that. I can't think of anything immediately.

We also need an MF term to describe the activity of "recruitment", one day I will open a ticket for that....

pgaudet commented 5 years ago

For GO:0097617 annealing activity; GO:0000496 base pairing; GO:0008301 DNA binding, bending

@thomaspd had suggested in his MF proposal to group these under a new term 'DNA structure modulator' - what do people think?

That would also include topoisomerase and helicase

ValWood commented 5 years ago

I'm not keen. I'm not hugely keen on the groupings we have for transcription, but pragmatically, these are very useful for curators and users to locate the terms relevant to these processes in the ontology, and I like the resulting clean up. It makes everything much easier for users and curators, and will promote annotation consistency.

However, importantly (and why I can live with them) most people in the transcription domain will understand (and may search for), "general transcription factor" and "DNA binding transcription factor" and have a pretty good idea what they expect to retrieve.....(longer term we still need to model the "actual" activities under some of these pragmatic grouping nodes, like kinases and ubiquitin ligases etc, to ensure consistent curation).

I would be wary of introducing grouping terms that are not in general use, because grouping dissimilar activities together in the MF ontology is really against good ontology principles. I'm all for pragmatic solutions, but we shouldn't IMHO make this a general practice, and we definitely shouldn't establish grouping terms which aren't immediately recognizable to biologists.

"hijacked molecular function" comes to mind.

Personally, I would live with them connected to the root node. Often these are terms that get a bit hidden in the function ontology. Bringing them to MF will enable us to see if there really are any common functional connections.

Often we know quite a lot about the function of a gene product, but our best option is a "binding term". We aren't used to thinking about activities outside of the easy to define groups (transporter, enzyme etc). Things might become clearer about how to group these if we could see them at the root node.

pgaudet commented 5 years ago

> "hijacked molecular function" comes to mind.

Will get obsoleted

> I would live with them connected to the root node.

All of these ?

GO:0097617 annealing activity; GO:0000496 base pairing; GO:0008301 DNA binding, bending

That's a lot of terms ! They can also stay under binding if that seems better.

ValWood commented 5 years ago

Well topoisomerase and helicase are enzymes.... but the others maybe they would be better at the root? I don't know....maybe a suitable grouping will then emerge. Sometimes it's just that we don't have the semantics to describe what they are doing molecularly, or we haven't thought about it for long enough...

I have previously questioned the existence of annealing and base pairing... so the old tickets might help to explain what they are functionally... I'll have a dig around tomorrow...

ValWood commented 5 years ago

When annealing is an activity, isn't it enzymatic? https://www.ebi.ac.uk/QuickGO/term/GO:0097617 Do we have examples of non-enzymatic annealing? That wouldn't be gene dependent, would it?

ValWood commented 5 years ago

annealing:
https://github.com/geneontology/go-ontology/issues/12465
https://github.com/geneontology/go-ontology/issues/11126
https://github.com/geneontology/go-ontology/issues/11108
there are more

ValWood commented 5 years ago

condensin is able to convert ss complementary DNA to ds DNA in vitro

https://www.researchgate.net/publication/230600070_Condensins_Universal_organizers_of_chromosomes_with_diverse_functions/figures?lo=1

this appears to be by the displacement of SS-binding proteins https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4281712/pdf/rsob-4-140193.pdf

"In this study, we employed an in vitro approach, building upon our previous study, which demonstrated the ability of the condensin SMC heterodimer Cut3–Cut14 to remove single-stranded (ss) DNA-binding protein RPA or Ssb1, which had been bound to the unwound ssDNAs [12]. As the elimination of protein and/or RNA during the re-annealing reaction per se did not require ATP, we wondered how ATP interacts with condensin’s ATPase domain during the re-annealing..."

so, what is the activity?

ValWood commented 5 years ago

Base-pairing: I could not figure out why this term existed until I drilled down and got to the sensible terms like GO:0000332 template for synthesis of G-rich strand of telomere DNA activity in the DNA branch,

and GO:0030555 RNA modification guide activity and GO:0030557 tRNA modification guide activity.

Recommend obsoleting the "base-pairing" terms (or merging if they have been correctly used),

replacing with a term along the lines of:
nucleic acid template activity
- DNA template activity
- RNA template activity

As the commonality here is providing a specific (guide RNA, codon/anticodon) template for complementary base-pairing, not the base pairing itself.
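As a rough illustration of the proposed grouping, the sketch below records the three suggested labels as is_a links in Python. None of these terms exist yet, so there are no GO IDs; the `IS_A` mapping and the `ancestors` helper are hypothetical names used only for this example.

```python
# Proposed (not yet existing) grouping; labels are taken from the comment above.
IS_A = {
    "DNA template activity": "nucleic acid template activity",
    "RNA template activity": "nucleic acid template activity",
}


def ancestors(label, is_a=IS_A):
    """Walk is_a links upward and return the ancestor labels, nearest first."""
    chain = []
    while label in is_a:
        label = is_a[label]
        chain.append(label)
    return chain
```

Existing children such as the modification guide terms would then need a parent chosen from this small tree, which is left open here because the thread does not settle it.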

pgaudet commented 5 years ago

Moved the base-pairing discussion here: https://github.com/geneontology/go-ontology/issues/16759

Pascale

pgaudet commented 4 years ago

ValWood commented 3 years ago

cell adhesion mediator activity is another "binding only" term

pgaudet commented 3 years ago

Decided to leave annealing activity under 'nucleic acid binding' in discussions for #21614