Closed ValWood closed 6 years ago
The term "transcription ternary complex", the one child term of "protein-DNA-RNA complex", is the one thing I thought of that would be a complex of protein, DNA, and RNA, but it doesn't have any annotations either. That doesn't seem very surprising since I've only heard the term "ternary complex" when I worked in a transcription lab, not in papers doing routine analysis of what transcription factors are important for their favorite gene. Perhaps it would be useful to add "transcription elongation complex" as a synonym for "transcription ternary complex".
Anyway, I can't think of any reason why it would be a problem to make "protein-DNA-RNA complex" a type of "protein-DNA complex".
Hi Karen, how does it relate to the existing grouping term? http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0008023#term=ancchart val
The "ternary elongation complex" is the complex of the RNA polymerase, the DNA template, and the RNA transcript. A "transcription elongation factor complex" might bind to the "ternary elongation complex" to regulate the elongation properties of the RNA polymerase, but a "ternary elongation complex" is not a type of "transcription elongation factor complex".
So which gene products would you annotate to a ternary complex other than the RNA polymerase subunits? why not have it as a related synonym of RNA polymerase?
I agree with David Hill's suggestion
Rename macromolecular complex to be protein-containing complex
Merge protein complex with protein-containing complex
My interest is in the telomerase protein-RNA complex or Telomerase (holoenzyme) complex for which the hierarchy as it currently stands is_a GO:0030529 intracellular ribonucleoprotein complex which is_a GO:1990904 ribonucleoprotein complex which is_a GO:0032991 macromolecular complex.
Yes it would be nice if the hierarchy higher up showed that telomerase falls under 'protein-containing' complex.
wrt step 2, based on discussion during GO eds call this morning.
Why not move 'protein complex' (meaning protein only complex) to be a child of 'protein-containing complex' (was 'macromolecular complex') and make sure that children of 'protein complex' that should really be under protein-RNA or protein-DNA or protein-ligand complex are moved to those parents instead?
What are the cons of doing it this way? Too much work for too little gain?
Tanya,
The reason we are trying to come up with an overall term is because users filter for 'protein complex' expecting to find ALL complexes, incl those containing non-protein participants.
And, the whole class of 'protein complex' is not really consistent. Each time we find an example with a non-protein group, the whole branch gets moved directly under 'macromolecular complex' and users lose it when filtering by 'protein complex'.
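To make the filtering problem concrete, here is a minimal sketch in Python. It is not how AmiGO or QuickGO implement filtering; it is a toy is_a walk over a hand-picked fragment of the hierarchy quoted above. The telomerase key is a plain label (no ID is given in this thread), and the link from 'protein complex' to 'macromolecular complex' is an assumption for illustration.

```python
# Toy is_a graph: child -> list of parents. Only a hand-picked fragment.
IS_A = {
    # Label used as a key because no GO ID is quoted in the thread.
    "telomerase holoenzyme complex": ["GO:0030529"],
    "GO:0030529": ["GO:1990904"],  # intracellular ribonucleoprotein complex
    "GO:1990904": ["GO:0032991"],  # ribonucleoprotein complex
    "GO:0043234": ["GO:0032991"],  # protein complex (parent link assumed)
    "GO:0032991": [],              # macromolecular complex
}


def ancestors(term, graph=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(graph.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(graph.get(parent, []))
    return seen


def is_under(term, root, graph=IS_A):
    """True if `term` is `root` itself or one of its descendants."""
    return term == root or root in ancestors(term, graph)


telomerase = "telomerase holoenzyme complex"
print(is_under(telomerase, "GO:0043234"))  # filter on 'protein complex'
print(is_under(telomerase, "GO:0032991"))  # filter on 'macromolecular complex'
```

In this fragment, a filter rooted at 'protein complex' does not reach telomerase, while a filter rooted at the top-level complex term does.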
ok. here is a "slightly" bonkers idea:
rename macromolecular complex to protein-containing complex (or simply protein complex), and rename current protein complex to protein-only complex
does this not mean that all current complexes would still fall under the correct branch (without more work) and, all complexes (whether protein only, protein plus prosthetic groups, protein plus nucleic acid) will be pulled out if searching for protein(-containing) complexes???
(mind you probably what I suggested is more than 'slightly' bonkers and I am missing something huuuuge)
Nancy, that was pretty much discussed above. I thought we were going to re-name 'macromolecular complex' to 'protein-containing complex'?
However, 'protein-only complex' doesn't work for the current 'protein complex' class, as it still contains many examples of not-only-protein protein complexes (and no one has the time to investigate every term; we fixed them if and when we found them...). Hence why we were asking to obsolete that class and merge it with 'macromolecular complex'.
And there are many children of macromolecular complex that should be children of protein-only complex. So at the end of the day is it worth the work to go through all the direct children of macromolecular (protein-containing) complex and sort them with respect to whether they only contain proteins, and go through the current children of protein (protein-only) complex and try to pull out all the ones that have members that contain more than just protein?
If we think that it is important for our users to find complexes that contain only proteins then we should keep the protein complex class. Perhaps then we should just go through and move all the current children of macromolecular complex that we suspect contain only proteins to protein complex. This will also require the clean up of a lot of axioms that currently are not necessary and sufficient because the genus should be 'protein complex' rather than 'macromolecular complex'. If we then discover that a child of protein complex has members that contain more than just proteins, we create an SOP where we either move it to be a child of macromolecular complex and create new children of the protein-only and other class, or an SOP where we rename the existing complex to be 'protein-only-containing complex X' or a better name if it exists and we create a new 'protein-Somethingelse complex' as a child of the appropriate sub-type of macromolecular complex.
In either case, if we decide it is valuable to have the distinct protein-only class, we need to do a clean-up of all of the current direct children of macromolecular complex. I think the protein-only class would be required for true exhaustiveness in the ontology, just an aside.
My 2c, keep the ontology simple. But at the same time ensure annotations are as complete as possible. We can simply have templated amigo queries like these http://amigo.geneontology.org/grebe to retrieve complexes with non-protein members vs no known non-protein members.
This is a good idea in general, but since we don't annotate the non-protein parts, they aren't available for query. This will work for other cases where we do make distinctions in the annotations. See this ticket for a current discussion:
I think it might be a good strategy to not distinguish GO complexes by membership and bring them to the level of functional conservation. But I waver on this.
The RNA of RNase P enzymes might be annotated (eg, Rpph1; which I will annotate today!).
@bmeldal and @ukemi : thanks for the additional information. No more questions (or objections) from me about the potential merge. That seems like the most pragmatic course of action.
@pgaudet Since you are our expert at merging terms, let's put this on at the top of our list for the ticket workshop. OK?
Hi,
Getting ready to start on this. The action is to merge:
'GO:0043234 protein complex'
'GO:1990904 ribonucleoprotein complex'
'GO:0032992 protein-carbohydrate complex'
'GO:0032993 protein-DNA complex'
'GO:0001114 protein-DNA-RNA complex'
'GO:0032994 protein-lipid complex'
'GO:1990684 protein-lipid-RNA complex'
into 'macromolecular complex'. Correct?
@ukemi @bmeldal @vanaukenk
Thanks, Pascale
I thought we were just going to merge 'protein complex' and all associated terms (regulates, assembly etc) into 'macromolecular complex' terms and rename 'macromolecular complex' to 'protein containing complex'.
OK, so you want to keep the other terms. Are all relevant complexes under the correct parent? (It seems we might have the same issue.) Either way is fine for me.
I am not sure we should rename 'protein-containing complex' - doesn't it depend on what you annotate ?
Hi all,
Above we agreed to rename 'macromolecular complex' to 'protein-containing complex' and to make 'macromolecular complex' an exact synonym. That way users will hopefully find this term when searching for the obsoleted 'protein complex' term (which should have a comment to refer them to 'protein-containing complex').
We agreed to keep the protein-X complex classes and move terms in and out of these specific sub-classes if and when we have new knowledge of non-protein molecules now being members of a complex.
One of the main reasons to do this was because users were searching for protein complex and missing things like ribosomes and then complaining (@ValWood 's standard example :) )
Once this change is done we may want to send out a message to users (tweet?) that this change has occurred, as it's quite significant.
Birgit
PS: I thought you guys are in Denver, have you reached insomnia stage???
PPS: Personally, I'd be happy to simply merge protein complex into macromolecular complex but there were arguments above for keeping the term "protein" in this top-level class name.
"macromolecular complex to protein-containing complex and making macromolecular complex exact synonym" How about 'macromolecular complex' being a 'related' synonym? We cannot say it's exact.
(I'm in Geneva, but I am also worried about David!)
I hope you don't have to work til late at night to keep in touch with them in Denver ;-)
I don't mind what it's called as long as there are synonyms.
I don't think we really need any of the terms to classify complexes by the type of molecule they contain. You could get this information another way if you really wanted it.... you would just query for rRNA or whatever, and then "Macromolecular complex". It's not a distinction I have been aware of users ever wanting to make though...but it would be information that would be pretty trivial to obtain bioinformatically using GO (and much more accurately than using the current annotation).
Hello,
Here's what I did:
MERGED
id: GO:0032984 -name: macromolecular complex disassembly +name: protein-containing complex disassembly +alt_id: GO:0043241 protein complex disassembly
id: GO:0034622 -name: cellular macromolecular complex assembly +name: cellular protein-containing complex assembly +alt_id: GO:0043623 cellular protein complex assembly
id: GO:0065003 -name: macromolecular complex assembly +name: protein-containing complex assembly +alt_id: GO:0006461 protein complex assembly
id: GO:0044877 -name: macromolecular complex binding +name: protein-containing complex binding +alt_id: GO:0032403 protein complex binding
id: GO:0043933 -name: macromolecular complex subunit organization +name: protein-containing complex subunit organization +alt_id: GO:0071822 protein complex subunit organization
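For anyone holding annotations to the merged IDs, the effect of the alt_id lines above can be sketched as a simple lookup. This is only an illustration, assuming annotations are available as (gene, GO ID) pairs; the function names and the gene names in the usage example are hypothetical, and only the ID pairs come from the list above.

```python
# Secondary (merged-away) ID -> surviving primary ID, taken from the list above.
ALT_ID_TO_PRIMARY = {
    "GO:0043241": "GO:0032984",  # protein complex disassembly
    "GO:0043623": "GO:0034622",  # cellular protein complex assembly
    "GO:0006461": "GO:0065003",  # protein complex assembly
    "GO:0032403": "GO:0044877",  # protein complex binding
    "GO:0071822": "GO:0043933",  # protein complex subunit organization
}


def resolve(go_id):
    """Map a merged ID to its primary ID; leave any other ID unchanged."""
    return ALT_ID_TO_PRIMARY.get(go_id, go_id)


def remap_annotations(annotations):
    """Rewrite (gene, go_id) pairs to primary IDs and drop duplicates."""
    return sorted({(gene, resolve(go_id)) for gene, go_id in annotations})


# Hypothetical annotations: geneA was annotated to both the old and new term.
example = [
    ("geneA", "GO:0006461"),
    ("geneA", "GO:0065003"),
    ("geneB", "GO:0032403"),
]
print(remap_annotations(example))
```

After remapping, an annotation made to the old 'protein complex' term and one made to the surviving term collapse into a single entry.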
RENAMED
id: GO:0034367 -name: macromolecular complex remodeling +name: protein-containing complex remodeling
id: GO:0043933 -name: macromolecular complex subunit organization +name: protein-containing complex subunit organization
id: GO:0044877 -name: macromolecular complex binding +name: protein-containing complex binding
id: GO:0065003 -name: macromolecular complex assembly +name: protein-containing complex assembly
id: GO:0097695 -name: establishment of macromolecular complex localization to telomere +name: establishment of protein-containing complex localization to telomere
id: GO:1904913 -name: regulation of establishment of macromolecular complex localization to telomere +name: regulation of establishment of protein-containing complex localization to telomere
id: GO:1904914 -name: negative regulation of establishment of macromolecular complex localization to telomere +name: negative regulation of establishment of protein-containing complex localization to telomere
id: GO:1904915 -name: positive regulation of establishment of macromolecular complex localization to telomere +name: positive regulation of establishment of protein-containing complex localization to telomere
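The renames listed above all follow one pattern, which can be sketched as a plain substring substitution. This is an illustrative check only, not the editing procedure actually used on the ontology; the helper name is hypothetical.

```python
OLD = "macromolecular complex"
NEW = "protein-containing complex"


def rename(label):
    """Apply the rename pattern from this ticket to a single term label."""
    return label.replace(OLD, NEW)


# Labels copied from the RENAMED list above.
labels = [
    "macromolecular complex remodeling",
    "establishment of macromolecular complex localization to telomere",
    "negative regulation of establishment of macromolecular complex localization to telomere",
]
for label in labels:
    print(rename(label))
```

A label that does not contain the old phrase passes through unchanged, so the same helper can be run over a whole term list to spot anything that was missed.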
@bmeldal @ukemi @ValWood
Please let me know if it's OK
I would be happy to further merge 'cellular xxx' if needed ;)
Thanks, Pascale
Thanks, @pgaudet
id: GO:0034622 -name: cellular macromolecular complex assembly +name: cellular protein-containing complex assembly +alt_id: GO:0043623 cellular protein complex assembly id: GO:0065003 -name: macromolecular complex assembly +name: protein-containing complex assembly +alt_id: GO:0006461 protein complex assembly
There are extracellular complexes so I guess the distinguishing terms were developed for that???
@bmeldal A naive question: Are there extracellular assembly factors ? I assumed that these get assembled intracellularly and exported.
I can see the value in 'intracellular protein complex' and 'extracellular protein complex', but not really in their assembly.
(anyhow if the rest seems OK I'll go ahead and merge - please let me know :)
Pascale
No idea, I've not come across it but never looked at it either.
Maybe @deustp01 knows as Reactome curate the assembly steps, CP doesn't.
Hi @pgaudet ,
The above all seem correct. But I thought there were even more terms that had to do with protein complexes, like regulation of protein complex stability, regulation of protein complex disassembly, etc. That's why I've been putting off this ticket for so long. :)
Terms that remain containing 'protein complex' :
DONE GO:0031503 protein complex localization -> rename "protein-containing complex localization"
DONE GO:0034629 cellular protein complex localization -> rename "protein-containing complex localization"
DONE protein complex scaffold activity -> rename "protein-containing complex scaffold activity"?
DONE GO:0090126 protein complex assembly involved in synapse maturation -> rename "protein-containing complex assembly involved in synapse maturation" ?
protein complex oligomerization: 12 EXP by SGD (@srengel) and UniProt (@ggeorghiou) -> Can I perhaps merge into 'protein oligomerization'? WAIT FOR FEEDBACK
'protein complex involved in cell adhesion': 12 direct EXP: dictyBase (@pfey), MGI (@ukemi), IntAct (@bmeldal ) 'protein complex involved in cell-cell adhesion': 2 direct EXP: dictyBase (@pfey) 'protein complex involved in cell-matrix adhesion': = direct annotations.
I think these need to go away.
These should also be obsoleted.
@ukemi @ValWood @bmeldal what do you think ?
Thanks, Pascale
- protein complex scaffold activity -> rename "protein-containing complex scaffold activity"?
Yes, I think so.
I think these need to go away
No objections, but why? I thought biogenesis is a whole branch...
'mitochondrial respiratory chain complex I biogenesis'
Yes, if it's assembly, these terms can be used: https://www.ebi.ac.uk/QuickGO/term/GO:0033108. If it's something else (transcription, etc.), the appropriate expression terms can be used.
@bmeldal 'Biogenesis' for a protein is translation, isn't it? Unless there is something special about the translation of the protein making up these complexes ???
Or were you talking about 'complex involved in process'? This is clearly a dangerous path....
Thanks, Pascale
I could merge x biogenesis into assembly for those, not sure that this is what the annotations were trying to capture. Looks rather like regulation of expression.
I haven't used the biogenesis terms, so better ask those who have. If it's just translation then there shouldn't be any specific x protein biogenesis terms unless something special happens :)
historically biogenesis terms were created for some processes when they knew that the production of something was affected but were not sure whether it was the transcription, translation, assembly etc.
The only strong case for keeping is "ribosome biogenesis" which researchers use to include rRNA processing, assembly and export from the nucleus because some of the steps don't appear to be separable (at least currently).
Well, I suppose that 'biogenesis' of a protein could mean other things besides translation depending on what you were referring to (ie, posttranslational events)
But then we have 'protein modification.... '
which, again, depending on what protein form you are referring to, would be included in biogenesis. It's a fairly broad grouping term.
yes it's historic, we shouldn't need them. If you can't be sure which process , don't make the annotation....
WooHoo. I agree with @bmeldal this is quite a big change, maybe a post on go friends and the consortium list just as a heads up?
Thanks @pgaudet for taking this on. It was a very complicated merge/rename.
No problem!
I created 3 new tickets for the outstanding issues. Closing this one.
Thank you everyone!!! I feel like celebrating! I think I first discussed this topic with the then EBI editors 5 years ago :)
We just had an IntAct/CP release but I will tweet about these changes next week. Leaving our release tweets on top of the news feed for a few days.
95 comments!
No objections from here. We annotate the assembly of a complex to capture distinct functions mediated by the complex at various stages of its assembly, or to capture interactions with other physical entities that affect distinct steps of the assembly process. We treat the assembly process as part of whatever process the complex itself mediates, not as a distinct process in its own right, so these changes in GO should not affect us.
As I can't see the changes until they go public: @pgaudet
Yes and yes
does not have the parent protein complex