Open hattrill opened 3 years ago
should be GO:1990904 ribonucleoprotein complex (like the subunits)
Can't it be both an intracellular non-membrane-bounded organelle and a complex?
I don't think this is consistent with our rules.
As @hattrill points out, for protein-containing complex the geneproduct2term relation is part_of, while for cellular anatomical structures, it's located_in or is_active_in (depending on whether the CC data shows where the gene is active, as opposed to just a localization assay).
So, I don't think a term can be an is_a child of BOTH complex and cellular anatomical entity.
@vanaukenk @cmungall @ukemi @balhoff
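As a rough illustration of the constraint described in this comment, here is a minimal Python sketch. It assumes a hand-written, heavily simplified is_a graph (not loaded from GO) and hypothetical function names; it only shows why a term with ancestors in both branches leaves the gp2term relation ambiguous.

```python
# Toy is_a graph: a hypothetical, simplified subset written by hand for
# illustration. Intermediate GO terms between these nodes are omitted.
IS_A = {
    "ribosome": ["intracellular non-membrane-bounded organelle"],
    "polysome": ["ribonucleoprotein complex"],
    "intracellular non-membrane-bounded organelle": ["cellular anatomical entity"],
    "ribonucleoprotein complex": ["protein-containing complex"],
}

# gp2term relations per branch, as stated in the comment above.
BRANCH_RELATIONS = {
    "protein-containing complex": {"part_of"},
    "cellular anatomical entity": {"located_in", "is_active_in"},
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def allowed_relations(term, is_a):
    """Relations permitted for `term`; raise if it sits under both branches."""
    branches = (ancestors(term, is_a) | {term}) & set(BRANCH_RELATIONS)
    if len(branches) > 1:
        raise ValueError(f"{term!r} is under multiple branches: {sorted(branches)}")
    if not branches:
        return set()
    (branch,) = branches
    return BRANCH_RELATIONS[branch]


if __name__ == "__main__":
    print(sorted(allowed_relations("ribosome", IS_A)))  # ['is_active_in', 'located_in']
    print(sorted(allowed_relations("polysome", IS_A)))  # ['part_of']

    # Giving ribosome a second parent under the complex branch triggers the conflict.
    both = dict(IS_A, ribosome=IS_A["ribosome"] + ["ribonucleoprotein complex"])
    try:
        allowed_relations("ribosome", both)
    except ValueError as err:
        print(err)
```

The sketch treats the two branches as mutually exclusive because that is how the rule is described here; whether the rule itself should change is a separate question.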
And, just for fun - polysome is_a GO:1990904 ribonucleoprotein complex
That is indeed very entertaining :D
I came across this ticket while investigating some of our annotation warnings. A couple of them trace to the ancestry of GO:0042788 ! polysomal ribosome.
Can't [ribosome] be both an intracellular non-membrane-bounded organelle and a complex?
I don't think this is consistent with our rules.
... for protein-containing complex the geneproduct2term relation is part_of, while for cellular anatomical structures, it's located_in or is_active_in
So, I don't think a term can be an is_a child of BOTH complex and cellular anatomical entity.
This reasoning looks backwards to me. Surely the links within the ontology should be determined by the actual biology that the ontology represents, and then the gp-to-term relation rules should be adapted if necessary.
Historically the ribosome has sometimes been counted as an organelle because of its size and importance; at the same time, it's unquestionably a complex of RNA and protein. So I think biologists would find ribosome is_a ribonucleoprotein complex AND ribosome is_a organelle intuitive. If this is too sloppy for ontological formality, that's a bit of a loss for biologist-friendliness, but it would be a far better reason for choosing one or the other classification within the ontology than rules about annotations that use the ontology.
Also, as @hattrill noted, ribosome (GO:0005840) and polysome (GO:0005844) aren't handled consistently at present. Although mRNA is present in a polysome, it's not clear why that makes a polysome count as a ribonucleoprotein complex when a "plain" ribosome doesn't. If ribosome can only be in one branch, the case for "complex" seems slightly stronger.
I don't see why ribosome can't be both organelle and ribonucleoprotein complex.
polysome is_a ribonucleoprotein complex seems odd to me because it's not a single complex.
polysome is_a ribonucleoprotein complex seems odd to me because it's not a single complex.
Maybe the idea is that the mRNA connects all the ribosomes up into one humongous complex? Anyway, I agree it's a bit odd but not as odd as ribosome not being is_a RNP ...
Maybe the idea is that the mRNA connects all the ribosomes up into one humongous complex?
That seems a bit weird. They are only connected by being on the same mRNA (like beads on a string). But they are still discrete complexes.
I don't see why ribosome can't be both organelle and ribonucleoprotein complex.
One major problem and why it's on our radar as an issue: Now we have gp2term relations, it can't be both - part_of a complex and located_in an organelle.
Now we have gp2term relations, it can't be both - part_of a complex and located_in an organelle
But lots (most?) of complexes are located in organelles! Maybe part_of complex takes precedence over located_in organelle?
There's only one slot for the gp2term relation, but I guess I would argue that the protein is part of a complex and the complex is located in the organelle.
It does look like an edge case. For that matter, if one had the patience to list all the components, a chromosome or a nucleolus could be a complex (or a series of complexes for each, with transformations between them).
I put this up for discussion with the GO editors on our next call, Nov 15th.
So was there a decision about whether we can add "ribonucleoprotein complex" as a parent to ribosome?
@ValWood Because of the Geneproduct2term rules, we cannot have both 'cellular anatomical entity' and 'protein-containing complex' as parents of the same term.
Another way to think about this may be: how do we want to annotate this? Two possible statements:
Is one or the other better?
Thanks, Pascale
Part_of ribosome sounds more correct. is_active_in ribosome sounds weird...but I don't know the implications
IMO not having ribosome under organelle is odd. We also seem to be going on vibes here rather than strict criteria. Why not also move (for example) BMC?
How about aligning with ComplexPortal? We will never have a perfect definition of IMBO vs complex that is both actionable and aligns with biologist expectations but we can at least strive to be consistent with other databases.
I don't know their exact guidelines, but it looks like CPX includes subunits but not the whole ribosome (e.g. https://www.ebi.ac.uk/complexportal/complex/CPX-5223 human 40S). So we could have ribosome is_a IMBO, and subunit is_a (RNP) complex.
Note we should always have annotations at the level of the subunit so that helps with the bad vibes people get from using the cc relationships with ribosome
IMO ribosome and its subclasses should all be do-not-annotate. All the direct annotations I see there are problematic. E.g.
if one had the patience to list all the components, a chromosome or a nucleolus could be a complex
... and both, like the ribosome, would be intracellular non-membrane-bounded organelles. (I understand that this may not be a particularly helpful comment right now.)
I don't know if this helps either, but I had a lot of banter with ChatGPT to try to tease out what constitutes a non-membrane-bound organelle.
philosophising with ChatGPT.txt
It seems that permanence is a criterion for ChatGPT (which seems a bit odd to me since the nucleus isn't permanent for organisms that do open mitosis, so the criteria for "organelle" are different for membrane-bound and non-membrane-bound). This almost seems to be a case of biological misappropriation of the useful term "organelle" and muddying the waters to mean "something we can see" (which is bad).
I also tried ChatGPT. I found it basically reflects an inherent fuzziness here and you can manoeuvre it into any position (IMO not so different from working with experts to make ontology definitions).
I got it to summarize key differences; these are inherently a bit fuzzy but I think this is reflective of the collective literature on this:
IMO this means you could place many structures in either or both so it's best just going with something that can be easily and consistently applied, and it's useful for pragmatic reasons (e.g. alignment with CPX), and we just enumerate edge cases as inclusion/exclusion lists.
This seems to be one of the terms that was coined through microscopy originally (we can see it), which is a moving goalpost. None of the other criteria stack up either: the cytoskeleton is currently an organelle but is inherently dynamic. The ribosome is currently an organelle but has a specific task (many smaller complexes have more diverse functions).
In this case, why do we need non-membrane-bound organelle as a grouping term? Is it a useful superclass? I have never found it useful (for the reasons above). None of the criteria are clear. They could all just be "anatomical structures" (intracellular or extracellular). That would make the problem go away; otherwise we will be stuck in a loop forever...
(I don't understand the alignment with CPX problem though)
Out of curiosity, I just took a look at how MeSH handles these concepts. Although they probably don't have it 100% right either, it might be worth looking at MeSH together on an ontology call to see if that can help us better formulate our distinctions in GO. (Although I suspect this might have been done early on in the creation of the CC branch, but don't know for sure.)
MeSH has the concept of a cellular 'structure' that seems to be in between an organelle, e.g. nucleus or Golgi, and smaller (both in size and number of members) protein-containing complexes such as we have in GO, e.g. transcription factor AP-1 complex.
In GO, perhaps MFs could 'occur in' organelles and structures, but not protein-containing complexes.
And FWIW, MeSH considers the ribosome an organelle.
One very practical way to assess this is the reason why this came up: in GO-CAM, which statement do we want to make for a CC:
for the chromatin I cannot think of a MF it would enable, so that seems OK to classify it as a cellular anatomical entity
for the ribosome: which statement do we want to make:
For me, occurs_in ribosome sounds odd.
This is what the draft guidelines here aim to address:
A protein-containing complex should enable a single biological function. For example, the ribosome is a complex because it enables translation (addition of amino acids to a polypeptide chain), even though different subunits may have different specific molecular functions. Large macromolecular structures to which no single function can be assigned, such as the chromatin and the kinetochore, are classified in GO as cellular anatomical entities.
Thanks, Pascale
Do we want to make statements about the ribosome or statements about the subunits? See my proposal above
I'm still confused about the wording of the proposed def. Is it function in the broader sense (e.g. translation) or molecular function only?
I think that @pgaudet's MF def works best for me when trying to distinguish these large structures. We have sometimes thought in terms of 'molecular machines' in GO - a large complex such as a ribosome or F1F0 ATPase is a molecular machine - the components work together to achieve an MF.
Whereas an organelle is a compartment which provides a separate 'environment' in which processes occur.
It may be fuzzy in some areas, but thinking in terms of a GO-CAM makes it easier. I would rather have a ribosome as a complex that does something than a compartment where some things happen.
I share @cmungall's concern about the definition.
The def refers to 'biological function' and 'molecular function' and also references a GO BP, translation, as an example of something a complex 'enables' which is, in fact, our gp2term relation for a gene/gene product(s) to a GO MF.
"A protein-containing complex should enable a single biological function. For example, the ribosome is a complex because it enables translation (addition of amino acids to a polypeptide chain), even though different subunits may have different specific molecular functions."
From this it seems to me that what we're trying to say is that the subunits of a protein-containing complex may enable different MFs, but those MFs work together to achieve (i.e. are part of) one main GO BP, although I imagine there are cases where other GO BPs are part of that main BP.
Can we discuss this on the next editors call?
Currently, ribosome GO:0005840 is_a intracellular non-membrane-bounded organelle
I know this could be debated, but I think that most researchers would call this a complex. Could we move this under "GO:0032991 protein-containing complex"? Especially as now we are using gp2term rels, located_in seems jarring.