geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

ribosome GO:0005840 intracellular non-membrane-bounded organelle #21143

Open hattrill opened 3 years ago

hattrill commented 3 years ago

Currently, ribosome GO:0005840 is_a intracellular non-membrane-bounded organelle

I know this could be debated, but I think that most researchers would call this a complex. Could we move this under "GO:0032991 protein-containing complex"? Especially as now we are using gp2term rels, located_in seems jarring.

ValWood commented 3 years ago

should be GO:1990904 ribonucleoprotein complex (like the subunits)

Can't it be both an intracellular non-membrane-bounded organelle and a complex?

pgaudet commented 3 years ago

I don't think this is consistent with our rules.

As @hattrill points out, for protein-containing complex the geneproduct2term relation is part_of, while for cellular anatomical structures it's located_in or is_active_in (depending on whether the CC data shows where the gene product is active, as opposed to just a localization assay).

So I don't think a term can be an is_a child of BOTH complex and cellular anatomical entity.
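To make the constraint concrete, here is a minimal sketch of the rule as described in this comment. It is not the actual GO QC code: the function names and the toy is_a graphs (with the hierarchy collapsed to a couple of links) are illustrative assumptions. It looks up which gp2term relations a term would allow from its ancestry and fails when a term sits under both branches.

```python
from typing import Dict, Set

COMPLEX = "GO:0032991"  # protein-containing complex
CAE = "GO:0110165"      # cellular anatomical entity

# gp2term relations per CC branch, as stated in the comment above.
ALLOWED: Dict[str, Set[str]] = {
    COMPLEX: {"part_of"},
    CAE: {"located_in", "is_active_in"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all is_a ancestors of `term`, including the term itself."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def allowed_relations(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the gp2term relations allowed for `term`; fail if it is in both branches."""
    branches = [b for b in ALLOWED if b in ancestors(term, parents)]
    if len(branches) > 1:
        raise ValueError(f"{term} is under both branches; gp2term relation is ambiguous")
    return ALLOWED[branches[0]] if branches else set()


# Toy graphs (hypothetical, intermediate terms omitted).
CURRENT_GRAPH = {
    "GO:0005840": {"GO:0043232"},  # ribosome is_a intracellular non-membrane-bounded organelle
    "GO:0043232": {CAE},
    "GO:1990904": {COMPLEX},       # ribonucleoprotein complex
}
# The proposed dual parentage: ribosome also is_a ribonucleoprotein complex.
DUAL_GRAPH = {**CURRENT_GRAPH, "GO:0005840": {"GO:0043232", "GO:1990904"}}
```

With `CURRENT_GRAPH`, `allowed_relations("GO:0005840", CURRENT_GRAPH)` gives the located_in / is_active_in pair; with `DUAL_GRAPH` the same call raises, which is the conflict being discussed.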

@vanaukenk @cmungall @ukemi @balhoff

hattrill commented 3 years ago

And, just for fun - polysome is_a GO:1990904 ribonucleoprotein complex

pgaudet commented 3 years ago

That is indeed very entertaining :D

On Thu, 18 Mar 2021 at 6:51 PM, Helen Attrill @.***> wrote:

And, just for fun - polysome is_a GO:1990904 ribonucleoprotein complex

mah11 commented 3 years ago

I came across this ticket while investigating some of our annotation warnings. A couple of them trace to the ancestry of GO:0042788 ! polysomal ribosome.

Can't [ribosome] be both an intracellular non-membrane-bounded organelle and a complex?

I don't think this is consistent with our rules.

... for protein-containing complex the geneproduct2term relation is part_of, while for cellular anatomical structures it's located_in or is_active_in

So I don't think a term can be an is_a child of BOTH complex and cellular anatomical entity.

This reasoning looks backwards to me. Surely the links within the ontology should be determined by the actual biology that the ontology represents, and then the gp-to-term relation rules should be adapted if necessary.

Historically the ribosome has sometimes been counted as an organelle because of its size and importance; at the same time, it's unquestionably a complex of RNA and protein. So I think biologists would find ribosome is_a ribonucleoprotein complex AND ribosome is_a organelle intuitive. If this is too sloppy for ontological formality, that's a bit of a loss for biologist-friendliness, but it would be a far better reason for choosing one or the other classification within the ontology than rules about annotations that use the ontology.

Also, as @hattrill noted, ribosome (GO:0005840) and polysome (GO:0005844) aren't handled consistently at present. Although an mRNA is present in a polysome, it's not clear why that makes a polysome count as a ribonucleoprotein complex when a "plain" ribosome doesn't. If ribosome can only be in one branch, the case for "complex" seems slightly stronger.

ValWood commented 3 years ago

I don't see why ribosome can't be both organelle and ribonucleoprotein complex.

polysome is_a ribonucleoprotein complex seems odd to me because it's not a single complex.

mah11 commented 3 years ago

polysome is_a ribonucleoprotein complex seems odd to me because it's not a single complex.

Maybe the idea is that the mRNA connects all the ribosomes up into one humongous complex? Anyway, I agree it's a bit odd but not as odd as ribosome not being is_a RNP ...

ValWood commented 3 years ago

Maybe the idea is that the mRNA connects all the ribosomes up into one humongous complex?

That seems a bit weird. They are only connected by being on the same mRNA (like beads on a string). But they are still discrete complexes.

hattrill commented 3 years ago

I don't see why ribosome can't be both organelle and ribonucleoprotein complex.

One major problem and why it's on our radar as an issue: Now we have gp2term relations, it can't be both - part_of a complex and located_in an organelle.

ValWood commented 3 years ago

Now we have gp2term relations, it can't be both - part_of a complex and located_in an organelle

But lots (most?) of complexes are located in organelles! Maybe part_of complex takes precedence over located_in organelle?

hattrill commented 3 years ago

There's only one slot for the gp2term rel, but I guess I would argue that the protein is part of a complex and the complex is located in the organelle.
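A rough sketch of that idea, under stated assumptions: the `Annotation` class, the `COMPLEX_LOCATION` table, and all labels below are hypothetical placeholders, not GO data structures. The annotation keeps its single relation slot (part_of the complex), and the location is recovered from a separate statement about the complex itself.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    relation: str  # the single gp2term slot
    term: str


# Term-level statement kept outside the annotation: complex located_in organelle.
# Placeholder labels only.
COMPLEX_LOCATION: Dict[str, str] = {"some_complex": "some_organelle"}


def inferred_location(annotation: Annotation) -> Optional[str]:
    """If the gene product is part_of a complex, return where that complex is located."""
    if annotation.relation == "part_of":
        return COMPLEX_LOCATION.get(annotation.term)
    return None


example = Annotation("example_protein", "part_of", "some_complex")
```

Here `inferred_location(example)` returns the organelle without the annotation itself needing a second relation.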

deustp01 commented 3 years ago

It does look like an edge case. For that matter, if one had the patience to list all the components, a chromosome or a nucleolus could be a complex (or a series of complexes for each, with transformations between them).

pgaudet commented 3 years ago

I have put this on the agenda to discuss with the GO editors on our next call, Nov 15th.

ValWood commented 1 year ago

So was there a decision about whether we can add "ribonucleoprotein complex" as a parent to ribosome?

pgaudet commented 1 year ago

@ValWood Because of the Geneproduct2term rules, we cannot have both 'cellular anatomical entity' and 'protein-containing complex' as parents of the same term.

Another way to think about this may be: how do we want to annotate this? Two possible statements:

Is one or the other better?

Thanks, Pascale

ValWood commented 1 year ago

Part_of ribosome sounds more correct. is_active_in ribosome sounds weird...but I don't know the implications

cmungall commented 4 months ago

IMO not having ribosome under organelle is odd. We also seem to be going on vibes here rather than strict criteria. Why not also move (for example) BMC?

How about aligning with ComplexPortal? We will never have a perfect definition of IMBO vs complex that is both actionable and aligns with biologist expectations but we can at least strive to be consistent with other databases.

I don't know their exact guidelines, but it looks like CPX includes subunits but not the whole ribosome (e.g. https://www.ebi.ac.uk/complexportal/complex/CPX-5223 human 40S). So we could have ribosome is_a IMBO, and subunit is_a (RNP) complex.

Note we should always have annotations at the level of the subunit so that helps with the bad vibes people get from using the cc relationships with ribosome

IMO ribosome and its subclasses should all be do-not-annotate. All the direct annotations I see there are problematic. E.g.

deustp01 commented 4 months ago

if one had the patience to list all the components, a chromosome or a nucleolus could be a complex

... and both, like the ribosome, would be intracellular non-membrane-bounded organelles. (I understand that this may not be a particularly helpful comment right now.)

ValWood commented 4 months ago

I don't know if this helps either, but I had a lot of banter with ChatGPT to try to tease out what constitutes a non-membrane-bound organelle.

philosophising with ChatGPT.txt

It seems that permanence is a criterion for ChatGPT (which seems a bit odd to me since the nucleus isn't permanent for organisms that do open mitosis, so the criteria for "organelle" are different for membrane-bound and non-membrane-bound). This almost seems to be a case of biological misappropriation of the useful term "organelle", muddying the waters so that it means "something we can see" (which is bad).

cmungall commented 4 months ago

I also tried ChatGPT. I found it basically reflects an inherent fuzziness here and you can maneuver it into any position (IMO not so different from working with experts to make ontology definitions).

I got it to summarize key differences; these are inherently a bit fuzzy but I think this is reflective of the collective literature on this:

IMO this means you could place many structures in either or both so it's best just going with something that can be easily and consistently applied, and it's useful for pragmatic reasons (e.g. alignment with CPX), and we just enumerate edge cases as inclusion/exclusion lists.

ValWood commented 4 months ago

This seems to be one of the terms that was originally coined through microscopy (we can see it), which is a moving goalpost. None of the other criteria stack up either: the cytoskeleton is currently an organelle but is inherently dynamic. The ribosome is currently an organelle but has a specific task (many smaller complexes have more diverse functions).

In this case, why do we need non-membrane-bound organelle as a grouping term? Is it a useful superclass? I have never found it useful (for the reasons above). None of the criteria are clear. They could all just be "anatomical structures" (intracellular or extracellular). That would make the problem go away; otherwise we will be stuck in a loop forever...

(I don't understand the alignment with CPX problem though)

vanaukenk commented 4 months ago

Out of curiosity, I just took a look at how MeSH handles these concepts. Although they probably don't have it 100% right either, it might be worth looking at MeSH together on an ontology call to see if that can help us better formulate our distinctions in GO. (Although I suspect this might have been done early on in the creation of the CC branch, but don't know for sure.)

MeSH has the concept of a cellular 'structure' that seems to be in between an organelle, e.g. nucleus or Golgi, and smaller (both in size and number of members) protein-containing complexes such as we have in GO, e.g. transcription factor AP-1 complex.

In GO, perhaps MFs could 'occur in' organelles and structures, but not protein-containing complexes.

And FWIW, MeSH considers the ribosome an organelle.

pgaudet commented 4 months ago

One very practical way to assess this is the reason why this came up: in GO-CAM, which statement do we want to make for a CC:

For the chromatin, I cannot think of an MF it would enable, so it seems OK to classify it as a cellular anatomical entity.

for the ribosome: which statement do we want to make:

For me, occurs_in ribosome sounds odd.

This is what the draft guidelines here aim to address:

Thanks, Pascale

cmungall commented 4 months ago

Do we want to make statements about the ribosome or statements about the subunits? See my proposal above

I'm still confused about the wording of the proposed def. Is it function in the broader sense (e.g. translation) or molecular function only?

hattrill commented 3 months ago

I think that @pgaudet's MF def works best for me when trying to distinguish these large structures. We have sometimes thought in terms of 'molecular machines' in GO - a large complex such as a ribosome or F1F0 ATPase is a molecular machine - the components work together to achieve an MF.

Whereas an organelle is a compartment which provides a separate 'environment' in which processes occur.

It may be fuzzy in some areas, but thinking in terms of a GO-CAM makes it easier. I would rather have a ribosome as a complex that does something than a compartment where some things happen.

vanaukenk commented 3 months ago

I share @cmungall 's concern about the definition.

The def refers to 'biological function' and 'molecular function' and also references a GO BP, translation, as an example of something a complex 'enables' which is, in fact, our gp2term relation for a gene/gene product(s) to a GO MF.

"A protein-containing complex should enable a single biological function. For example, the ribosome is a complex because it enables translation (addition of amino acids to a polypeptide chain), even though different subunits may have different specific molecular functions."

From this it seems to me that what we're trying to say is that the subunits of a protein-containing complex may enable different MFs, but those MFs work together to achieve (i.e. are part of) one main GO BP, although I imagine there are cases where other GO BPs are part of that main BP.
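That reading can be written down as a small check. This is only a sketch under assumptions: the function name and the placeholder labels (`MF_a`, `BP_main`, and so on) are hypothetical, and it treats the MF part_of BP links as a plain lookup table. It collects the MFs enabled by all subunits and returns the BPs that every one of them is part_of.

```python
from typing import Dict, Set


def shared_processes(
    subunit_mfs: Dict[str, Set[str]],
    mf_part_of: Dict[str, Set[str]],
) -> Set[str]:
    """Return the BPs that every MF enabled by any subunit is part_of."""
    all_mfs: Set[str] = set().union(*subunit_mfs.values())
    bp_sets = [mf_part_of.get(mf, set()) for mf in all_mfs]
    return set.intersection(*bp_sets) if bp_sets else set()


# Placeholder data: two subunits with different MFs that converge on one BP.
subunit_mfs = {"subunit_1": {"MF_a"}, "subunit_2": {"MF_b"}}
mf_part_of = {"MF_a": {"BP_main"}, "MF_b": {"BP_main", "BP_other"}}
```

Under this sketch, a structure would fit the draft definition of a complex when `shared_processes` returns at least one BP, and an empty result would flag a candidate that does not converge on a single process.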

ValWood commented 4 weeks ago

Can we discuss this on the next editors' call?