geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

macromolecular complex & protein complex #12620

Closed. dosumis closed this issue 8 years ago

dosumis commented 8 years ago

We currently have

image

Protein complex is defined so that it does not encompass ribonucleoprotein complexes:

"A stable macromolecular complex composed (only) of two or more polypeptide subunits along with any covalently attached molecules (such as lipid anchors or oligosaccharide) or non-protein prosthetic groups (such as nucleotides or metal ions). Prosthetic group in this context refers to a tightly bound cofactor. The component polypeptide subunits may be identical."

(Although perhaps this can be made clearer).

This causes problems for functional classification, because functionally defined complex classes have logical definitions with the genus 'protein complex'. For example, many enzymes are ribonucleoprotein complexes, and these currently have incomplete functional classifications (see https://github.com/geneontology/go-ontology/issues/12574#issuecomment-242407081 ).
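As a rough illustration of the problem (a toy sketch, not how the GO build or its reasoner is actually implemented): a logical definition of the form "'protein complex' and capable_of some activity" can only capture complexes that are already classified under 'protein complex'. The complex names and `example activity` below are hypothetical.

```python
# Toy model of genus-differentia classification.
# Each complex lists its is_a ancestors and the activities it is capable_of.
# All complex and activity names here are hypothetical placeholders.
COMPLEXES = {
    "example enzyme complex A": {
        "ancestors": {"protein complex", "macromolecular complex"},
        "capable_of": {"example activity"},
    },
    "example enzyme complex B": {
        # A ribonucleoprotein enzyme: not under 'protein complex'.
        "ancestors": {"ribonucleoprotein complex", "macromolecular complex"},
        "capable_of": {"example activity"},
    },
}


def members(genus, activity):
    """Return the complexes that satisfy: genus AND capable_of activity."""
    return sorted(
        name
        for name, info in COMPLEXES.items()
        if genus in info["ancestors"] and activity in info["capable_of"]
    )


# With genus 'protein complex' the ribonucleoprotein enzyme is missed;
# with genus 'macromolecular complex' both complexes are captured.
print(members("protein complex", "example activity"))
print(members("macromolecular complex", "example activity"))
```

In this toy model, complex B has the activity but is not inferred to belong to the functionally defined class while the genus is 'protein complex'.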

@ValWood objects to the current structure:

many people use the "protein complex" term, and would expect that to retrieve complexes like the ribosome and the spliceosome and telomerase (I suspect)

Is it possible to define a protein complex as a complex which has only proteins, or protein and RNA components?

so protein complex --ribonucleoprotein complex

Would that be crazy? then everything can go under protein complex, unless we know that it has an RNA component, then it moves down...

This way people will retrieve all protein complexes with the protein complex term.

The same applies to protein-DNA complexes such as the telosome, which is currently not retrieved by a "protein complex" search. I doubt there are any biologists who would not describe the telosome as a 'protein complex' (see https://en.wikipedia.org/wiki/Shelterin ),

but you would not currently retrieve it with a protein complex search...
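The retrieval concern can be sketched with a toy is_a hierarchy. This is a minimal illustration, assuming a simplified version of the structure described in this thread; the edge list is not taken from the ontology file, and `descendants` is a hypothetical helper.

```python
def descendants(is_a_edges, root):
    """Return every term reachable from root by following is_a edges downwards."""
    children = {}
    for child, parent in is_a_edges:
        children.setdefault(parent, set()).add(child)
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Simplified (assumed) picture of the current structure: (child, parent) pairs.
CURRENT_EDGES = [
    ("protein complex", "macromolecular complex"),
    ("ribonucleoprotein complex", "macromolecular complex"),
    ("protein-DNA complex", "macromolecular complex"),
    ("ribosome", "ribonucleoprotein complex"),
    ("telosome", "protein-DNA complex"),
]

# A search on 'protein complex' misses the ribosome and the telosome;
# a search on 'macromolecular complex' retrieves both.
print(sorted(descendants(CURRENT_EDGES, "protein complex")))
print(sorted(descendants(CURRENT_EDGES, "macromolecular complex")))
```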

The editors recently discussed this issue. We weren't happy to merge protein complex and macromolecular complex, but came up with the following proposal:

A slightly more radical proposal would be to change the primary name of 'macromolecular complex' to protein-containing complex. (DOS: I'd be happy with this).

Note: for consistency of functional classification, we need to move the genus for all functionally defined complex classes to be 'macromolecular complex'.

If no comments, we'll just go ahead with this fix.

CC @bmeldal @ValWood @hdrabkin @mcourtot @paolaroncaglia

ValWood commented 8 years ago

OK that would work too!

paolaroncaglia commented 8 years ago

Thanks @dosumis . Personally, I'd rather not change the primary name of 'macromolecular complex' to 'protein-containing complex', because of the potential ambiguity (at first sight) between 'protein complex' and 'protein-containing complex'. I think it would be clearer and easier for curators and users alike if we went for your less radical proposal, and use 'protein-containing complex' as a synonym. Everything else is fine with me!

bmeldal commented 8 years ago

I can almost live with that.

2 points:

1) From the def of protein complex

...or non-protein prosthetic groups (such as nucleotides or metal ions)...

We added 'nucleotides' to include DNAs and RNAs or does that only mean single nucleotides like ATP in GO-speak?

2) from suggested def comment for protein complex

protein complex: add user friendly comment along the lines of “these are complexes containing only proteins, if you are looking for a more general term please consider using the parent ‘macromolecular complex’”

I'm pretty sure the 'only proteins' doesn't hold true for quite a few of the leaf nodes, e.g. GO:1990060 maltose transport complex http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:1990060 The def does not mention the maltose but the complex MUST contain maltose as the maltose-binding protein only binds to the core complexes when bound to maltose. Would this need to be moved to macromolecular complex? We would never be able to catch all and move them.

I agree with Paola, we shouldn't re-name the primary name for macromolecular complex.

dosumis commented 8 years ago

1) From the def of protein complex

...or non-protein _prosthetic groups_ (such as nucleotides or metal ions)...

We added 'nucleotides' to include DNAs and RNAs or does that only mean single nucleotides like ATP in GO-speak?

"A cofactor that is tightly or even covalently bound is termed a prosthetic group" https://en.wikipedia.org/wiki/Cofactor_(biochemistry).

bmeldal commented 8 years ago

Ok, thanks. I guess you don't want to relax the def any further then...

hdrabkin commented 8 years ago

We did that for nucleotides (like FAD, NAD) not polynucleotide chains (RNA, DNA)

dosumis commented 8 years ago

I'm pretty sure the 'only proteins' doesn't hold true for quite a few of the leaf nodes, e.g. GO:1990060 maltose transport complex http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:1990060 The def does not mention the maltose but the complex MUST contain maltose as the maltose-binding protein only binds to the core complexes when bound to maltose. Would this need to be moved to macromolecular complex? We would never be able to catch all and move them.

It sounds to me like a borderline case we could safely fudge & keep under 'protein complex'. It's a small molecule tightly bound, even if not strictly a cofactor.

bmeldal commented 8 years ago

Thanks, David. I will keep 'fudging' such cases when they appear ;-)

dosumis commented 8 years ago

label "macromolecular complex"
id "GO:0032991"
definition "A stable assembly of two or more macromolecules, i.e. proteins, nucleic acids, carbohydrates or lipids, in which at least one component is a protein and the constituent parts function together."
has_exact_synonym "protein containing complex"

dosumis commented 8 years ago

Next step - change all

intersection_of: GO:0043234 ! protein complex
intersection_of: capable_of ...

->

intersection_of: GO:0032991 ! macromolecular complex
intersection_of: capable_of ...

This looks a little odd for some cases, but allows for classification to work fine if some complexes with the specified activity are ribonucleoprotein complexes.

dosumis commented 8 years ago

Done. Will commit later today or early tomorrow.

dosumis commented 8 years ago

This, of course, means that the hierarchy under 'protein complex' is quite flat.

image

But the hierarchy under 'macromolecular complex' is now much deeper, with many new direct subclasses:

image

dosumis commented 8 years ago

But the classifications that cross between protein/ribonucleoprotein complex should now be fixed. e.g.

image

Compared to:

image

dosumis commented 8 years ago

@bmeldal The change is done, but I had cold feet about committing without getting the nod from you. Please look back at the last four comments and let me know if it is OK to go ahead and commit.

bmeldal commented 8 years ago

David, apologies for the delay, I was at the Genome Informatics Conference and forgot to reply earlier. I think we should commit it. If it ends up looking odd or unusable we can always revert. The 'protein complex' hierarchy was always quite flat...

@ValWood can you please keep an eye on this and let us know if it loses expected relationships or throws up unexpected ones? Thanks.

dosumis commented 8 years ago

OK. Thanks. Will commit.

ValWood commented 8 years ago

Will do. Val