geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

New general classes needed under GO:0043190 ! ATP-binding cassette (ABC) transporter complex #10638

Closed gocentral closed 7 years ago

gocentral commented 11 years ago
  1. Defining ATP-binding cassette (ABC) transporter complex

def: "A complex for the transport of metabolites into and out of the cell, typically comprised of four domains; two membrane-associated domains and two ATP-binding domains at the intracellular face of the membrane, that form a central pore through the plasma membrane. Each of the four core domains may be encoded as a separate polypeptide or the domains can be fused in any one of a number of ways into multidomain polypeptides. In Bacteria and Archaebacteria, ABC transporters also include substrate binding proteins to bind substrate external to the cytoplasm and deliver it to the transporter."
intersection_of: GO:0043234 ! protein complex
intersection_of: part_of GO:0005886 ! plasma membrane
intersection_of: capable_of GO:0042626 ! ATPase activity, coupled to transmembrane movement of substances

(XP def recently added by me)

BUT - what this formal definition lacks is any reference to the protein family that defines ATP-binding cassette transporters. See: http://en.wikipedia.org/wiki/ATP-binding_cassette_transporter "Proteins are classified as ABC transporters based on the sequence and organization of their ATP-binding cassette (ABC) domain(s)"

Can/should we be adding differentia that reference the protein families that components belong to?

intersection_of: has_part ??? ! ABC cassette domain protein # Not sure where this comes from.

In this case, that seems the honest thing to do.

OTOH, I've always thought GO's role was to provide a level of abstraction above the level of sequence similarity. Shouldn't we at least have a general class something like this:

name: ATPase transmembrane transporter complex
intersection_of: GO:0043234 ! protein complex
intersection_of: part_of GO:0005886 ! plasma membrane
intersection_of: capable_of GO:0042626 ! ATPase activity, coupled to transmembrane movement of substances

name: ATP-binding cassette (ABC) transporter complex
intersection_of: GO:0043234 ! protein complex
intersection_of: has_part ??? ! ABC cassette domain protein
intersection_of: part_of GO:0005886 ! plasma membrane
intersection_of: capable_of GO:0042626 ! ATPase activity, coupled to transmembrane movement of substances
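
How the two proposed definitions would relate under autoclassification can be illustrated with a toy structural check: if every clause of the general class's intersection also appears in the ABC class's intersection, the ABC class is inferred to be a subclass. This Python sketch is only an illustration and not how a real OWL reasoner works. The identifier `PLACEHOLDER:ABC_domain_protein` stands in for the unresolved `???` term and is hypothetical.

```python
# Toy structural subsumption over intersection_of clauses (not a real reasoner).
# A clause is (relation, filler); the genus uses the pseudo-relation "is_a".

GENERAL_ATPASE_COMPLEX = frozenset({
    ("is_a", "GO:0043234"),        # protein complex
    ("part_of", "GO:0005886"),     # plasma membrane
    ("capable_of", "GO:0042626"),  # ATPase activity, coupled to transmembrane movement
})

# Hypothetical placeholder for the unresolved "???" ABC-domain protein term.
ABC_COMPLEX = GENERAL_ATPASE_COMPLEX | {
    ("has_part", "PLACEHOLDER:ABC_domain_protein"),
}


def subsumes(parent, child):
    """Return True if every defining clause of `parent` is also a clause of `child`."""
    return parent <= child


print(subsumes(GENERAL_ATPASE_COMPLEX, ABC_COMPLEX))  # general class is the parent
print(subsumes(ABC_COMPLEX, GENERAL_ATPASE_COMPLEX))  # the reverse does not hold
```

The extra `has_part` clause is what keeps the ABC class strictly narrower than the general class.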

Any comments?

  2. Defining SubClasses of "ATP-binding cassette (ABC) transporter complex"

At least some of the subclasses of ATP-binding cassette (ABC) transporter complex specify a very specific composition:

name: enzyme IIA-maltose transporter complex (GO:1990154)
def: "A protein complex consisting of the pentameric maltose transporter complex bound to two enzyme IIA (EIIA) molecules. EIIA is a component of the glucose-specific phosphotransferase system that inhibits maltose transport from the periplasm to the cytoplasm. When EIIA-bound, the maltose transporter remains in the open, inward-facing conformation, which prevents binding of maltose-loaded maltose binding protein (MBP) to the transporter."

Is it really sustainable in the long term to have such specific complex terms? It strikes me that they are without end. Would they ultimately be better handled as annotations? (There was some talk of annotating complexes at the recent meeting.)

Most of the subclasses have definitions with a complicated list of differentia, including what is transported and what type of membrane it is transported across; some also include taxon constraints (e.g. "in gram negative bacteria"). Components are not always specified, but even when they are not, they are sometimes implied by the name, e.g.

id: GO:1990193
name: BtuCD complex
namespace: cellular_component
def: "Protein complex involved in cobalamin (vitamin B12) transport through the plasma membrane. In E. coli, the complex is a tetramer and consists of the cytoplasmic ATPase BtuD homodimer together with the transmembrane BtuC homodimer." [GOC:bhm, PMID:22569249]

(Note - the components seem to be specified in this def in order to give an example rather than as differentia).

In the absence of ways to capture all these differentia, it is much safer to just add necessary conditions for class membership rather than specifying overly broad necessary and sufficient conditions for class membership. This is what I've done for these terms so far, giving autoclassification:

http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/ontology/editors/gene_ontology_xp_write.obo?r1=12428&r2=12430

(see attached screencap)

Note that I've deleted a couple of cases of overly broad XPs:

id: GO:1990193 ! BtuCD complex
intersection_of: GO:0043234 ! protein complex
intersection_of: capable_of_part_of GO:0015889 ! cobalamin transport

id: GO:1990060 ! maltose transport complex
intersection_of: GO:0043234 ! protein complex
intersection_of: capable_of_part_of GO:0015768 ! maltose transport
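
One way to keep the information from an overly broad XP while dropping its sufficiency claim is to rewrite each `intersection_of` line as a plain `is_a` or `relationship` line, so the clauses become necessary conditions only. The Python sketch below assumes stanzas are held as lists of text lines; `weaken_xp` is a hypothetical helper, not part of any GO editing tool, and the thread does not say this is how the deleted XPs were actually handled.

```python
def weaken_xp(stanza_lines):
    """Rewrite intersection_of lines as is_a / relationship lines.

    A genus clause (a bare class ID) becomes is_a; a differentia clause
    (relation followed by class ID) becomes relationship. Other lines
    are passed through unchanged.
    """
    out = []
    for line in stanza_lines:
        tag, _, value = line.partition(": ")
        if tag != "intersection_of":
            out.append(line)
            continue
        # Ignore the trailing "! label" comment when counting tokens.
        tokens = value.split("!")[0].split()
        new_tag = "is_a" if len(tokens) == 1 else "relationship"
        out.append(f"{new_tag}: {value}")
    return out


btucd = [
    "id: GO:1990193 ! BtuCD complex",
    "intersection_of: GO:0043234 ! protein complex",
    "intersection_of: capable_of_part_of GO:0015889 ! cobalamin transport",
]

for line in weaken_xp(btucd):
    print(line)
```

With only `is_a` and `relationship` lines, a reasoner can still place BtuCD under broader defined classes, but no other complex is inferred to be a kind of BtuCD complex.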

All of this leads me to some very general points:

  1. I think we need some general guidance on adding XPs that urges caution, encouraging editors to simply add relationships rather than intersections unless they're really confident they can capture a complete set of necessary and sufficient conditions.
  2. Should we start capturing design patterns that editors can refer to, with the aim of getting consistency in autoclassification? If so, we need to discuss where and how. It might be useful to have a way to attach patterns to key classes that are used for grouping via some specific pattern.
  3. Should we have a policy of adding general classes for protein complexes that do not reference specific components, in order to give users useful intermediate-level classifications? For example, in this case we could have a set of classes defined only by:

     location of complex (e.g. plasma membrane)
     what is transported (e.g. maltose)
     mechanism of transport (e.g. ATPase dependent)
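
A design pattern of this kind could be captured as a small template that fills in the three slots. The Python sketch below is a rough illustration under assumptions: the function `complex_stanza` and the generated class name are hypothetical, and the choice of `part_of`, `capable_of_part_of` and `capable_of` for the three slots simply mirrors the relations used in the stanzas quoted earlier in this ticket.

```python
PROTEIN_COMPLEX = ("GO:0043234", "protein complex")


def complex_stanza(name, location, cargo_transport, mechanism):
    """Build OBO-style stanza text from (id, label) pairs for each slot."""
    lines = [
        f"name: {name}",
        f"intersection_of: {PROTEIN_COMPLEX[0]} ! {PROTEIN_COMPLEX[1]}",
        f"intersection_of: part_of {location[0]} ! {location[1]}",
        f"intersection_of: capable_of_part_of {cargo_transport[0]} ! {cargo_transport[1]}",
        f"intersection_of: capable_of {mechanism[0]} ! {mechanism[1]}",
    ]
    return "\n".join(lines)


stanza = complex_stanza(
    # Hypothetical class name, for illustration only.
    name="plasma membrane ATPase-dependent maltose transporter complex",
    location=("GO:0005886", "plasma membrane"),
    cargo_transport=("GO:0015768", "maltose transport"),
    mechanism=("GO:0042626", "ATPase activity, coupled to transmembrane movement of substances"),
)
print(stanza)
```

Because every class generated this way shares the same clause structure, classes built from the same pattern should classify consistently against each other.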

Comments Please.

Reported by: dosumis

Original Ticket: geneontology/ontology-requests/10443

gocentral commented 11 years ago

BTW - wondering if such a discursive ticket should really live on JIRA, rather than here.

Original comment by: dosumis

gocentral commented 11 years ago

Following assertion of classification, I've noticed two classifications that make it clear that defining "ATP-binding cassette (ABC) transporter complex" without reference to protein families is dangerous:

"sodium:potassium-exchanging ATPase complex" and "hydrogen:potassium-exchanging ATPase complex"

are NOT ABC cassette transporters, but Major facilitator superfamily domain transporters. See this InterPro page for details.

For now I will make a new general class:

name: ATPase dependent transmembrane transporter complex
intersection_of: GO:0043234 ! protein complex
intersection_of: part_of GO:0005886 ! plasma membrane
intersection_of: capable_of GO:0042626 ! ATPase activity, coupled to transmembrane movement of substances

I will weaken the assertions on 'ATPase dependent transmembrane transporter complex' to specify only necessary conditions for class membership. For now I will manually classify ABC complexes, but I think it is worth discussing whether this could be automated via formal references to protein families.
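
The misclassification described in this comment can be reproduced with a toy model of equivalence-based classification. In the Python sketch below, the asserted clause sets for each complex are assumptions made for illustration (the real asserted axioms are in the ontology files), and `PLACEHOLDER:ABC_domain_protein` is a hypothetical stand-in for a protein-family term that does not yet exist.

```python
# Clauses shared by any plasma membrane complex with transport-coupled ATPase activity.
BASE = frozenset({
    ("is_a", "GO:0043234"),        # protein complex
    ("part_of", "GO:0005886"),     # plasma membrane
    ("capable_of", "GO:0042626"),  # ATPase activity, coupled to transmembrane movement
})
ABC_DOMAIN = ("has_part", "PLACEHOLDER:ABC_domain_protein")  # hypothetical term

# Assumed asserted facts per complex, for illustration only.
ASSERTED = {
    "sodium:potassium-exchanging ATPase complex": BASE,
    "hydrogen:potassium-exchanging ATPase complex": BASE,
    "BtuCD complex": BASE | {ABC_DOMAIN},
}


def inferred_members(definition, asserted):
    """Names of complexes whose asserted clauses satisfy every clause of `definition`."""
    return sorted(name for name, clauses in asserted.items() if definition <= clauses)


broad_abc = BASE                    # ABC complex defined without a protein family
narrow_abc = BASE | {ABC_DOMAIN}    # ABC complex with a family-based differentia

print(inferred_members(broad_abc, ASSERTED))   # all three, including the false positives
print(inferred_members(narrow_abc, ASSERTED))  # only the complex asserting the ABC part
```

Under the broad definition the two exchanging ATPase complexes satisfy every clause and are pulled in; adding a family-based clause, or dropping the equivalence and classifying manually, excludes them.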

Original comment by: dosumis

gocentral commented 10 years ago

Original comment by: dosumis

dosumis commented 7 years ago

Basic issues fixed. General issues mentioned are part of long-term ongoing discussion of how to model protein complexes in the GO.