geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

rename: DNA topoisomerase II complex term #15575

Closed dfrank9 closed 5 years ago

dfrank9 commented 6 years ago
ukemi commented 6 years ago

I think what needs to be done in this case is a tidying up of the definition of the existing type IV term. The definition makes it sound like it is restricted to bacteria, but I think the enzymatic activity is the same. If we do need to make a new term, we can't use the same name.

krchristie commented 6 years ago

The paper cited specifically indicates that the topoisomerase being characterized is a type II DNA topoisomerase, not a type IV DNA topoisomerase.

Liu et al. (2) and Stetler et al. (3) have made the important finding that certain of the DNA-delay genes of bacteriophage T4 code for a new type II DNA topoisomerase, confirming a prediction of McCarthy (4) that these gene products form an enzyme with gyrase-like activity. This DNA topoisomerase is ATP-dependent and relaxes either negative or positive supercoils. T4 DNA topoisomerase activity in extracts was shown to be dependent on the functioning of three T4 DNA-delay genes, 39, 52, and 60 (2). However, while p39 and p52 were shown to be part of the purified enzyme complex (2, 3), the gene 60 product was not clearly identified nor was it unequivocally shown to be part of the complex.

We have MF terms for type II DNA topoisomerases, but I don't see a CC term.

dfrank9 commented 6 years ago

NTR: DNA topoisomerase II complex

(Thank you for pointing out the different type of topoisomerase, I missed that)

bmeldal commented 6 years ago

Parentage: is_a GO:1902494 catalytic complex and capable_of GO:0061505 DNA topoisomerase II activity

We are missing a grouping term for DNA topoisomerase complex. I noticed that DNA topoisomerase activity is split into ATP-hydrolysing and ATP-independent activities so the grouping term would have to be ATP-agnostic.

Birgit

krchristie commented 6 years ago

Thanks for the input @bmeldal It's very helpful!

Regarding complex terms, this ticket now has requests for these two new terms:

and there are already two existing DNA topo complex terms:

but considering the existing terms in MF, I'm not sure where a new term for 'DNA topoisomerase II complex' should be placed relative to the two existing CC terms, or if it's even appropriate to have a term for a 'DNA topoisomerase II complex', when there are two different types of DNA topoisomerase II activities.

Here's what we have in MF for DNA topo activities:

[Image: 20180418-dnatopocomplexes]

bmeldal commented 6 years ago

I've not worked on Topos myself so can't tell you where the functional separation has to be made, by the numeral distinction or whether they are ATP-hydrolyzing or ATP-independent.

krchristie commented 5 years ago

I've been looking into this again. Here is a really nice concise summary of Type II topoisomerases:

Bax BD, Murshudov G, Maxwell A, Germe T. DNA Topoisomerase Inhibitors: Trapping a DNA-Cleaving Machine in Motion. J Mol Biol. 2019 Jul 10. pii: S0022-2836(19)30432-2. doi: 10.1016/j.jmb.2019.07.008. Review. PMID:31301408.

Type II DNA topoisomerases are divided into sub-types, II A and IIB [1, 5] based on structural and evolutionary considerations. Type IIA are found in bacteria and eukaryotes, whereas IIB were discovered in archaea and more recently in plants and plasmodial parasites. Most bacteria have two type IIA topoisomerases, DNA gyrase and topoisomerase IV. DNA gyrase consists of two copies of GyrA and two copies of GyrB and functions as an A2B2 heterotetramer (Fig. 1). Topoisomerase IV has two homologous subunits, ParC and ParE, and also functions as a heterotetramer. DNA gyrase can uniquely introduce negative supercoils into DNA, while topoisomerase IV performs strand passage with two different double-stranded DNA segments and has both decatenation and relaxation activity. Eukaryotic type IIA topoisomerases are encoded as a single protein, with regions equivalent to GyrB and GyrA at the N-and C-terminus of a single subunit (Fig. 1). Residues at the DNA cleavage catalytic center are conserved between the eukaryotic and prokaryotic type IIA topoisomerases.

The table from https://en.wikipedia.org/wiki/Topoisomerase#Classes is consistent with this, except that I think the table has an error: it labels the multimericity of the H. sapiens topo II-alpha and topo II-beta as heterodimers instead of homodimers. Both the paragraph quoted above from the very recent review and the other wiki page on "DNA topoisomerases" [https://en.wikipedia.org/wiki/DNA_topoisomerase] say that homodimers are the structure of eukaryotic type II topoisomerases.

Neither the review I quote nor either of the two wiki pages on topoisomerases discusses bacteriophage type II topoisomerase. While the 1983 paper describing the purification of the bacteriophage type II topoisomerase (PMID: 6296073) talks about it being a trimer, it also admits that the authors aren't really sure that it is a trimer:

However, while p39 and p52 were shown to be part of the purified enzyme complex (2, 3), the gene 60 product was not clearly identified nor was it unequivocally shown to be part of the complex.

I looked for a recent paper discussing bacteriophage T4 topo II and found this:

Todd GC, Walter NG. Secondary structure of bacteriophage T4 gene 60 mRNA: implications for translational bypassing. RNA. 2013 May;19(5):685-700. doi: 10.1261/rna.037291.112. Epub 2013 Mar 14. PMID:23492219

Escherichia coli bacteriophage isolates in the T-even family contain a highly conserved type II DNA topoisomerase whose subunits are encoded by genes 52 and 39 (Hatfull 2008; Cresawn et al. 2011). In contrast, in bacteriophage T4 (T4) the conserved open reading frame (ORF) of gene 39 is disrupted by the insertion of an ∼1000-bp mobile DNA element. This insertion splits the N-terminal portion of the conserved gene 39 ORF from its C-terminal portion, which is encoded by the new T4 gene 60 (Fig. 1A; Hatfull 2008; Bonocora et al. 2011; Cresawn et al. 2011).

Thus, while it seems that it is technically true that phage T4 topo II contains three different gene products, it is still basically conserved with bacterial topo II's, which are normally heterotetramers.

krchristie commented 5 years ago

These are the three CC terms we currently have for topoisomerase complexes:

[Term]
id: GO:0009330
name: DNA topoisomerase complex (ATP-hydrolyzing)
namespace: cellular_component
def: "Complex that possesses DNA topoisomerase (ATP-hydrolyzing) activity." [GOC:mah]
comment: See also the molecular function term 'DNA topoisomerase (ATP-hydrolyzing) activity ; GO:0003918'.
is_a: GO:0032991 ! protein-containing complex
is_a: GO:0044424 ! intracellular part

[Term]
id: GO:0009340
name: DNA topoisomerase IV complex
namespace: cellular_component
def: "A heterodimeric enzyme, which in most bacterial species is composed of two subunits, ParC and ParE. Functions in chromosome segregation and can relax supercoiled DNA." [GOC:jl, PMID:7783632]
is_a: GO:0032991 ! protein-containing complex
is_a: GO:0044444 ! cytoplasmic part

[Term]
id: GO:0140225
name: DNA topoisomerase I-TDRD3 complex
namespace: cellular_component
def: "A protein complex that has DNA topoisomerase type I and RNA topoisomerase activities." [GOC:lnp, PMID:23912945, PMID:28176834]
is_a: GO:0032991 ! protein-containing complex
intersection_of: GO:0032991 ! protein-containing complex
intersection_of: capable_of_part_of GO:0003697 ! single-stranded DNA binding
intersection_of: capable_of_part_of GO:0003725 ! double-stranded RNA binding
intersection_of: capable_of_part_of GO:0003727 ! single-stranded RNA binding
intersection_of: capable_of_part_of GO:0003917 ! DNA topoisomerase type I activity
intersection_of: capable_of_part_of GO:0006265 ! DNA topological change
intersection_of: capable_of_part_of GO:0140226 ! RNA topoisomerase activity
created_by: pg
creation_date: 2018-05-29T10:05:33Z

It looks to me like this one, GO:0009330 - DNA topoisomerase complex (ATP-hydrolyzing), is meant to be a general term for ALL topoisomerase II complexes. All topo II's are complexes that hydrolyze ATP, either heterotetramers (bacterial DNA gyrase, bacterial topo IV, and archaeal topo VI) or homodimers (eukaryotic topo II-alpha & topo II-beta), and ONLY topo II enzymes hydrolyze ATP.

However, we also have this term: GO:0009340 - DNA topoisomerase IV complex, which is a specific subtype of topo II.
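The missing parent-child link between these two terms can be seen by reading the `is_a` lines directly. The following is only a minimal sketch, assuming the stanzas quoted above (abridged here to the tags that matter); `parse_terms` is a hypothetical helper written for this illustration, not part of any GO tooling.

```python
# Stanza text abridged from the terms quoted above (def and comment lines omitted).
OBO_TEXT = """\
[Term]
id: GO:0009330
name: DNA topoisomerase complex (ATP-hydrolyzing)
namespace: cellular_component
is_a: GO:0032991 ! protein-containing complex
is_a: GO:0044424 ! intracellular part

[Term]
id: GO:0009340
name: DNA topoisomerase IV complex
namespace: cellular_component
is_a: GO:0032991 ! protein-containing complex
is_a: GO:0044444 ! cytoplasmic part
"""


def parse_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent_id, ...]}}."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = {"name": "", "is_a": []}
            continue
        if not line or current is None or ": " not in line:
            continue
        tag, value = line.split(": ", 1)
        # Drop the trailing "! label" comment that OBO allows after an ID.
        value = value.split(" ! ", 1)[0].strip()
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            current["is_a"].append(value)
    return terms


terms = parse_terms(OBO_TEXT)
for term_id, term in terms.items():
    print(term_id, term["name"], term["is_a"])

# Topo IV is not yet placed under the general ATP-hydrolyzing complex term.
print("GO:0009330" in terms["GO:0009340"]["is_a"])
```

Both terms currently hang directly off protein-containing complex, so nothing in the ontology says that a topo IV complex is a kind of ATP-hydrolyzing topoisomerase complex.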

krchristie commented 5 years ago

Checking annotations for this term:

id: GO:0009330
name: DNA topoisomerase complex (ATP-hydrolyzing)
namespace: cellular_component
def: "Complex that possesses DNA topoisomerase (ATP-hydrolyzing) activity." [GOC:mah]
comment: See also the molecular function term 'DNA topoisomerase (ATP-hydrolyzing) activity ; GO:0003918'.

Here's a summary of annotations per evidence code for each Panther family listed in the download from AmiGO:

[Image: GO-0009330-annotCounts]

The 5 IMP annotations in PTHR:10290, which is a topoisomerase I enzyme, are all for human TOP1, and I have suggested deleting them.

I also see that the term referred to (GO:0003918) in the comment for GO:0009330 DNA topoisomerase complex (ATP-hydrolyzing) has now been renamed as DNA topoisomerase type II (ATP-hydrolyzing) activity.

Thus, I think we can safely say that this term is supposed to be the complex term for topo II enzymes.

krchristie commented 5 years ago

Since the term GO:0009330 was originally intended to be the complex term for type II topoisomerase complexes, and has been used in that way with only a few errors, I propose that we make the existing term clearer by updating the name and definition, deleting the comment, and adding an equivalence axiom.

We should also add a relationship to indicate that DNA topoisomerase IV complex ; GO:0009340 is_a type of DNA topoisomerase II complex (ATP-hydrolyzing) ; GO:0009330

Update term name
old name: DNA topoisomerase complex (ATP-hydrolyzing)
=> new name: DNA topoisomerase II complex (ATP-hydrolyzing)
OR this instead: DNA topoisomerase II complex

Update definition
def: "Complex that possesses DNA topoisomerase (ATP-hydrolyzing) activity." [GOC:mah]
=> def: Complex that possesses DNA topoisomerase II activity. Topoisomerase II activity is distinguished from topoisomerase I activity in that topoisomerase II cleaves both strands (rather than a single strand) of a double helix and requires ATP binding and hydrolysis. Bacterial topoisomerase IIA enzymes (DNA topoisomerase II, aka DNA gyrase, and DNA topoisomerase IV) are typically A2B2 heterotetramers, though there are some exceptions such as in bacteriophage T4 where one of the subunits is split into two by the insertion of a mobile DNA element. Eukaryotic topoisomerase IIA enzymes (topoisomerases II-alpha and II-beta) are typically homodimers where each monomer contains the equivalent of the bacterial A and B subunits present on the same polypeptide. Type IIB topoisomerases are found primarily in Archaea and plants, but also in some bacteria. Type IIB enzymes typically have B subunits encompassing the ATP-binding site homologous to type IIA enzymes, but have non-homologous A subunits catalyzing DNA cleavage, though some variations exist in the subunit architecture. [PMID:31301408, PMID:24990376, PMID:23492219, GOC:krc, GOC:mah]

Delete stale comment & replace with equivalence axiom
comment: See also the molecular function term 'DNA topoisomerase (ATP-hydrolyzing) activity ; GO:0003918'.
=> propose deleting this comment, which is now out of date with the current term name (DNA topoisomerase type II (ATP-hydrolyzing) activity), which has been updated to include the phrase "type II" since the comment was written. Using an equivalence axiom will keep this relationship up to date.
=> add equivalence axiom: 'protein-containing complex' and ('capable of' some 'DNA topoisomerase type I activity') and ('capable of part of' some 'DNA topological change')

Add relationship to this term
id: GO:0009340
name: DNA topoisomerase IV complex
=> add: is_a "DNA topoisomerase II complex (ATP-hydrolyzing) ; GO:0009330"
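As a rough illustration of what the rename, comment deletion, and added parentage would do to the two stanzas, here is a minimal sketch over a toy in-memory model. The dictionary layout and the `apply_proposal` helper are assumptions made for this example only, and the new name used is just the first of the two options listed above.

```python
import copy

# Toy in-memory model of the two stanzas; only the fields touched by the proposal are kept.
TERMS = {
    "GO:0009330": {
        "name": "DNA topoisomerase complex (ATP-hydrolyzing)",
        "comment": "See also the molecular function term "
                   "'DNA topoisomerase (ATP-hydrolyzing) activity ; GO:0003918'.",
        "is_a": ["GO:0032991", "GO:0044424"],
    },
    "GO:0009340": {
        "name": "DNA topoisomerase IV complex",
        "is_a": ["GO:0032991", "GO:0044444"],
    },
}


def apply_proposal(terms, new_name):
    """Return an edited copy of `terms`; the input is left untouched."""
    updated = copy.deepcopy(terms)
    general = updated["GO:0009330"]
    general["name"] = new_name        # rename the general term
    general.pop("comment", None)      # delete the stale comment
    topo4 = updated["GO:0009340"]
    if "GO:0009330" not in topo4["is_a"]:
        topo4["is_a"].append("GO:0009330")  # topo IV becomes a child of the general term
    return updated


updated = apply_proposal(TERMS, "DNA topoisomerase II complex (ATP-hydrolyzing)")
for term_id, term in updated.items():
    print(term_id, term)
```

The equivalence axiom is left out of this sketch on purpose, since it is a logical definition rather than a plain tag edit.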

@dfrank9 - Assuming we proceed with this course of action, this existing term (GO:0009330) would be an appropriate complex term for you to use. If you feel that you can determine from this paper whether they think it is a DNA gyrase (aka DNA topoisomerase II) or a DNA topoisomerase IV, then you could use a more specific term. We already have a more specific term for "DNA topoisomerase IV complex", but not for "DNA gyrase complex", but in the absence of a request to make a term for "DNA gyrase complex", I'm currently not planning to make one.

Since it's summer and lots of people are on holiday this week, and I'm out next week, I'm going to wait till people are back to check that no one has any objections.

bmeldal commented 5 years ago

Ok, that's making my head spin! types and names are inconsistent for historic reasons :(

First comment:

=> add equivalence axiom: 'protein-containing complex' and ('capable of' some 'DNA topoisomerase type I activity') and ('capable of part of' some 'DNA topological change')

shouldn't that read "and ('capable of' some 'DNA topoisomerase type II activity')" - i.e type II?

General comment on all 3 tickets that you linked/created: Whatever you decide I would consider: a) use the name(s) the user community know b) only create terms that are functionally different. e.g., according to your amended Wiki table in #17661 & #17667 all Type IA Topos have the same function so the complex term should really be just "Topoisomerase type IA complex" with all the species-specific terms being narrow synonyms (except for IIIbeta that has an unknown function and therefore is a headache!). That was the convention we agreed on so not to inflate the ontology too much.

Not sure if that's helpful but that's the best I can come up with now ;-) Birgit

krchristie commented 5 years ago

Ok, that's making my head spin! types and names are inconsistent for historic reasons :(

Yes, mine too! That's actually why I copied the data from the Wikipedia table and reformatted it so that I could see everything at a glance.

First comment:

=> add equivalence axiom: 'protein-containing complex' and ('capable of' some 'DNA topoisomerase type I activity') and ('capable of part of' some 'DNA topological change')

shouldn't that read "and ('capable of' some 'DNA topoisomerase type II activity')" - i.e type II?

Yes, good catch. I copy/pasted from the existing equivalence axiom of the topo I-TDRD3 complex term and forgot to edit that...:(
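For reference, the corrected axiom could be written out as OBO `intersection_of` lines in the same style as the I-TDRD3 stanza. This is a sketch under assumptions: the relation identifier `capable_of` and the choice of GO:0003918 as the type II activity term are my reading of the thread, and `intersection_lines` is a hypothetical helper, not an existing tool.

```python
def intersection_lines(genus, differentia):
    """Render a genus plus (relation, id, label) triples as OBO intersection_of lines."""
    genus_id, genus_label = genus
    lines = [f"intersection_of: {genus_id} ! {genus_label}"]
    for relation, term_id, label in differentia:
        lines.append(f"intersection_of: {relation} {term_id} ! {label}")
    return lines


axiom = intersection_lines(
    ("GO:0032991", "protein-containing complex"),
    [
        # Type II, not type I: this is the correction discussed above.
        ("capable_of", "GO:0003918", "DNA topoisomerase type II (ATP-hydrolyzing) activity"),
        ("capable_of_part_of", "GO:0006265", "DNA topological change"),
    ],
)
print("\n".join(axiom))
```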

General comment on all 3 tickets that you linked/created: Whatever you decide I would consider: a) use the name(s) the user community know b) only create terms that are functionally different. e.g., according to your amended Wiki table in #17661 & #17667 all Type IA Topos have the same function so the complex term should really be just "Topoisomerase type IA complex" with all the species-specific terms being narrow synonyms (except for IIIbeta that has an unknown function and therefore is a headache!). That was the convention we agreed on so not to inflate the ontology too much.

I see how all Type IA topos having the same function is an argument for only having one MF term for type IA topoisomerase activity. But I'm not sure that holds for the complexes, since while the detectable MF, to remove (-) but not (+) supercoils, is the same, the context is not. In addition, the majority of the type IA topo enzymes are monomers, so the only existing complex term for any type IA topo is the "DNA topoisomerase I-TDRD3 complex" term, which needs to have its name and definition refined.

bmeldal commented 5 years ago

I think the problem with making all the specific Topo terms is that they will become composition-specific without any further differences in their relationships unless you can assign different processes to them.

We were trying to assess if we can define all complex terms using logical definitions. We are not there yet but making all these terms won't help. But I don't know the Topos field.

krchristie commented 5 years ago

@bmeldal - I am not currently proposing to make complex terms for all of the topoisomerases. My current proposal is to refine the name and definition of the existing term DNA topoisomerase complex (ATP-hydrolyzing) (GO:0009330) to make it clear that this is the general term for any type II topoisomerase. Following on from that, it is clear that a clean up of the MF terms for type II topoisomerase is needed as outlined here: https://github.com/geneontology/go-ontology/issues/17661

However, we already have a term for topoisomerase IV (in the type IIA subgroup), and I am not planning to propose obsoleting it. I think there is a case for making a term for DNA gyrase (aka topoisomerase II) since we already have a complex term for the other bacterial type IIA topo (topo IV) and there is a clear difference in what bacterial topo II (aka DNA gyrase) and topo IV do. However, I'm only going to do this if it is requested for annotation.

The only other complex term that we have for a topoisomerase is the one for DNA topoisomerase I-TDRD3 complex that I have suggested refinements for in #17667.

I will leave any future topoisomerase complex terms to be made as annotators find a need for them.

bmeldal commented 5 years ago

Ok, getting my head round these:

Type IIA:

We have already curated the 2 E. coli complexes (gyrase and Topo IV) and assigned "GO:0009330 DNA topoisomerase complex (ATP-hydrolyzing)" and "GO:0003918 DNA topoisomerase type II (ATP-hydrolyzing) activity" & "GO:0034335 DNA supercoiling activity" to both. (Once the MF terms are fixed I have to remove the parent-child duplicated annotations!) We must have missed the existence of the Topo IV term.

  • Acc to your table, Topo II (bacs) and Topo IV (bacs) have different activities so justifies different complex terms with relationships to their respective (fixed) functions :)

Action:

  • We need a NEW term for Topo II / gyrase!

GO:0009330 DNA topoisomerase complex (ATP-hydrolyzing)

As the different Topo IIs have different activities it's hard to make a generic Topo II term, though? How are you going to define the generic term? It can't be logically defined as its activities vary depending on the subtype. I am wondering if this term needs obsoleting and GPs need to be reannotated to either NEW DNA topoisomerase II/gyrase or GO:0009340 DNA topoisomerase IV complex.

Type IA: (see also #17667)

We also have DNA topoisomerase I-TDRD3 complex in the CP. My gut feeling is that we can make the GO term a generic Type IA term linked to its MF "GO:0003917 DNA topoisomerase type I activity". There are many enzymes that can be annotated to the same GO function term while the function is carried out in different contexts. I wouldn't let that stop us. Anyway, there is (so far) only one type IA term anyway - until we request the reverse gyrase terms. Sandra is now curating all E. coli complexes.

I also found GO:0031422 RecQ helicase-Topo III complex:

We annotated it in CP for yeast. It's defined in GO as "A complex containing a RecQ family helicase and a topoisomerase III homologue; may also include one or more additional proteins; conserved from E. coli to human." and we annotated it with "GO:0003917 DNA topoisomerase type I activity". Is this effectively also a Type IA Topo?

Sorry, I'm trying to clear this up in my head as much as in the discussion!

krchristie commented 5 years ago

Ok, getting my head round these:

Type IIA:

  • Acc to your table, Topo II (bacs) and Topo IV (bacs) have different activities so justifies different complex terms with relationships to their respective (fixed) functions :)

Action:

  • We need a NEW term for Topo II / gyrase!

Regarding NEW term for DNA gyrase, would you please create a new ticket for this, and any other, new topoisomerase terms you are ready to request right now. It's useful for tracking purposes to separate things into different tickets, so it would be a big help. You can assign it to me immediately since it would be unkind to make any other GO editor get involved in the topoisomerase quagmire!

GO:0009330 DNA topoisomerase complex (ATP-hydrolyzing)

As the different Topo IIs have different activities it's hard to make a generic Topo II term, though? How are you going to define the generic term? It can't be logically defined as its activities vary depending on the subtype.

Regarding MF terms for topoisomerases, I think that it is reasonable to have top level grouping MF terms for topo type I versus topo type II since there are some very distinct differences between these, as defined by whether they cleave a single strand (type I) or both strands (type II), and all type II's are ATP-dependent. For bacterial topo IIA's, DNA gyrase (aka topo II) versus topo IV, it seems relatively easy to define distinct MFs, though we need to make sure that the type I versus type II mechanism makes it into these defs, especially for topo IV since this type II topo, like many type I topos, relaxes (-) supercoils, so the MF is only distinct if it includes the type I vs type II (SS vs DS cut) mechanism as well as the effect on supercoiling.
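The point about topo IV can be made concrete with a small sketch: if an activity is described only by its effect on supercoiling, topo IV and a supercoil-relaxing type I enzyme look identical, and they separate only once the cut mechanism is part of the description. The `TopoActivity` class and its field names are invented for this illustration, and the three example values simply restate what is said in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopoActivity:
    strands_cut: int        # 1 = type I mechanism (SS cut), 2 = type II mechanism (DS cut)
    supercoil_effect: str   # what the enzyme does to supercoils


# Values follow the discussion above; they are illustrative, not curated data.
type_i_relaxing = TopoActivity(strands_cut=1, supercoil_effect="relaxes (-)")
topo_iv = TopoActivity(strands_cut=2, supercoil_effect="relaxes (-)")
gyrase = TopoActivity(strands_cut=2, supercoil_effect="introduces (-)")

# Effect alone cannot tell topo IV from a relaxing type I enzyme.
print(type_i_relaxing.supercoil_effect == topo_iv.supercoil_effect)

# Mechanism plus effect separates all three.
print(len({type_i_relaxing, topo_iv, gyrase}))
```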

It is less obvious to me how to define MFs for vertebrate III-alpha versus III-beta (type IA's) or for vertebrate II-alpha versus II-beta (type IIA's), though I'm not opposed to adding more MF terms where we can clearly define the MF. However, I think we may need to think of the "Function" column from the Wikipedia derived table of topo's as a combination of MF and BP as GO defines them, rather than pure MF.

I am wondering if this term needs obsoleting and GPs need to be reannotated to either NEW DNA topoisomerase II/gyrase or GO:0009340 DNA topoisomerase IV complex.

There are also annotations for mammalian topoisomerase genes as well as for bacterial genes, so at the moment I think that we should NOT obsolete this term since it is appropriate for mammalian topo II genes. As I said above, it isn't clear to me how to define MF terms for vertebrate topo-II-alpha versus topo-II-beta, so it seems better to leave these annotated to the general topo II MF term. I have already requested that the annotations for human TOP1 to this term be removed since I think they are an error.

Type IA: (see also #17667)

We also have DNA topoisomerase I-TDRD3 complex in the CP. My gut feeling is that we can make the GO term a generic Type IA term linked to its MF "GO:0003917 DNA topoisomerase type I activity". There are many enzymes that can be annotated to the same GO function term while the function is carried out in different contexts. I wouldn't let that stop us.

The way this is named with TDRD3 as part of the name seems to imply that this was intended to be specific to that complex, but I didn't see any annotations to this term at all in GO when I looked in AmiGO the other day.

I also found GO:0031422 RecQ helicase-Topo III complex:

We annotated it in CP for yeast. It's defined in GO as "A complex containing a RecQ family helicase and a topoisomerase III homologue; may also include one or more additional proteins; conserved from E. coli to human." and we annotated it with "GO:0003917 DNA topoisomerase type I activity". Is this effectively also a Type IA Topo?

Would you please make a new ticket (again, you can assign it to me) to edit the name and/or synonyms of GO:0031422 RecQ helicase-Topo III complex so that it can be found by searching for "topoisomerase". This seems like a reasonably generic class for a helicase-topoisomerase complex. My reading suggests that there are quite a number of these, but I haven't taken enough notes to have them sorted out in my head.

Also, I am wondering if TDRD3 is a helicase, in which case the existing "DNA topoisomerase I-TDRD3 complex" term would be a subtype of "RecQ helicase-Topo III complex".

Anyway, there is (so far) only one type IA term anyway - until we request the reverse gyrase terms. Sandra is now curating all E. coli complexes.

When you are ready, just make a new ticket to request any reverse gyrase terms. Since they are heterodimers, CC terms could be useful.

Sorry, I'm trying to clear this up in my head as much as in the discussion!

@bmeldal - Don't be sorry! I have to have the reformatted topoisomerase table in view every time I work on this ticket or I make mistakes, one of which you caught! I very much appreciate that you're getting involved with this issue to help sort out what is useful to you for annotation.

I'm on vacation from August 3-10, so I probably won't do anything on this until after I'm back.

krchristie commented 5 years ago

Thinking more about the best name for this term, and also thinking about the two existing MF terms for the type I and type II topoisomerase activities (which I've just suggested renaming in a more consistent and systematic way: https://github.com/geneontology/go-ontology/issues/17734).

I am updating my suggested new name for this term to match my proposal from the MF terms:

Update term name
old name: DNA topoisomerase complex (ATP-hydrolyzing)
=> new name: DNA topoisomerase type II (double strand cut, ATP-hydrolyzing) complex

krchristie commented 5 years ago

Finishing up some details mentioned here: https://github.com/geneontology/go-ontology/issues/17736

Added parentage for two terms for subtypes of type II topo complexes:

DNA gyrase complex and DNA topoisomerase IV complex is_a GO:0009330 DNA topoisomerase II complex (double strand cut, ATP-hydrolyzing)
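A quick way to sanity-check parentage like this is to walk the `is_a` links upward and confirm that both subtypes reach the general term. The sketch below uses a toy parent map: the thread does not give an ID for the DNA gyrase complex term, so it is keyed by name here, and only the links mentioned in this ticket are included.

```python
# Toy is_a map limited to the links discussed in this ticket.
# "DNA gyrase complex" is keyed by name because its ID is not given in this thread.
PARENTS = {
    "DNA gyrase complex": ["GO:0009330"],
    "GO:0009340": ["GO:0009330", "GO:0032991"],  # DNA topoisomerase IV complex
    "GO:0009330": ["GO:0032991"],                # general type II topo complex
    "GO:0032991": [],                            # protein-containing complex
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


for child in ("DNA gyrase complex", "GO:0009340"):
    print(child, "GO:0009330" in ancestors(child, PARENTS))
```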

Added equivalence axiom for this term:

GO:0009330 DNA topoisomerase II complex (double strand cut, ATP-hydrolyzing) is_a protein-containing complex capable_of GO:0003918 DNA topoisomerase type II (ATP-hydrolyzing) activity for now. I think there is nothing wrong with those statements.

clarification of MF term 'DNA supercoiling activity'

Your comment makes me think that we should modify the name of the existing "DNA supercoiling" term to make it clear that it is specific to negative supercoils.

  • I checked existing annotations and ALL were gyrA or gyrB subunits, so this has only been used as expected for DNA gyrase subunits

leave the DNA gyrase activity synonyms for this term to help draw the connection between this activity and DNA gyrase enzymes, but change them to RELATED, or maybe to NARROW.

@bmeldal - let me know if you notice anything I missed

bmeldal commented 5 years ago

Yes, I think that's it - for now ;-)