geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Change definitions or new terms for children of GO:0019783 #5554

Closed gocentral closed 9 years ago

gocentral commented 15 years ago

Several small proteins such as UB, SUMO, and Rub/NEDD need to be proteolytically processed before they can be attached to a target.

This requires the cleavage of a peptide bond within UB, SUMO, etc. (process 1) This processing is done by a series of enzymes that generally seem to be cysteine proteases.

Once these small proteins are attached to a target protein via an isopeptide bond, they can be removed from the target by an isopeptidase activity. (process 2) This process is often carried out by the same enzymes that do the peptide cleavage described in process 1.

Please see the following paper for one example: http://www.plantphysiol.org/cgi/content/full/142/1/318

There are a series of terms that are children of small conjugating protein-specific protease activity (GO:0019783) such as:

SUMO-specific protease activity (GO:0016929) that sound like they could apply to either of these processes. However, the definition states:

Catalysis of the hydrolysis of SUMO, a small ubiquitin-related modifier, from previously sumoylated substrates.

This only relates to process 2. If I want to annotate a protein that has only been shown to act in process 1, then I do not know what molecular activity to assign to it, other than the general term "cysteine-type peptidase activity."

I am not sure if it would be possible to either a) amend the definition to include both process 1 and 2, or b) to create a new term. In some senses I lean toward the former because I think that many enzymes may be involved in process 1 or 2. But, I would be grateful for feedback from the GO curators.

Please note that the definition for "small conjugating protein-specific protease activity" also only applies to process 2: Catalysis of the hydrolysis of various forms of polymeric ubiquitin or ubiquitin-like sequences (e.g. APG8, ISG15, NEDD8, SUMO). Will remove ubiquitin-like sequences from larger leaving groups.

Sincerely, kate

Reported by: kadreher

Original Ticket: "geneontology/ontology-requests/5572":https://sourceforge.net/p/geneontology/ontology-requests/5572

gocentral commented 15 years ago

Here's some text from a discussion of an article that discusses the two different activities (SUMO endopeptidase vs. SUMO deconjugase) a little more explicitly. http://www.plantcell.org/cgi/rapidpdf/tpc.108.058669v1

In Arabidopsis, eight genes have been identified that encode proteins with high similarity to yeast Ulp1 (Kurepa et al., 2003; Murtas et al., 2003; Chosed et al., 2006; Colby et al., 2006). So far, At ULP1a, At ULP1c (OTS2), At ULP1d (OTS1), and ESD4 have been shown to have At SUMO1/2 endopeptidase activity in vitro (Chosed et al., 2006; Colby et al., 2006) (Figure 1). Despite OTS1 having SUMO1 endopeptidase activity in vitro, its role in vivo is largely directed toward the deconjugation of SUMOylated proteins. This SUMO deconjugating role also applies to OTS2, as shown by the increased levels of SUMOylated protein in the corresponding mutant backgrounds (Figure 4), and to ESD4 (Murtas et al., 2003). Since the major roles of OTS1, OTS2, and ESD4 seem to be mainly deconjugative, the question arises of which of these is the major SUMO maturation protease in Arabidopsis. One possibility is that the SUMO endopeptidase activity could be highly redundant and that most (if not all) Arabidopsis SUMO proteases contribute to some extent to proSUMO maturation. An alternative model could be that one or more Arabidopsis SUMO proteases is involved in proSUMO to SUMO maturation. This model applies in yeast, where only two SUMO proteases, Ulp1 and Ulp2, exist. Although Ulp2 has endopeptidase activity in vitro, Ulp1 is largely responsible for proSUMO maturation in vivo (Li and Hochstrasser, 2000).

Original comment by: kadreher

gocentral commented 15 years ago

I'm looking at this item and SF 2245112 together. We recently completed a reorganization of the terms under peptidase activity (GO:0008233) in which we cleared out most of the terms that reflected substrate specificity, mainly because there are very few cases where that specificity can be cleanly defined. Details are available on the GO wiki at

http://wiki.geneontology.org/index.php/Proteases

At that time we didn't deal with GO:0019783 (small conjugating protein-specific protease activity) or its descendants, but it would be consistent with the general peptidase overhaul to change how we handle these activities. Instead of having a bunch of terms organized by which small modifier is cleaved, I suggest having just a few terms in this branch:

small conjugating protein-specific protease activity GO:0019783
--[i] small conjugating protein-specific endopeptidase activity GO:new
--[i] small conjugating protein-specific isopeptidase activity GO:new

I would broaden the definition of GO:0019783 to cover both the isopeptidase and the endopeptidase activities. The existing children of GO:0019783 would be merged into the parent (simply because the procedure is quicker and simpler than making them obsolete) and their names would become narrow-scope synonyms.
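As an illustration only, the merge described above can be sketched in Python. The `Term` class and the `merge_into_parent` helper are hypothetical names, not part of any GO tooling, and keeping the merged ID as a secondary ID is an assumption about how the merge is recorded; the comment itself only says that the child names become narrow-scope synonyms.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Minimal stand-in for an ontology term (hypothetical, for illustration)."""
    id: str
    name: str
    alt_ids: list = field(default_factory=list)
    narrow_synonyms: list = field(default_factory=list)


def merge_into_parent(parent: Term, child: Term) -> Term:
    """Fold a child term into its parent.

    The child's name becomes a narrow-scope synonym of the parent.
    Assumption: the child's ID is retained as a secondary ID so that
    existing annotations still resolve to the parent.
    """
    parent.alt_ids.append(child.id)
    parent.alt_ids.extend(child.alt_ids)
    parent.narrow_synonyms.append(child.name)
    parent.narrow_synonyms.extend(child.narrow_synonyms)
    return parent


parent = Term("GO:0019783", "small conjugating protein-specific protease activity")
child = Term("GO:0016929", "SUMO-specific protease activity")
merge_into_parent(parent, child)
```

After the call, `parent` carries the SUMO term's name as a narrow synonym, which is why a merge loses less information than outright obsoletion.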

We could go even farther and not even keep GO:0019783, and it would be consistent with our earlier work on peptidase terms. In that case, any of the enzymes in question could be annotated to cysteine-type peptidase activity (GO:0008234), isopeptidase activity (GO:0070122), or both, as needed.

Do you have a preference?

cheers, m

Original comment by: mah11

gocentral commented 15 years ago

Dear Midori,

Thank you for your comments. But, I feel a little worried. I will feel very guilty if I am the cause of removing specific terms, such as "SUMO-specific protease activity" which are of great value to the community. I read the GO wiki file on the proteases, and I can appreciate the point about ill-defined specificity. But, I would also like to express my concerns about this. Clearly the decision has already been made and implemented, and I doubt that anything I say will be useful, but I worry about how this revamping of the peptidase terms will hamper the efforts of researchers to make comparisons of protein function and prediction of protein function across kingdoms.

One problem that I have encountered recently as a curator is that scientists who are working to define the biological function and substrate specificity of a protein often fail to discuss its catalytic mechanism. Of course, if you are reading a paper in JBC, this will probably be described in great detail. But, if you are reading a Plant Physiology paper, they often do not mention whether something is a cysteine or aspartate-type peptidase, etc. Of course, sometimes they do mention a particular residue that is part of the active site in their sequence alignment, but this is not always the case. In addition, when we do annotations to GO, it seems to me that the assays we cite should actually support the term we are annotating to. If someone shows that enzyme A cleaves protein X, then can I say that they have shown that enzyme A has "glutamic-type peptidase activity" just because I know it is an active enzyme and other members of its family have been shown to be glutamic-type peptidases? That seems like it deserves an IC more than an IDA. (However, I would feel comfortable giving it an IDA for "protein X peptidase activity.") Given this difficulty, I have been forced at times to annotate to the very generic parent term, e.g. endopeptidase activity. This leads to my second biggest concern.

How useful is it to the users to know that something has "endopeptidase activity" or "cysteine-type peptidase" activity when they are trying to figure out what their enzyme does? I come from a lab that was working on the ubiquitin pathway. If we were trying to map a mutant and we wanted to find a candidate gene, knowing that a gene was annotated with "small conjugating protein-specific isopeptidase activity" (or worse, exopeptidase activity) would be of much less use to us than knowing whether something cleaved SUMO, Ub, RUB, etc. Similarly, if we were doing a microarray experiment and looking to see whether specific genes had been up-regulated, we would definitely want a more specific term.

Also, how will this change affect bioinformaticians? I believe that the sequences of UB, SUMO, and RUB, etc. peptidases are distinct and can be readily separated in a phylogeny. But, if a bioinformatician studying enzymes with different substrate specificities wants to be able to get all of the enzymes that act on substance or class A, how can he or she do that if the most specific terms refer to metallocarboxypeptidase activity, etc.? Or, if someone wants to know whether a particular peptidase class has expanded or contracted in different species, they will often want to know this for a much narrower group of enzymes than all that have "aspartic peptidase activity." The beauty of GO is that they should be able to pull out all the sequences from all kingdoms that have "SUMO endopeptidase activity" so that they can compare them, and I am afraid this ability will be lost under the new system.

One other question: it seems that in other domains, e.g. kinases, glycosyl transferases, hydrolases, etc., that the substrate is specifically included because this gives very valuable information to the user. Are these also in danger of being eliminated?

Anyway, I'm sorry for gushing a bit about this, but, I am concerned about how I will be able to navigate through this peptidase system as a curator, and how users will be able to glean any information from it that will be directly relevant to their research (unless they are biochemists who will be very interested in the mechanism of catalysis). I did not see any of these issues addressed in my skimming through the GO document. If there is a record of these concerns being raised, please let me know so that I can read them.

And, in specific answer to your question, if you are going to get rid of the SUMO, Ub, etc., terms, then please do keep the small conjugating protein-specific protease activity GO:0019783 term and its two children.

Thank you, kate

Original comment by: kadreher

gocentral commented 15 years ago

Ideally (though I know reality often falls short) the changes to the peptidase terms in GO should have very little effect on researchers' or bioinformaticians' ability to compare and predict peptidase substrate specificity, because that information is not really the purview of GO. GO simply can't accommodate all types of biochemical and molecular biological data, or be all things to all users; it has to have a defined scope and we GO curators have to work towards making the actual ontology content consistent with that scope.

Peptidase substrate specificity is properly the concern of peptidase databases, particularly MEROPS (merops.sanger.ac.uk), and PRO, the OBO protein ontology. It would be really nice if PRO were more mature, and I'm not sure how widely known MEROPS is, so I have plenty of sympathy for annotators and end users who have the added hassle of looking in different places for various kinds of information. But I honestly don't think that shoehorning things into GO that don't really belong there is a good long-term solution to the problem of integrating data from different sources.

For kinases, hydrolases, etc. there actually has been a little bit of talk of reducing the amount of substrate specificity that GO tries to capture, but that has been a much lower priority, not least because the classification of those activities in GO, EC, and elsewhere isn't nearly as much of a mess as the peptidases were. For one thing, there are far more "specific" terms that actually represent breakage or formation of different kinds of bonds, whereas peptidases pretty much all catalyze peptide bond cleavage. The closest parallel to peptidases is the restriction endonucleases, where GO has no terms capturing recognition sequence (or cleavage site) specificity, and this is utterly non-controversial.

m

Original comment by: mah11

gocentral commented 15 years ago

Dear Midori,

Thank you very much for taking the time to address my concerns. I guess that I was wrong in my understanding of the "purview of GO." Now I feel more worried than ever about suggesting new terms. I was working under the assumption that GO wanted to define terms as precisely as possible to give the most information about protein MF, BP, and CC. Maybe I can talk to Tanya and Donghui about this so I can understand what GO actually is mandated to accomplish.

But, back to the terms: Are you definitely going to get rid of the SUMO, Ub, etc. protease activities? For users, I think it would be nice to have SUMO, Ub, NEDD, etc. endopeptidase and isopeptidase activities as children of:
--[i] small conjugating protein-specific endopeptidase activity GO:new
--[i] small conjugating protein-specific isopeptidase activity GO:new

But, if this is not part of the GO mandate, then please use these new terms under the umbrella: small conjugating protein-specific protease activity GO:0019783. One other problem is that curators may have to go back to the genes that are already annotated as "SUMO-specific protease activity," etc., to determine which act as endopeptidases and which act as isopeptidases (or both, which will probably generally be the case).
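The re-annotation step mentioned above could look roughly like this. It is a minimal sketch under stated assumptions: the gene names, the evidence labels, and the `split_annotation` helper are all hypothetical, and the proposed terms are referred to by name because they had no IDs at this point in the thread.

```python
LEGACY_TERM = "SUMO-specific protease activity"
ENDO = "small conjugating protein-specific endopeptidase activity"
ISO = "small conjugating protein-specific isopeptidase activity"


def split_annotation(evidence):
    """Map curated evidence labels (hypothetical) to the proposed terms.

    evidence: a set that may contain 'precursor_processing' (process 1)
    and/or 'deconjugation' (process 2).
    """
    terms = []
    if "precursor_processing" in evidence:
        terms.append(ENDO)
    if "deconjugation" in evidence:
        terms.append(ISO)
    # With no usable evidence, keep the legacy term and flag it for a curator.
    return terms if terms else [LEGACY_TERM + " (needs review)"]


# Hypothetical genes currently annotated to the legacy term.
curated = {
    "geneA": {"precursor_processing"},
    "geneB": {"deconjugation"},
    "geneC": {"precursor_processing", "deconjugation"},
    "geneD": set(),
}
reannotated = {gene: split_annotation(ev) for gene, ev in curated.items()}
```

Genes with evidence for both processes end up with both terms, and genes with neither stay flagged rather than being assigned a term automatically.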

Thank you for separating out these two different activities and for all of your other help.

-kate

Original comment by: kadreher

gocentral commented 15 years ago

Dear Kate,

To be honest, I was undecided about the Ub, SUMO, etc. terms at the time that I worked on the rest of the peptidase terms, and I'm still undecided now. On one hand, substrate specificity seems pretty cleanly defined for those enzymes, unlike many of the other peptidases for which we've obsoleted GO terms. On the other hand, I'm a bit worried that if we allow any substrate-specificity-based terms, it's harder to make a robust case for excluding others -- and then we might end up back where we were, with a lot of terms corresponding to gene products whose substrate specificity can't be so precisely defined.

I'm sorry you've borne the brunt of my uncertainty and overthinking ... now that I've got it off my chest, I guess I can sneak the more specific new terms in. There won't be that many, and it's likely that not many other curators will be all that concerned. Just be warned that if I add the specific terms and there are complaints, I might have to rethink, and do some merges later on.

More generally, please keep suggesting new terms, and don't worry too much about GO scope. We'll evaluate every request and let you know if we recommend any modifications to the proposal, or if a given term really wouldn't fit in. I would much rather receive all the suggestions, even horrible ones (not that I think you'd submit any), than have annotators self-censor.

m

Original comment by: mah11

gocentral commented 15 years ago

Hi Kate,

For the moment I've added these:

small conjugating protein-specific endopeptidase activity GO:0070137
small conjugating protein-specific isopeptidase activity GO:0070138
SUMO-specific endopeptidase activity GO:0070139
SUMO-specific isopeptidase activity GO:0070140

If (a) you need more terms for annotating and (b) the ones I've added so far don't cause an outcry, I can add others.
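For readers who want to see how the four terms above would support the cross-species queries raised earlier in the thread, here is a small Python sketch. The is_a links are assumed from the proposal earlier in this discussion (they are not stated with the IDs), and the gene names and annotations are hypothetical.

```python
# child -> parent is_a links. Assumed from the earlier proposal in this
# thread; the comment that lists the IDs does not state them explicitly.
IS_A = {
    "GO:0070137": "GO:0019783",  # small conjugating protein-specific endopeptidase activity
    "GO:0070138": "GO:0019783",  # small conjugating protein-specific isopeptidase activity
    "GO:0070139": "GO:0070137",  # SUMO-specific endopeptidase activity
    "GO:0070140": "GO:0070138",  # SUMO-specific isopeptidase activity
}


def descendants(term_id):
    """Return every term reachable from term_id by following is_a downward."""
    found = set()
    stack = [term_id]
    while stack:
        current = stack.pop()
        for child, parent in IS_A.items():
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


# Hypothetical direct annotations (gene -> set of GO IDs).
ANNOTATIONS = {
    "geneA": {"GO:0070139"},
    "geneB": {"GO:0070140"},
    "geneC": {"GO:0070139", "GO:0070140"},
    "geneD": {"GO:0070138"},
}


def genes_annotated_to(term_id):
    """Genes annotated to term_id directly or to any of its descendants."""
    wanted = descendants(term_id) | {term_id}
    return sorted(g for g, terms in ANNOTATIONS.items() if terms & wanted)
```

Querying the general endopeptidase term picks up genes annotated to the SUMO-specific child as well, so the specific terms add resolution without breaking queries against the broader ones.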

Midori

Original comment by: mah11

gocentral commented 15 years ago

Dear Midori,

Thank you so much for all your help! I hope you know how much we appreciate the work that you GO curators do! I think that you often have to deal with the submissions from me, so I am especially grateful to you. I will continue to submit terms. And, thank you for kindly taking the time to help me understand some of the factors that must be considered when evaluating new (and old) terms.

Have a great weekend!

-kate

Original comment by: kadreher

gocentral commented 15 years ago

OK, there hasn't been any screaming, so I'll close this item.

m

Original comment by: mah11

gocentral commented 15 years ago

Thanks, Midori!

Original comment by: kadreher