geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR & issues with GO:0031428 "box C/D snoRNP complex" #11686

Closed: gocentral closed this issue 7 months ago

gocentral commented 9 years ago

Hi,

Today, I was looking at PMID:9418896, which describes cloning and characterization of human RRP9 (aka U3-55K). It specifically says that human RRP9 is part of the U3 snoRNP complex, but not part of the general box C/D snoRNP complexes which function as methylation guides for various RNAs. Note that U3 contains box C/D snoRNA type motifs, but is classed as being involved in pre-rRNA cleavage, not with methylation guide function, along with three other box C/D snoRNAs: U8, U14, and U22 (PMID:9159079).

So, it seems we need some additional terms. Currently, I need a "U3 snoRNP complex" specific term for sure. I don't know much about U8, U14, or U22 specifically to know whether they share proteins with each other or with U3, or if each has its own specific set.

However, note that the current term GO:0031428 "box C/D snoRNP complex" is defined in a way that seems to require the methylation guide activity:

So, we should consider carefully whether the existing term GO:0031428 should be the general term or the more specific term suggested by its definition. Note that currently it seems to be defined more consistently with the name of a more specific proposed term for "methylation guide box C/D snoRNP complex".

thanks,

-Karen

P.S. It's possible that it might be worth having a term specifically for "pre-rRNA cleavage box C/D snoRNP complex", and then having specific terms for snoRNPs with U8, U14, and U22, but as I said earlier, I don't know anything specific about those three. So, what I proposed above seems like the minimum I know right now.

Reported by: krchristie

Original Ticket: geneontology/ontology-requests/11516

gocentral commented 9 years ago

Original comment by: tberardini

ggeorghiou commented 6 years ago

Hi all,

I am reviewing old disputes in Protein2GO and this dispute is still open for us pending the addition of the GO term that Karen has requested. Is there any update on this?

Best, George

pgaudet commented 6 years ago

Hi, I thought the terms to be created were straightforward, but I am not so sure anymore. How many children are we planning to create? For example, in Table 3 of PMID:25879954 there are > 20 different ones; is this what we need?

@krchristie I assigned you since you created the ticket - hopefully that's OK with you.

Thanks, Pascale

krchristie commented 6 years ago

@pgaudet @ggeorghiou - I'll take a look at this once my manuscript is resubmitted, so probably not till next week.

ggeorghiou commented 6 years ago

Thanks Karen! Best of luck with the resubmission!

krchristie commented 6 years ago

Thanks @ggeorghiou !

krchristie commented 6 years ago

@pgaudet - Table 3 of PMID:25879954 lists individual snoRNAs, not complexes. A large number of snoRNAs are the guide RNAs for rRNA or snRNAs, and we have previously decided that we do NOT need separate complexes for each individual guide complex, which differ only by which guide RNA is present. But we may need a little bit more granularity for the box C/D RNPs like we already have for the box H/ACA RNAs.

pgaudet commented 6 years ago

OK, thanks !

krchristie commented 6 years ago

Useful quote from: Pluk H, Soffner J, Lührmann R, van Venrooij WJ. cDNA cloning and characterization of the human U3 small nucleolar ribonucleoprotein complex-associated 55-kilodalton protein. Mol Cell Biol. 1998 Jan;18(1):488-98. PMID:9418896

snoRNPs can be divided into four groups, which appear to be functionally distinct (1, 46). Methylation guide snoRNPs direct the site-specific formation of 2′-O-methyl groups in mature rRNA. All snoRNAs of this class contain two conserved sequence elements, referred to as box C and box D (30), and contain an extended region (10 to 21 nucleotides) of base complementarity to mature rRNA (8, 24, 34, 49). Members of the second group of snoRNAs, which encompasses U3, U8, U14, and U22 snoRNAs, also contain the conserved box C and D elements and are involved in pre-rRNA processing reactions (reference 46 and references therein). All box C- and D-containing snoRNAs, including methylation guide snoRNAs, are associated with the conserved nucleolar protein fibrillarin, which thus is a common snoRNP component (30). Members of the third class of snoRNAs lack the box C and D elements but share another conserved sequence element, referred to as the ACA box (1). Such snoRNAs have been implicated in the site-specific synthesis of pseudouridine in rRNA (15, 33). The last group of snoRNAs consists of only one snoRNA, RNase MRP. RNase MRP is an endoribonuclease involved in the processing of pre-rRNA at site A3 in the internal transcribed spacer 1 (27).

krchristie commented 6 years ago

Note that the previous quote from Pluk et al. 1998 predates this very informative paper about box C/D and box H/ACA RNAs in Cajal bodies, referred to as scaRNAs:

Meier UT. RNA modification in Cajal bodies. RNA Biol. 2017 Jun 3;14(6):693-700. doi: 10.1080/15476286.2016.1249091. Epub 2016 Oct 24. PMID:27775477

krchristie commented 6 years ago

Looking at a number of papers, and also the existing structure of CC terms for the box H/ACA RNP complexes:

-- box H/ACA RNP complex
--- box H/ACA scaRNP complex
--- box H/ACA snoRNP complex
--- box H/ACA telomerase RNP complex

I think that a similar structure should be implemented for the box C/D RNP complexes, though I'd actually like to have a term that is specific* to being a methylation guide, i.e. "box C/D methylation guide snoRNP complex". Having looked at all the papers that are the source of existing experimental annotations to this term, I think that the existing term "box C/D snoRNP complex" (GO:0031428) should become this specific term, as is consistent with its existing definition.

-- box C/D RNP complex (GO:new)
--- box C/D scaRNP complex (GO:new)
--- box C/D methylation guide snoRNP complex (GO:0031428)
--- box C/D U3 RNP complex (GO:new)


*It's possible that the 'box H/ACA snoRNP complex' (GO:0031429) term should become specific to the pseudouridylation guide complexes, though the definition does not currently specify this.

krchristie commented 6 years ago

Since I'd like to change the name of the existing term GO:0031428, currently named 'box C/D snoRNP complex' to match the current definition and be specific to the methylation guide box C/D snoRNPs with the new term name 'box C/D methylation guide snoRNP complex', I wanted to check with @srengel and @ggeorghiou about this.

I've looked at all the papers used for experimental annotations. The majority are appropriate for the existing definition of the term, which is specific to box C/D complexes with methylation guide activity. The papers that have large numbers of SGD annotations used multiple snoRNAs, so making the existing term specific to its current definition is correct for the majority of the annotations; there are only a few that would need to be changed. The papers would also support some additional annotations to some of the new specific terms. There are a few existing annotations that might need to be changed because they don't match the existing definition. However, changing the name of the existing term to match its definition, and creating a new term to be the more general term, will be the more conservative approach, since fewer annotations will need to be changed. Does this seem good @srengel & @ggeorghiou ?

# annotations - contributor
1 - CAFA
1 - GeneDB
1 - GO_Central
1 - RGD
63 - SGD
2 - UniProt

Reference | # annotations
PMID:10094313|SGD_REF:S000064241 | 9
PMID:10733567|SGD_REF:S000043312 | 35
PMID:11081632|SGD_REF:S000064243 | 18
PMID:16908538|SGD_REF:S000136864 | 1

MGI:MGI:5882914|PMID:11842104 | 1

PMID:10679015|RGD:633515 | 2

PMID:11081632 | 1
PMID:17981991 | 1
PMID:9418896 | 1

krchristie commented 6 years ago

In looking at the box C/D RNP complexes, I found several interesting things:

  1. Falaleeva M, Welden JR, Duncan MJ, Stamm S. C/D-box snoRNAs form methylating and non-methylating ribonucleoprotein complexes: Old dogs show new tricks. Bioessays. 2017 Jun;39(6). doi: 10.1002/bies.201600264. Epub 2017 May 15. Review. PubMed PMID: 28505386. This paper talks about how it is now known that box C/D "snoRNAs" have many roles in addition to acting as methylation guides, and that these other non-methylating complexes have different protein compositions than the methylation guide complexes. It also mentions how a "few SNORDs including SNORD3@ and U8 and U13 direct pre-rRNA cleavage".

  2. Kuhn JF, Tran EJ, Maxwell ES. Archaeal ribosomal protein L7 is a functional homolog of the eukaryotic 15.5kD/Snu13p snoRNP core protein. Nucleic Acids Res. 2002 Feb 15;30(4):931-41. PMID: 11842104. This paper indicates that there are homologs of eukaryotic box C/D small nucleolar RNAs (snoRNAs) in Archaea, termed sRNAs. Archaeal homologs of the box C/D snoRNP core proteins fibrillarin, Nop56/58, and the 15.5kD snoRNP protein (aka Snu13) have also been identified.

  3. Meier UT. RNA modification in Cajal bodies. RNA Biol. 2017 Jun 3;14(6):693-700. doi: 10.1080/15476286.2016.1249091. Epub 2016 Oct 24. PubMed PMID: 27775477. This paper talks about the fact that there are RNAs of the "snoRNA" class that are found in Cajal bodies and direct methylation of spliceosomal RNAs (snRNAs), now referred to as scaRNAs.

To accommodate all of this, I need to alter my earlier proposal slightly:

-- box C/D RNP complex (GO:new)
--- box C/D methylation guide RNP complex (GO:new) [for Archaea, euk of unknown location]
---- box C/D methylation guide scaRNP complex (GO:new)
---- box C/D methylation guide snoRNP complex (GO:0031428)
--- box C/D pre-rRNA cleavage RNP complex (GO:new) [appropriate for U3, U8, U13 RNPs]

My previous comment still holds: making the existing term GO:0031428 specific to box C/D methylation guide snoRNP complexes is the most conservative option and will have the least impact on existing annotations.
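
As a rough illustration of the revised hierarchy above, here is a minimal Python sketch that reads the dash-depth outline into child-to-parent links. It is only a sketch: the function names (`parse_outline`, `ancestors`) are made up for this example, "GO:new" is a placeholder rather than a real ID, and the sketch assumes that the number of leading dashes encodes depth.

```python
# Sketch only: parse a dash-indented outline into child -> parent links.
# Assumption: the number of leading dashes encodes depth in the proposal.
# "GO:new" is a placeholder from the proposal, not a real GO identifier.
OUTLINE = """\
-- box C/D RNP complex (GO:new)
--- box C/D methylation guide RNP complex (GO:new)
---- box C/D methylation guide scaRNP complex (GO:new)
---- box C/D methylation guide snoRNP complex (GO:0031428)
--- box C/D pre-rRNA cleavage RNP complex (GO:new)
"""


def parse_outline(text):
    """Return {term_name: parent_name or None} from a dash-depth outline."""
    parents = {}
    stack = []  # (depth, name) pairs for the current chain of ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        dashes, _, rest = line.partition(" ")
        depth = len(dashes)
        name = rest.split(" (GO:")[0].strip()
        # Drop siblings and deeper terms so the top of the stack is the parent.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parents[name] = stack[-1][1] if stack else None
        stack.append((depth, name))
    return parents


def ancestors(parents, name):
    """List the ancestors of a term, nearest first."""
    chain = []
    while parents.get(name) is not None:
        name = parents[name]
        chain.append(name)
    return chain


PARENTS = parse_outline(OUTLINE)
```

With this reading, `ancestors(PARENTS, "box C/D methylation guide snoRNP complex")` gives the methylation guide RNP complex first and the general box C/D RNP complex second, which is the placement proposed for GO:0031428.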

srengel commented 6 years ago

i'm happy to defer to you @krchristie on this one, because you are the expert in this area. i imagine you probably did all the SGD annotations yourself.

since you have been looking at all this, would you be able to let me know which of our 63 annotations would need to be updated to a diff term? that would be super helpful.

krchristie commented 6 years ago

Thanks @srengel. Yes, I did do almost all of the SGD ones, which is why I thought it made most sense for me to check all the annotations myself instead of requesting that other groups do it. I will definitely let you know which SGD annotations should be updated. I can do Protein2GO challenges, or tell me if you prefer some other way to let you know which annotations to update.

srengel commented 6 years ago

thanks @krchristie email or GitHub works for me. :)

ValWood commented 6 years ago

Hi Karen.

@bmeldal and I are proposing to remove the grouping parent term GO:0072588 'box H/ACA RNP complex' because it is grouping functionally unrelated complexes (which happen to have a common subcomplex). I also thought that was what we agreed to do at previous GO meetings.

This seems to be different from what you propose here.

val

krchristie commented 6 years ago

I realize now that I got it the wrong way around, that this ticket is to make additional box C/D complex terms somewhat parallel to the H/ACA complex terms, sorry for any confusion about that.

Considering that one of the main defining features of the box C/D and box H/ACA complexes is structural, a grouping term seems reasonable in that sense.

I object to removing the grouping term for a number of reasons:

  1. In Archaea, there are comparable complexes, but no specialization into nucleolar or Cajal-body types as those cellular structures do not exist, so this grouping term seems to be the only appropriate term for annotation.
  2. I need to make a complex term for a box H/ACA complex that is involved in rRNA processing cleavage reactions, but does not guide pseudouridylation. A similar situation exists for box C/D complexes, where there are complexes that still bind to rRNA and are required for rRNA cleavages that are part of the processing of the rRNA transcript to mature rRNA species, but which are not methylation guides. These complexes involved in cleavage but not modification seem very closely related, as there are some specific snoRNA complexes that control both a cleavage and a modification site.
  3. In the process of researching what needs to be done for box C/D complex terms, and also what had been previously done for the box H/ACA complex terms, I reread a bunch of the yeast papers (for which I am usually the original annotator). I have come to the conclusion that the experiments that allow you to say that any of these RNAs or proteins are part of complexes ONLY provide evidence that they are part of some box H/ACA (or box C/D) complex, but they do NOT provide evidence about what more specific type of complex they are. For ALL of the yeast papers I reread, the ability to say whether a complex is a box H/ACA pseudouridylation complex that targets rRNA is based on two, or three, pieces of evidence:
     -- presence in a complex that contains a box H/ACA snoRNA
     -- knowledge of which rRNA modification(s) is missing when that snoRNA is mutated
     -- frequently, there is also evidence that the proteins have nucleolar (or Cajal body) localization, occasionally for the snoRNAs directly.
     Thus, it seems that the proper way to annotate these would actually be to make the annotation to the general level complex term by IDA, make the other additional annotations mentioned above, and then have some way in a GO-CAM model of a complex to indicate that this combination of annotations means you can say that the complex is a more specific form of a box H/ACA complex.
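
The reasoning in point 3 can be sketched as a small rule that combines the separate pieces of evidence. This is a hypothetical illustration only, not GO-CAM syntax or any existing tool: the class `ComplexEvidence`, its fields, and the function `supported_term` are invented for the example, and the rule covers only the nucleolar rRNA-targeting case described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComplexEvidence:
    """Hypothetical bundle of the evidence types listed in point 3."""
    has_box_h_aca_snorna: bool                    # found in a complex with a box H/ACA snoRNA
    lost_rrna_modification: Optional[str] = None  # rRNA modification missing in the snoRNA mutant
    localization: Optional[str] = None            # e.g. "nucleolus"


def supported_term(ev):
    """Return the most specific term name the combined evidence supports.

    Assumption for this sketch: complex membership alone supports only the
    general term; the snoRNP term also needs a known lost rRNA modification
    and nucleolar localization.
    """
    if not ev.has_box_h_aca_snorna:
        return None
    if ev.lost_rrna_modification and ev.localization == "nucleolus":
        return "box H/ACA snoRNP complex"
    return "box H/ACA RNP complex"
```

The point of the sketch is that the complex-membership experiment by itself maps to the grouping term, and the more specific term only follows once the other annotations are combined with it.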

I'm less personally invested in the telomerase complex, but it seems that it is actually useful to have it coded that a major portion of the telomerase complex is structurally identical to the box H/ACA complexes so that it can help users realize that there is a connection there.

ValWood commented 6 years ago

but they aren't all GO:0005732 small nucleolar ribonucleoprotein complex or GO:1990904 ribonucleoprotein complex

Antonialock commented 2 years ago

is there any new thinking on this issue? I'd like to resolve an annotation dispute.

krchristie commented 2 years ago

Hi @Antonialock - no, sorry, this has completely slipped off my radar. I'll try to get back to it soon, though after spending extra time focusing on ontology development in preparation for the GO meeting, I owe some time back to my annotation responsibilities, so I won't get to this until November.

pgaudet commented 2 years ago

@krchristie Do you have time to look into this one, now that the GOC meeting has passed?

krchristie commented 2 years ago

@pgaudet - I will be out from November 15-26 due to a family event, knee surgery, and Thanksgiving. I will put this on my list to get back to in December.

edwong57 commented 1 year ago

There's a lot here, and I'm trying to catch up. The final outcome for this request is that these terms/changes are being requested?

  1. box C/D RNP complex (GO:new)
  2. box C/D methylation guide RNP complex (GO:new) [for Archaea, euk of unknown location]
  3. box C/D methylation guide scaRNP complex (GO:new)
  4. box C/D methylation guide snoRNP complex (GO:0031428)
  5. box C/D pre-rRNA cleavage RNP complex (GO:new) [appropriate for U3, U8, U13 RNPs]

The requested new term in #1 is already GO:0031428, but it has been suggested that GO:0031428 be renamed to 'box C/D methylation guide snoRNP complex' and a new term, 'box C/D RNP complex' be created.

To make sure I get the parentage correct -

  1. box C/D RNP complex (GO:new) - should be subclass of GO:0005732 (sno(s)RNA-containing ribonucleoprotein complex)
  2. box C/D methylation guide RNP complex (GO:new) [for Archaea, euk of unknown location] - will be subclass of (above new term)
  3. box C/D methylation guide scaRNP complex (GO:new) - will be subclass of (above new term)
  4. box C/D methylation guide snoRNP complex (GO:0031428) - will be subclass of (above new term)
  5. box C/D pre-rRNA cleavage RNP complex (GO:new) [appropriate for U3, U8, U13 RNPs] - will be subclass of (above new term)

Please let me know if this is correct and/or if I am missing any information.

edwong57 commented 1 year ago

@ValWood @krchristie @Antonialock, does this look correct to you? Should I go ahead with this?

ValWood commented 1 year ago

I will defer to @krchristie because the existing terms have been working fine for pombe.

Antonialock commented 1 year ago

I will also defer! I just wanted to resolve a lingering dispute in P2GO :-)

edwong57 commented 1 year ago

@krchristie, have you had a chance to look at this?

krchristie commented 1 year ago

@edwong57 - Sorry Edith, I've gotten busy with something else. I'll try to take a look this week.

pgaudet commented 1 year ago

@krchristie Is this still high priority? If so, please can you give us feedback?

edwong57 commented 1 year ago

@krchristie Does this look right to you?

krchristie commented 1 year ago

That all sounds right to me. Sorry so slow to manage to take a look.

edwong57 commented 7 months ago