geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

term label has changed making it narrower, does not fit existing annotations #24199

Closed ValWood closed 2 years ago

ValWood commented 2 years ago

Commenting on https://github.com/geneontology/go-ontology/issues/23260

I spotted that the term

"ncRNA 3' end processing" has had its label changed to "small regulatory ncRNA 3'-end processing".

This is incorrect for my usage here https://www.pombase.org/reference/PMID:34389684 and in other annotations where the ncRNAs are not "small"; some are regulatory long ncRNAs.

Can you change this back to the "ncRNA 3'-end processing" label?

ValWood commented 2 years ago

I think we could model this better, as alluded to in this ticket: https://github.com/geneontology/go-ontology/issues/24161

The alternative polyadenylation regulation pathways all seem to belong to a class of "cotranscriptional 3’ processing of RNA polymerase II transcripts" (i.e. the torpedo model).

https://www.researchgate.net/publication/8149761_Shortcuts_to_the_end/figures?lo=1

pgaudet commented 2 years ago

We had discussed removing 'small' from the non-coding RNA grouping terms so that we can group all regulatory ncRNAs, so I will fix this:

GO:0043628 'small regulatory ncRNA 3'-end processing' -> 'regulatory ncRNA 3'-end processing'
GO:0070918 'small regulatory ncRNA processing' -> 'regulatory ncRNA processing'

I think these are the last ones with this label.

Thanks, Pascale

pgaudet commented 2 years ago

@ValWood does that work for you?

There are other ncRNAs, such as tRNAs and snRNAs; were you also annotating these to the general 'ncRNA 3' end processing' term?

Thanks, Pascale

ValWood commented 2 years ago

No, but I think we are grouping different pathways here.

The pathway I'm using it for is always the RNA polymerase II coupled polyadenylation/termination pathway. I don't think that works on tRNAs or snRNAs, but I would need to double check that.

krchristie commented 2 years ago

> No, but I think we are grouping different pathways here.
>
> The pathway I'm using it for is always the RNA polymerase II coupled polyadenylation/termination pathway. I don't think that works on tRNAs or snRNAs, but I would need to double check that.

I'm not aware that RNAP II ever transcribes tRNAs.

However, snRNAs are a mix. In S. cerevisiae, if I remember correctly, RNAP II transcribes the snRNAs U1, U2, U4, & U5, while RNAP III transcribes U6. Which RNA polymerase transcribes each of these snRNAs may vary between species. Note that I know nothing about transcription of U1-atac, U2-atac, U4-atac, or U6-atac, since they are not present in S. cerevisiae.

It may also be worth mentioning that transcription of snoRNAs is definitely a mixture, with some done by RNAP II and others by RNAP III, and which RNAP does which definitely varies between species.

ValWood commented 2 years ago

> In S. cerevisiae, if I remember correctly, RNAP II transcribes the snRNAs U1, U2, U4, & U5, while RNAP III transcribes U6.

Refs for U6 transcription by pol II in S. cerevisiae are in here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5855946/ I wonder if this is a relatively recent thing evolutionarily and does not apply to other eukaryotes? In S. pombe, U6 is spliced, which indicates that in fission yeast this one must be pol II transcribed.

More generally, it seems that the other snRNAs are polyadenylated by pol II-associated pathways similar to those for mRNA and ncRNA (for termination/3' processing and surveillance): http://europepmc.org/article/MED/18951092

Presumably, tRNAs are polyadenylated by a completely different pathway (I can only see papers on tRNA polyadenylation for prokaryotes, or for human mitochondrial tRNAs)

@pgaudet this is why I think it makes more sense to still keep the polymerase groupings, and make these the primary organization, because many of the downstream pathways are polymerase specific (splicing and 3' end processing by polyadenylation for pol II and many processing events for pol I). This will make it easier to join modules later.

pgaudet commented 2 years ago

I thought the proposal was to make 'regulation of transcription' transcript-specific, while the non-regulatory processes (initiation, elongation, termination) and the processing would be polymerase-specific. Would that work?

ValWood commented 2 years ago

Ah right, I keep getting confused. This sounds OK. I'm finding it difficult to imagine all of the consequences, but this would probably be OK. But what happens with regulation of elongation, regulation of termination, etc.? Are they still polymerase specific? (These are the things that connect to the different pathways like polyadenylation, but I suspect here we are dealing with a lot of 'affects', not regulation.) There is definitely different signalling to CTD residues to regulate different termination sites.