The-Sequence-Ontology / SO-Ontologies

Collection of SO Ontologies
Creative Commons Attribution 4.0 International

miRNA precursor molecule [sf#87] #87

Closed srynobio closed 9 years ago

srynobio commented 9 years ago

Reported by aschroeder on 2008-02-15 17:57 UTC

Currently there exist miRNA and miRNA_primary_transcript terms. However, there doesn't appear to be a term to describe the precursor miRNA stem-loop molecule. This should probably be is_a ncRNA and have derives_from miRNA_primary_transcript.

And then miRNA derives from miRNA_precursor or whatever name is chosen.

I include a bit cribbed from the online worm book that describes the biology and a bit about suggested nomenclature and annotation rules (from a paper linked from the miRNA registry site).

Biogenesis from online Wormbook

"Generally, miRNAs are transcribed, some by RNA polymerase II (Bartel, 2004; Lee et al., 2004a), as longer poly-adenylylated primary miRNAs (pri-miRNAs) molecules of about 1 kilobase or greater (Bracht et al., 2004; Lee et al., 2002). The pri-miRNA is further processed by the RNase III endonuclease Drosha (drsh-1 in C. elegans; Bracht et al., 2004; Lee et al., 2003) to the precursor miRNA (pre-miRNA), a 60-70 nt molecule that can fold back on itself and form a hairpin loop structure. The pre-miRNA is then transported out of the nucleus into the cytoplasm by Exportin-5 (Yi et al., 2003). In the cytoplasm another RNase III enzyme, Dicer (dcr-1 in C. elegans), processes the pre-miRNA to the mature 20-25 nt miRNA (Bernstein et al., 2001; Grishok et al., 2001). Mature miRNAs bind to imperfectly complementary sequences in the 3' untranslated regions (UTRs) of target mRNAs to negatively regulate target gene expression. miRNAs are thus believed to function as guides to recruit a silencing complex to target mRNAs, but the exact mechanism used by miRNAs to down-regulate gene expression is unknown."

Nomenclature/Annotation from Ambros et al. (2003) RNA 9:277

"MicroRNAs come from endogenous transcripts that can form local hairpin structures, which ordinarily are processed such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Sometimes the primary transcript contains multiple hairpins, and different hairpins give rise to a different miRNA. These are considered polycistronic miRNA transcripts, and each hairpin is given a unique gene name."

"...it is recommended that each hairpin structure be considered a single gene, and all mature miRNAs from that hairpin should be annotated as alternatively processed gene products."

"microRNAs are named using the "miR" prefix and a unique identifying number (eg., miR-1 ...."

"...with identical miRNAs having the same number regardless of organism. Nearly identical orthologs can also be given the same number, at the discretion of the researcher."

"Identical or very similar miRNA sequences within a species can also be given the same number, with their genes distinguished by letter and/or numeral suffixes, according to the convention of the organism (e.g., the ~22-nt transcripts of Drosophila mir-13a and mir-13b are slightly different in sequence, whereas those of mir-6-1 and mir-6-2 are identical; Lagos-Quintana et al. 2001)."
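As a rough illustration of the naming convention quoted above, the following Python sketch splits a name such as mir-13a or mir-6-2 into its identifying number and suffixes. The regular expression, the `MirnaName` type and the `parse_mirna_name` function are assumptions made for this example only; they are not taken from Ambros et al. or from any miRNA registry tooling, and real names (for example with species prefixes) would need a richer pattern.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: "miR"/"mir" prefix, an identifying number, an optional
# letter suffix (as in mir-13a) and an optional numeral suffix (as in mir-6-1).
_NAME = re.compile(
    r"^(?P<prefix>mi[rR])-(?P<number>\d+)(?P<letter>[a-z])?(?:-(?P<copy>\d+))?$"
)


class MirnaName(NamedTuple):
    prefix: str
    number: int
    letter: Optional[str]
    copy: Optional[int]


def parse_mirna_name(name: str) -> MirnaName:
    """Split a miRNA name into prefix, number, letter suffix and numeral suffix."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised miRNA name: {name!r}")
    copy = match.group("copy")
    return MirnaName(
        prefix=match.group("prefix"),
        number=int(match.group("number")),
        letter=match.group("letter"),
        copy=int(copy) if copy is not None else None,
    )
```

With this sketch, mir-13a and mir-13b share the number 13 and differ only in the letter suffix, while mir-6-1 and mir-6-2 share the number 6 and differ only in the numeral suffix.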

srynobio commented 9 years ago

Commented by batchelorc on 2008-02-18 16:46 UTC

Hello,

So what we need to do here is: create new term (something like)

name: precursor_miRNA
def: "An intermediate transcript that is created by processing of a primary miRNA by the RNase III endonuclease Drosha. This pre-miRNA is processed into the mature miRNA by Dicer."
synonym: "pre-miRNA" EXACT []

Then we need to update the definition of miRNA to reflect that it comes from a precursor miRNA, not the miRNA primary transcript.

Finally, miRNA_primary_transcript (SO:0000647) needs: synonym: "pri-miRNA" EXACT []

Not sure about the new term's parentage. There's a case for making it a child of ncRNA, because it is a processed transcript, but I think I'd prefer to create a new term,

name: intermediate_transcript
def: "A transcript which has undergone some but not all of the necessary modifications to be functional."
intersection_of: SO:0000673 ! transcript
intersection_of: has_quality SO:0000933 ! intermediate

which would resolve the problem of SO:0000933 having no children.
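To make the shape of the proposed stanza concrete, here is a minimal Python sketch that reads tag-value lines like the ones above into a dictionary. It is a simplification for illustration, not a conforming OBO parser: the `parse_stanza` helper is hypothetical, it ignores escaping and dbxref lists, and it only strips a trailing `! comment` from unquoted values.

```python
from collections import defaultdict

# The proposed term, one tag-value pair per line, as given in the comment above.
STANZA = """\
name: intermediate_transcript
def: "A transcript which has undergone some but not all of the necessary modifications to be functional."
intersection_of: SO:0000673 ! transcript
intersection_of: has_quality SO:0000933 ! intermediate
"""


def parse_stanza(text):
    """Collect tag -> list of values from simplified OBO-style lines."""
    tags = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " so that IDs such as SO:0000673 stay intact.
        tag, _, value = line.partition(": ")
        # Drop a trailing "! comment" unless the value is a quoted string.
        if not value.startswith('"'):
            value = value.split(" ! ")[0]
        tags[tag].append(value.strip())
    return dict(tags)


term = parse_stanza(STANZA)
```

Here `term["intersection_of"]` holds the two conditions of the cross-product definition: the genus SO:0000673 (transcript) and the differentia `has_quality SO:0000933` (intermediate).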

The derives_from relations can go in as you suggest.

I know Karen's looking at the derives_from relations in general, so I'll wait for Karen to comment.

best wishes, Colin.

srynobio commented 9 years ago

Commented by eilbeck on 2008-02-25 21:00 UTC

Hi, sorry for taking so long to reply to this one.

Firstly thanks Andy for the quote "...it is recommended that each hairpin structure be considered a single gene, and all mature miRNAs from that hairpin should be annotated as alternatively processed gene products." I was in a seminar last week when the genomic structure of a multiple miRNA was shown and it sent me into a mild panic about how we were supposed to represent such a thing.

I have looked at: http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=2077053&blobtype=pdf They describe the parts of the primary transcript of intergenic miRNA. They have a cap, poly A tail, TATA box motifs etc. (This is a bit at odds with our current definitions of primary and processed.) Interestingly, they say that the pri-miRNA 'includes' the precursor miRNA hairpin. I like this way of thinking about it: rather than derives_from, use part_of. This would be consistent with the way we have handled other parts in the past, like introns.

precursor_miRNA part_of primary_miRNA

As far as I can tell, we could go further than this, since all of the parts are linear, and label the miRNA_loop, the miRNA_stem and the mature_miRNA:

miRNA_loop part_of precursor_miRNA (def - the loop that is cleaved by Dicer...)
miRNA_stem part_of precursor_miRNA (def - the other bit...)
mature_miRNA part_of precursor_miRNA - synonym miRNA
mature_miRNA is_a nc_processed_transcript

By this logic, I think that the primary_miRNA also has regions that are removed by Drosha. Do we need to name these regions too?

drosha_removed_region part_of primary_miRNA???

But these two terms are definitely useful:

drosha_site part_of primary_miRNA
dicer_site part_of precursor_miRNA

Since the mature_miRNA is totally contained in the primary_miRNA and nothing is added, we can use the part_of relation instead of derives_from. (The derives_from relation is going to be deprecated anyway.)
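The containment argument can be sketched in Python as follows. The edges are the part_of statements proposed in this comment; the `PART_OF` mapping and the `all_wholes` helper are hypothetical names for this example, and the sketch assumes part_of is treated as transitive, so that mature_miRNA part_of precursor_miRNA part_of primary_miRNA implies mature_miRNA part_of primary_miRNA.

```python
# Proposed part_of edges (part -> set of wholes), taken from the comment above.
PART_OF = {
    "precursor_miRNA": {"primary_miRNA"},
    "miRNA_loop": {"precursor_miRNA"},
    "miRNA_stem": {"precursor_miRNA"},
    "mature_miRNA": {"precursor_miRNA"},
    "drosha_site": {"primary_miRNA"},
    "dicer_site": {"precursor_miRNA"},
}


def all_wholes(term, edges=PART_OF):
    """Return every term that `term` is part_of, directly or transitively."""
    seen = set()
    stack = list(edges.get(term, ()))
    while stack:
        whole = stack.pop()
        if whole not in seen:
            seen.add(whole)
            # Follow the chain upwards: a part of a part is also a part.
            stack.extend(edges.get(whole, ()))
    return seen
```

Because each part maps to a set of wholes, the same structure also accommodates a term with more than one part_of parent.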

What do you think? Does any of this work?

--K

srynobio commented 9 years ago

Commented by aschroeder on 2008-02-26 14:25 UTC

This looks fine to me, although I am not sure I understand Colin's last comment. Also, I just want to make sure that there won't be any problem with the sorts of miRNAs that are generated from processed introns of protein-coding gene transcripts. The term mirtron already exists in SO, and mature miRNAs are produced from these mirtrons, so as long as there are no issues with having multiple part_of parents it's likely fine.

cheers, Andy

srynobio commented 9 years ago

Commented by eilbeck on 2008-03-14 19:01 UTC

OK, I have added the following terms:

pre_miRNA SO:0001244 The 60-70 nucleotide region remaining after Drosha processing of the primary transcript, that folds back upon itself to form a hairpin structure. It is part_of the miRNA_primary_transcript.

miRNA_loop The loop of the hairpin loop formed by folding of the pre-miRNA. part_of the pre_miRNA

miRNA_stem The stem of the hairpin loop formed by folding of the pre-miRNA. part_of the pre_miRNA
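For readers who want to see how these three terms might look as stanzas, here is a hedged Python sketch that renders them in an OBO-like layout. The `Term` class and its `to_stanza` method are hypothetical. Only the ID of pre_miRNA (SO:0001244) is given in this thread, so the IDs of miRNA_loop and miRNA_stem are deliberately left out rather than guessed, and the output is not a complete, valid OBO file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    name: str
    definition: str
    part_of_id: str      # ID of the whole this term is part_of
    part_of_name: str    # its name, used only as a trailing comment
    so_id: Optional[str] = None  # unknown unless stated in the thread

    def to_stanza(self) -> str:
        """Render the term as an OBO-like stanza (simplified, illustrative)."""
        lines = ["[Term]"]
        if self.so_id is not None:
            lines.append(f"id: {self.so_id}")
        lines.append(f"name: {self.name}")
        lines.append(f'def: "{self.definition}" []')
        lines.append(f"relationship: part_of {self.part_of_id} ! {self.part_of_name}")
        return "\n".join(lines)


TERMS = [
    Term(
        name="pre_miRNA",
        definition="The 60-70 nucleotide region remaining after Drosha processing "
        "of the primary transcript, that folds back upon itself to form a hairpin structure.",
        part_of_id="SO:0000647",
        part_of_name="miRNA_primary_transcript",
        so_id="SO:0001244",
    ),
    Term(
        name="miRNA_loop",
        definition="The loop of the hairpin loop formed by folding of the pre-miRNA.",
        part_of_id="SO:0001244",
        part_of_name="pre_miRNA",
    ),
    Term(
        name="miRNA_stem",
        definition="The stem of the hairpin loop formed by folding of the pre-miRNA.",
        part_of_id="SO:0001244",
        part_of_name="pre_miRNA",
    ),
]

stanzas = "\n\n".join(term.to_stanza() for term in TERMS)
```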

We are still working out the relationships between transcripts in general, but the terms exist now and should not change.

--Karen

srynobio commented 9 years ago

Updated by eilbeck on 2008-03-14 19:01 UTC