The-Sequence-Ontology / SO-Ontologies

Collection of SO Ontologies
Creative Commons Attribution 4.0 International

Improve def of gRNA_gene (SO:0001264) #557

Open sjm41 opened 2 years ago

sjm41 commented 2 years ago

What is the SO term name and accession? gRNA_gene (SO:0001264)

Describe what you would like to change. Current def is "A noncoding RNA that guides the insertion or deletion of uridine residues in mitochondrial mRNAs. This may also refer to synthetic RNAs used to guide DNA editing using the CRIPSR/Cas9 system."

  1. Need to change start of this definition to say "A gene that encodes a..."

  2. Also, is this term meant to be the 'gene' counterpart of the ncRNA term 'guide_RNA (SO:0000602)' (which has the synonym 'gRNA')? The definition of that term seems rather different: "A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA."

'guide_RNA' is an INSDC term, so maybe @murphyte has insights?

murphyte commented 2 years ago

I don't know this biology very well, and I have been flummoxed by these terms.

For RefSeq annotations, we've traditionally mapped Cajal body-specific RNAs (http://rfam.xfam.org/family/RF00283) to INSDC 'guide_RNA', which in turn is mapped to SO guide_RNA (SO:0000602). INSDC previously had no separate term for scaRNA, but now it does, so we should probably switch to using that.

Looking at INSDC submitted records, it looks like INSDC guide_RNA is used for CRISPR-related features (e.g. MH683611.1) and mitochondrion-related guide RNAs (FJ416603.1).

There's also SO sgRNA (SO:0001998), which is specific for CRISPR oligos.

So maybe the intent is that SO guide_RNA is for naturally occurring examples, SO sgRNA is for CRISPR, and INSDC guide_RNA combines both? By that logic, if CRISPR sgRNAs never really have genes, would it be better to name SO:0001264 guide_RNA_gene?

Proposed Actions:

  1. add INSDC mappings to SO scaRNA (SO:0002095): INSDC_qualifier:sca_RNA, INSDC_feature:ncRNA
  2. rename SO:0001264 as guide_RNA_gene, with gRNA_gene as a synonym, to better match the RNA term
  3. rewrite the definition of SO:0001264 as proposed so it's a gene term, and also change the beginning of the second sentence to say "sgRNAs are a type of synthetic guide_RNA used to..." to provide some linkage to the oligo term but make it clear that it's separate.
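
Action 2 above (rename the term while keeping the old name as a synonym) could be sketched roughly as follows. This is only an illustration: the dict layout and the `rename_term` helper are hypothetical and not part of any SO tooling; only the accession and the two names come from this thread.

```python
from copy import deepcopy

# Hypothetical in-memory stand-in for an ontology term (not a real SO/OBO API).
term = {
    "id": "SO:0001264",
    "name": "gRNA_gene",
    "synonyms": [],
}


def rename_term(term, new_name):
    """Return a copy of `term` renamed to `new_name`, keeping the old name as a synonym."""
    updated = deepcopy(term)
    old_name = updated["name"]
    if old_name == new_name:
        return updated  # nothing to do
    updated["name"] = new_name
    # The new primary name should not also be listed as a synonym.
    updated["synonyms"] = [s for s in updated["synonyms"] if s != new_name]
    # Preserve the old name so existing references can still be resolved.
    if old_name not in updated["synonyms"]:
        updated["synonyms"].append(old_name)
    return updated


renamed = rename_term(term, "guide_RNA_gene")
print(renamed)
```

The accession stays fixed and only the label changes, so anything keyed on SO:0001264 would be unaffected by the rename.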

sjm41 commented 2 years ago

Some relevant feedback from Zasha Weinberg: "guide_RNA is maybe an overarching term. I think the specific term 'guide RNA' in the ontology spreadsheet refers to RNAs in the mitochondria of, I think, trypanosomes that are antisense to mRNA and direct nucleotide insertions, deletions or changes. snoRNAs could also be considered guide RNAs that direct pseudouridylation/methylation, and one could make a case for CRISPR RNAs or miRNAs. If guide_RNA is kept as a top-level term, maybe its name could be altered to make it more specific, e.g., RNA_editing_guideRNA."

egchristensen commented 1 year ago

@keilbeck Do you have any objections to @murphyte's list of proposed actions?

keilbeck commented 1 year ago

I like this: RNA_editing_guide_RNA

Agree with the rest of @murphyte's sensible suggestions.

sjm41 commented 1 year ago

Hi @keilbeck, just to clarify, are you suggesting guide_RNA (SO:0000602) -> RNA_editing_guide_RNA and gRNA_gene (SO:0001264) -> RNA_editing_guide_RNA_gene?