OmniSearch / ncro

Non-Coding RNA Ontology
Creative Commons Attribution 4.0 International

Undefined domain-specific relations #24

Open alanruttenberg opened 8 years ago

alanruttenberg commented 8 years ago

The following relations do not have textual definitions. Some seem obvious by their labels, but I never trust that. I'll be trying to define these based on the paper and citation tracking, but contributions are welcome.

The one I was particularly unclear about was is_about_grouped_miRNA. Tracking this down.

hsa-mir-16-1

community annotation: (taken from wikipedia)

The miR-16 microRNA precursor family is a group of related small non-coding RNA genes that regulates gene expression. miR-16, miR-15, mir-195 and miR-457 are related microRNA precursor sequences from the mir-15 gene family.

This is documentation for a family. I presume the family is the one linked to by the "Gene Family" entry on that page.

Wikipedia defines gene family as follows:

A gene family is a set of several similar genes, formed by duplication of a single original gene, and generally with similar biochemical functions.

I speculate that "grouped" miRNA means that the group is formed by membership in a gene family, with the source being mirBase.

Please confirm.

If so, then it makes sense that the family would be a class in the same sense as PRO's "Family level distinction".

  • is_classified_into_gene_family_group
  • is_gene_template_of_mRNA
  • is_predicted_target_of
  • mirRNA_expressed_in_tissue
  • is_model_of_disease
  • is_about_mirRNA
  • is_about_mirRNA_target_gene
  • is_about_grouped_miRNA
  • has_predicted_target

Huang-OMIT commented 8 years ago

Asiyah: Could you confirm the point raised by Alan? Thanks!

Huang-OMIT commented 8 years ago

Alan: If I am not mistaken, "is_classified_into_gene_family_group" was already replaced by "is_classified_into_gene_family" because the class "gene_family_group" was changed to "gene_family".

Huang-OMIT commented 8 years ago

Checked just now.

The class was changed: http://www.ontobee.org/ontology/NCRO?iri=http://purl.obolibrary.org/obo/NCRO_0001667

But the relation has not yet been changed. Need to fix.

linikujp commented 8 years ago

A miRNA family is grouped by some algorithm, such as gene sequence alignment or something else. As with a protein family, the members of a miRNA family are miRNAs that are considered 'homologous in structure and function'.

I am not sure about the definition of gene family here. If you consider a miRNA to be a kind of gene, then Alan's statement may be correct: 'the group is formed by membership in a gene family, with the source being mirBase.'

alanruttenberg commented 8 years ago

Where's the quote from? As I understand it, being homologous is a weaker constraint than having the same function or structure. Homology traces evolution, but during evolution structure and function can diverge. For example, in this article they say:

Technological improvements have resulted in increased discovery of new microRNAs (miRNAs) and refinement and enrichment of existing miRNA families. miRNA families are important because they suggest a common sequence or structure configuration in sets of genes that hint to a shared function

That would suggest that gene families, as understood, are related by homology (orthologs or paralogs).

I would then understand the miRNA families to be the transcription products of those genes (which may undergo change in time).

This understanding would mean that both 'pre-miRNA' (direct modification) and the mature miRNA, which are post-transcriptionally modified, would be members of the family. This isn't the case in NCRO at the moment.

If the current NCRO follows the standard nomenclature, then most of the miRNAs in NCRO are 'precursor' miRNAs, with only three types of mature miRNA: hsa-miR-125b-1-3p, hsa-miR-125b-2-3p, and hsa-miR-125b-5p. The immature versions hsa-mir-125b-1 and hsa-mir-125b-2 are also in NCRO, in the family mir-10. It seems the families defined by MIRBASE only contain precursor miRNA. By my definition (and following PRO's convention), the mature forms would also be members.
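
To make the precursor/mature distinction concrete, here is a minimal sketch that classifies a name by its casing. It assumes a simplified reading of the nomenclature used in this thread (lowercase "mir" for a precursor, capitalized "miR" for a mature product, an optional -3p/-5p arm suffix). The function names `classify` and `arm` are hypothetical and are not part of NCRO or MIRBASE.

```python
import re

# Assumption: names look like <3-letter species>-<mir|miR>-<rest>, e.g. hsa-mir-125b-1.
# This is a simplification for illustration, not a full parser of the nomenclature.
_NAME = re.compile(r"^(?P<species>[a-z]{3})-(?P<kind>mir|miR)-(?P<rest>.+)$")


def classify(name):
    """Return 'precursor', 'mature', or 'unknown' based on the mir/miR casing."""
    match = _NAME.match(name)
    if match is None:
        return "unknown"
    return "precursor" if match.group("kind") == "mir" else "mature"


def arm(name):
    """Return '3p' or '5p' for a mature name with an arm suffix, else None."""
    if classify(name) != "mature":
        return None
    suffix = name.rsplit("-", 1)[-1]
    return suffix if suffix in ("3p", "5p") else None


if __name__ == "__main__":
    for example in ("hsa-mir-125b-1", "hsa-miR-125b-1-3p", "hsa-miR-125b-5p"):
        print(example, classify(example), arm(example))
```

A family label such as mir-10 has no species prefix, so this sketch reports it as "unknown" rather than guessing.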

So we would have

 mir-10 miRNA [4]
   hsa-mir-125a miRNA [1]       (all miRNA gene products of the gene MIR-125a)
     hsa-mir-125a [4]           (precursor)
     hsa-miR-125a-3p [3]        (mature)
     hsa-miR-125a-5p [3]        (mature)
   hsa-mir-125b miRNA [1]       (all miRNA gene products of the gene MIR-125b and duplicate genes)
     hsa-miR-125b-5p [2] [4]    (mature)
     hsa-mir-125b-1 [4]         (precursor, 1st copy of gene)
     hsa-miR-125b-1-3p [4]      (mature)
     hsa-mir-125b-2 [4]         (precursor, 2nd copy of gene)
     hsa-miR-125b-2-3p [4]      (mature)

Notes

  1. Proposed to add to NCRO
  2. I don't understand why there's not a -1 or a -2. Maybe both -1 and -2 5' arm end sequences are the same?
  3. In MIRBASE but not in NCRO
  4. in NCRO
    • precursor means it could be pri or pre-miRNA
    • hsa-mir-125a miRNA are all products of one gene (including precursor and mature)
    • hsa-mir-125b miRNA are all products of a different, but homologous gene, with more than one exact copy (including precursor and mature)
    • hsa-mir-125b-1,2 are derived from two exactly duplicated genes
    • hsa-miR-125b-[1,2]-[3|5]p,hsa-miR-125a-[3|5]p and hsa-miR-125b-5p are mature products
    • I could have created hsa-mir-125b-1 miRNA and hsa-mir-125b-2 miRNA, analogous to hsa-mir-125a miRNA, but I chose not to here. Could be changed.

That was confusing to organize. Please check it in detail.
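
To help with checking, the proposed hierarchy can be written down as data and queried. This is only a sketch: the nested-dict encoding and the helper names (`find`, `leaves`, `family_members`) are my own illustration, and the tree is the proposal above, not the current state of NCRO.

```python
# Proposed hierarchy from this comment, encoded as parent -> children.
# Assumption: an empty dict marks a class with no proposed subclasses.
HIERARCHY = {
    "mir-10 miRNA": {
        "hsa-mir-125a miRNA": {
            "hsa-mir-125a": {},
            "hsa-miR-125a-3p": {},
            "hsa-miR-125a-5p": {},
        },
        "hsa-mir-125b miRNA": {
            "hsa-miR-125b-5p": {},
            "hsa-mir-125b-1": {},
            "hsa-miR-125b-1-3p": {},
            "hsa-mir-125b-2": {},
            "hsa-miR-125b-2-3p": {},
        },
    }
}


def find(tree, name):
    """Return the subtree rooted at `name`, or None if `name` is absent."""
    for key, children in tree.items():
        if key == name:
            return children
        found = find(children, name)
        if found is not None:
            return found
    return None


def leaves(tree):
    """Collect all classes in `tree` that have no children."""
    result = []
    for key, children in tree.items():
        if children:
            result.extend(leaves(children))
        else:
            result.append(key)
    return result


def family_members(name):
    """List the precursor and mature forms grouped under `name`."""
    subtree = find(HIERARCHY, name)
    return [] if subtree is None else leaves(subtree)


if __name__ == "__main__":
    for member in family_members("mir-10 miRNA"):
        print(member)
```

Under this encoding the family mir-10 contains both the precursor and the mature forms, which is the point of the proposal.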

alanruttenberg commented 8 years ago

Also see #9

linikujp commented 8 years ago

Regarding what a miRNA family is: I am not an expert in miRNA per se. Based on my own knowledge, I think the consensus understanding among biologists is that a protein family or gene family is based on homologous sequence or function. It would be better to confirm with a biologist or bioinformatician who works closely with miRNA sequence analysis.