OmniSearch / ncro

Non-Coding RNA Ontology
Creative Commons Attribution 4.0 International

Possible errors in current NCRO #27

alanruttenberg closed this issue 8 years ago

alanruttenberg commented 8 years ago

The list below was created by querying NCRO for subclasses of miRNA. For each term that had a dbxref, I looked up the current label(s) for that accession in miRBase version 21. Wherever there was a mismatch, I printed the miRBase accession, the current labels in miRBase, the current NCRO label, and the accession that miRBase gives for the NCRO label.

I'm not sure what the curation process was earlier, but I suspect, from the look of them, that they are typos, or perhaps in some cases splits (a single entry in miRBase is split into 2).

Before I correct these, I wanted a sanity check from someone.

The function is check-ncro-mirna-against-mirbase in mirbase.lisp

Accession   miRBase labels              NCRO label           Accession from NCRO label
MI0023561   hsa-mir-3690-2              hsa-mir-369          MI0000777
MI0015980   hsa-mir-1302-11             hsa-mir-1302-1       MI0006362
MI0022217   hsa-mir-6505                hsa-mir-650          MI0003665
MI0021274   hsa-mir-6129                hsa-mir-612          MI0003625
MI0025513   hsa-mir-6516                hsa-mir-651          MI0003666
MI0019304   hsa-mir-5697                hsa-mir-569          MI0003576
MI0019140   hsa-mir-5583-2              hsa-mir-558          MI0003564
MI0003592   hsa-mir-585                 non_mammalian_miRNA  NIL
MI0029321   hsa-mir-548bb               hsa-mir-548b         MI0003596
MI0019114   hsa-mir-4524b               hsa-mir-452          MI0001733
MI0019593   hsa-mir-5701-2              hsa-mir-570          MI0003577
MI0017297   hsa-mir-4667                hsa-mir-466          MI0014157
MI0016833   hsa-mir-548x-2              hsa-mir-548x         MI0014244
MI0022548   hsa-mir-6715a               hsa-mir-671          MI0003760
MI0019293   hsa-mir-5681b               hsa-mir-568          MI0003574
MI0005761   hsa-mir-939                 hsa-mir-93           MI0000095
MI0006318   hsa-mir-1228                hsa-mir-122          MI0000442
MI0015977   hsa-mir-1972-2              hsa-mir-197          MI0000239
MI0017320   hsa-mir-1343                hsa-mir-134          MI0000474
MI0014197   hsa-mir-1260b               hsa-mir-126          MI0000471
MI0008195   hsa-mir-1827                hsa-mir-182          MI0000272
MI0016849   hsa-mir-4488                hsa-mir-448          MI0001637
MI0031514   hsa-mir-3670-4              hsa-mir-367          MI0000775
MI0015983   hsa-mir-4315-2              hsa-mir-431          MI0001721
MI0016008   hsa-mir-3618                hsa-mir-361          MI0000760
MI0016014   hsa-mir-3622b               hsa-mir-362          MI0000762
MI0008336   hsa-mir-1915                hsa-mir-191          MI0000465
MI0023563   hsa-mir-6089-2              hsa-mir-608          MI0003621
MI0010633   hsa-mir-2114                hsa-mir-211          MI0000287
MI0017299   hsa-mir-2964a;hsa-mir-219b  hsa-mir-21           MI0000077
MI0011285   hsa-mir-2278                hsa-mir-22           MI0000078
MI0013006   hsa-mir-2861                hsa-mir-28           MI0000086
MI0021279   hsa-mir-6134                hsa-mir-613          MI0003626
MI0024976   hsa-mir-7641-2              hsa-mir-764          MI0003944
MI0031510   hsa-mir-3179-4              hsa-mir-31           MI0000089
MI0006657   hsa-mir-1324                hsa-mir-132          MI0000449
MI0014253   hsa-mir-3202-2              hsa-mir-32           MI0000090
MI0018003   hsa-mir-1273g               hsa-mir-127          MI0000472
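
For anyone wanting to reproduce the check outside Lisp, the comparison described above can be sketched in Python roughly as follows. This is not the code in mirbase.lisp; the function name `find_label_mismatches` and both input shapes are assumptions made for illustration. The sample rows are taken from the table, except the last NCRO term, which is a hypothetical consistent entry included only to show that matching terms are not reported.

```python
def find_label_mismatches(ncro_terms, mirbase_labels):
    """Return (accession, miRBase labels, NCRO label, accession for NCRO label) rows.

    ncro_terms:     iterable of (NCRO label, dbxref accession) pairs (assumed shape)
    mirbase_labels: dict mapping accession -> list of miRBase labels (assumed shape)
    """
    # Reverse index: which accession does miRBase assign to a given label?
    accession_by_label = {
        label: accession
        for accession, labels in mirbase_labels.items()
        for label in labels
    }
    mismatches = []
    for ncro_label, accession in ncro_terms:
        labels = mirbase_labels.get(accession, [])
        if ncro_label not in labels:
            mismatches.append((
                accession,
                ";".join(labels),
                ncro_label,
                accession_by_label.get(ncro_label),  # None plays the role of NIL
            ))
    return mismatches


# Sample data copied from the table above.
mirbase_labels = {
    "MI0023561": ["hsa-mir-3690-2"],
    "MI0000777": ["hsa-mir-369"],
    "MI0017299": ["hsa-mir-2964a", "hsa-mir-219b"],
    "MI0000077": ["hsa-mir-21"],
    "MI0000095": ["hsa-mir-93"],
}
ncro_terms = [
    ("hsa-mir-369", "MI0023561"),
    ("hsa-mir-21", "MI0017299"),
    ("hsa-mir-93", "MI0000095"),  # hypothetical consistent term, not from the table
]

for row in find_label_mismatches(ncro_terms, mirbase_labels):
    print("\t".join(str(field) for field in row))
```

The reverse index is what fills the last column: it shows which miRBase entry the NCRO label actually names, which makes a truncated label easy to spot.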

harrisonstrachan commented 8 years ago

I used a C# program to extract the accession and miRNA name from both ncro.owl and hsa.gff3, then find the miRNAs with different accession numbers in the two files. I got similar results (excluding non_mammalian_miRNA):

Accession        miRBase labels      NCRO label
MI0005761       hsa-mir-939         hsa-mir-93
MI0006318       hsa-mir-1228        hsa-mir-122
MI0006657       hsa-mir-1324        hsa-mir-132
MI0008195       hsa-mir-1827        hsa-mir-182
MI0008336       hsa-mir-1915        hsa-mir-191
MI0010633       hsa-mir-2114        hsa-mir-211
MI0011285       hsa-mir-2278        hsa-mir-22
MI0013006       hsa-mir-2861        hsa-mir-28
MI0014197       hsa-mir-1260b       hsa-mir-126
MI0014253       hsa-mir-3202-2      hsa-mir-32
MI0015977       hsa-mir-1972-2      hsa-mir-197
MI0015980       hsa-mir-1302-11     hsa-mir-1302-1
MI0015983       hsa-mir-4315-2      hsa-mir-431
MI0016008       hsa-mir-3618        hsa-mir-361
MI0016014       hsa-mir-3622b       hsa-mir-362
MI0016833       hsa-mir-548x-2      hsa-mir-548x
MI0016849       hsa-mir-4488        hsa-mir-448
MI0017297       hsa-mir-4667        hsa-mir-466
MI0017299       hsa-mir-219b        hsa-mir-21
MI0017320       hsa-mir-1343        hsa-mir-134
MI0018003       hsa-mir-1273g       hsa-mir-127
MI0019114       hsa-mir-4524b       hsa-mir-452
MI0019140       hsa-mir-5583-2      hsa-mir-558
MI0019293       hsa-mir-5681b       hsa-mir-568
MI0019304       hsa-mir-5697        hsa-mir-569
MI0019593       hsa-mir-5701-2      hsa-mir-570
MI0021274       hsa-mir-6129        hsa-mir-612
MI0021279       hsa-mir-6134        hsa-mir-613
MI0022217       hsa-mir-6505        hsa-mir-650
MI0022548       hsa-mir-6715a       hsa-mir-671
MI0023561       hsa-mir-3690-2      hsa-mir-369
MI0023563       hsa-mir-6089-2      hsa-mir-608
MI0024976       hsa-mir-7641-2      hsa-mir-764
MI0025513       hsa-mir-6516        hsa-mir-651
MI0029321       hsa-mir-548bb       hsa-mir-548b
MI0031510       hsa-mir-3179-4      hsa-mir-31
MI0031514       hsa-mir-3670-4      hsa-mir-367
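
The GFF3 side of this comparison could look roughly like the Python sketch below. It is not the C# program; it assumes that hairpin entries in hsa.gff3 are `miRNA_primary_transcript` features whose ninth column carries `Alias=<accession>` and `Name=<label>` attributes, and that the NCRO accession-to-label pairs have already been extracted from ncro.owl into a dict. The function names are hypothetical, and the chromosome and coordinates in the sample lines are placeholders, not real positions.

```python
def parse_gff3_hairpins(gff3_text):
    """Map accession -> name for miRNA_primary_transcript features."""
    hairpins = {}
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "miRNA_primary_transcript":
            continue  # only hairpin entries are compared here
        # Attribute column: semicolon-separated key=value pairs.
        attrs = dict(
            pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
        )
        accession = attrs.get("Alias") or attrs.get("ID")
        name = attrs.get("Name")
        if accession and name:
            hairpins[accession] = name
    return hairpins


def diff_labels(ncro, mirbase):
    """List (accession, miRBase label, NCRO label) where the labels disagree."""
    return sorted(
        (accession, mirbase[accession], label)
        for accession, label in ncro.items()
        if accession in mirbase and mirbase[accession] != label
    )


# Placeholder chromosome and coordinates; accessions and names are from the table.
sample_gff3 = "\n".join([
    "##gff-version 3",
    "chr1\t.\tmiRNA_primary_transcript\t1\t100\t.\t+\t.\t"
    "ID=MI0005761;Alias=MI0005761;Name=hsa-mir-939",
    "chr1\t.\tmiRNA_primary_transcript\t201\t300\t.\t+\t.\t"
    "ID=MI0006318;Alias=MI0006318;Name=hsa-mir-1228",
])
ncro = {"MI0005761": "hsa-mir-93", "MI0006318": "hsa-mir-122"}

for row in diff_labels(ncro, parse_gff3_hairpins(sample_gff3)):
    print("\t".join(row))
```

Keying both sides by accession keeps the comparison independent of label spelling, so a truncated NCRO label shows up as a disagreement rather than being silently matched to a different miRNA.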

alanruttenberg commented 8 years ago

Thanks! OK, so we will consider these errors. I'm a good way through parsing the flat file for miRBase. I think I will indeed regenerate the bulk of the terms from that, and include some additional information. Perhaps once that is done you can do some testing and validation, Harrison.