dictyBase / Migration

Entrypoint for dictybase overhaul project

RNA annotations / ENA /RNA Central #35

Open pfey03 opened 9 years ago

pfey03 commented 9 years ago

Dicty RNAs are in ENA and thus in RNA Central. Protein2GO now recognizes RNA Central IDs, so we can annotate RNAs there. However, some mapping and taxonomy issues need to be worked out first.

ENA has Dicty gene IDs, but this is not obvious from their website. One can search for a gene ID, though, e.g. DDB_G0295491, and find: http://www.ebi.ac.uk/ena/data/view/Non-coding:AAFI02000003.1:1292629..1292834:ncRNA

ENA then links to RNA Central: http://rnacentral.org/rna/URS00003FF759

1) The RNA Central ID links through ENA to our gene ID. The ENA ID itself is not clear: they use the GenBank ID, the chromosomal location, and the molecule class to describe the entry. However, single genes are best found by our gene ID. We need to ask them about a clear mapping to our gene IDs.

2) ENA has RNA annotations to two taxa: the root D. discoideum taxon, which we always annotate to (44689), and the D. discoideum AX4 taxon (352472), which often has the better annotations (as that GenBank file is more comprehensive). Note that the link above is to taxon 352472. However, RNA Central ID URS00003FF759 (for example) maps to both taxa. Compare:

http://www.ebi.ac.uk/ena/data/view/Non-coding:AAFI02000003.1:1292629..1292834:ncRNA
http://www.ebi.ac.uk/ena/data/view/Non-coding:AJ699380.1:1..206:ncRNA

In Protein2GO, when entering the RNA Central ID, one needs to choose the taxon ID!
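To make the structure of the ENA identifiers quoted above explicit, here is a minimal Python sketch that splits one into the three parts described in point 1 (GenBank ID, location, molecule class). The names `EnaNonCodingId` and `parse_ena_noncoding_id` are hypothetical, and the pattern is inferred only from the two example IDs in this thread, not from any ENA specification; coordinates are assumed to be 1-based and inclusive.

```python
import re
from typing import NamedTuple


class EnaNonCodingId(NamedTuple):
    """Parts of an ENA non-coding ID (hypothetical helper type)."""
    accession: str       # sequence accession with version, e.g. AAFI02000003.1
    start: int           # start coordinate (assumed 1-based)
    end: int             # end coordinate (assumed inclusive)
    molecule_class: str  # e.g. ncRNA

    @property
    def length(self) -> int:
        # Length of the span under the 1-based, inclusive assumption.
        return self.end - self.start + 1


# Pattern inferred from the two example IDs only; the "Non-coding:" prefix
# that appears in the ENA view URLs is treated as optional.
_ENA_ID = re.compile(
    r"^(?:Non-coding:)?"
    r"(?P<accession>[A-Za-z0-9_]+\.\d+):"
    r"(?P<start>\d+)\.\.(?P<end>\d+):"
    r"(?P<cls>[A-Za-z_]+)$"
)


def parse_ena_noncoding_id(ena_id: str) -> EnaNonCodingId:
    """Split an ENA non-coding ID into accession, coordinates and class."""
    m = _ENA_ID.match(ena_id.strip())
    if m is None:
        raise ValueError(f"unrecognized ENA non-coding ID: {ena_id!r}")
    return EnaNonCodingId(
        m["accession"], int(m["start"]), int(m["end"]), m["cls"]
    )


ax4 = parse_ena_noncoding_id(
    "Non-coding:AAFI02000003.1:1292629..1292834:ncRNA"
)
root = parse_ena_noncoding_id("Non-coding:AJ699380.1:1..206:ncRNA")
print(ax4.accession, ax4.length)
print(root.accession, root.length)
```

Under these assumptions both example records span 206 positions, which is at least consistent with them describing the same RNA on two different GenBank entries. Note that nothing in the ID itself carries our gene ID, which is the mapping gap described in point 1.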

Curators cannot annotate to RNA Central IDs until we have worked out the mapping issues, so that we can show the annotations in dictyBase.

There is a related issue about RNA Central here: https://github.com/dictyBase/RNA-Export/issues/1

pfey03 commented 6 years ago

Maybe we should just treat it like any ID, for example: URS00003FF759_352472 (http://rnacentral.org/rna/URS00003FF759/352472). This is also in P2GO, and who cares if the AX4 taxon is included.
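A rough sketch of what "treat it like any ID" could look like in code: split the combined ID into its RNA Central part and taxon part, then build the link shown above. The names `RnaCentralXref`, `parse_xref` and `rnacentral_url` are made up for illustration, and the only URL layout assumed is the one in the example link.

```python
from typing import NamedTuple

# Base URL taken from the example link in this thread.
RNACENTRAL_BASE = "http://rnacentral.org/rna"


class RnaCentralXref(NamedTuple):
    """A taxon-specific RNA Central ID (hypothetical helper type)."""
    urs: str       # e.g. URS00003FF759
    taxon_id: int  # e.g. 352472 (D. discoideum AX4) or 44689 (root taxon)


def parse_xref(composite_id: str) -> RnaCentralXref:
    """Split an ID of the form <URS>_<taxon> into its two parts."""
    urs, sep, taxon = composite_id.strip().partition("_")
    if not sep or not urs.startswith("URS") or not taxon.isdigit():
        raise ValueError(f"not a <URS>_<taxon> ID: {composite_id!r}")
    return RnaCentralXref(urs, int(taxon))


def rnacentral_url(xref: RnaCentralXref) -> str:
    """Build the taxon-specific RNA Central link for a parsed ID."""
    return f"{RNACENTRAL_BASE}/{xref.urs}/{xref.taxon_id}"


xref = parse_xref("URS00003FF759_352472")
print(rnacentral_url(xref))
# prints http://rnacentral.org/rna/URS00003FF759/352472
```

Keeping the taxon as a separate field means the same URS can still be stored against 44689 later without changing the ID handling.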