TranslatorSRI / Babel

Babel creates cliques of equivalent identifiers across many biomedical vocabularies.
MIT License

Add Reactome ReferenceRNASequence Identifiers #156

Open wumirose opened 1 year ago

wumirose commented 1 year ago

These ReferenceSequence identifiers are not normalizing:

| s/n | databaseName / example | source |
| --- | --- | --- |
| -. | ReferenceRNASequence | |
| 1 | "EMBL" e.g. "EMBL:AY446894" | https://reactome.org/content/detail/R-HCY-9614372 |
| 2 | "ENSEMBL" e.g. ENSEMBL:ENSOCUP00000014932, ENSEMBL:ENSRNOP00000006671 | https://reactome.org/content/detail/R-OCU-5610377, https://reactome.org/content/detail/R-RNO-420969 |
| 3 | "miRBase" e.g. miRBase:MI0003646, miRBase:MI0000074 | https://reactome.org/content/detail/R-HSA-9624909, https://reactome.org/content/detail/R-HSA-8945692 |
| 4 | "NCBI Nucleotide" e.g. NCBI Nucleotide:XM_006229567, NCBI Nucleotide:DQ897368, NCBI Nucleotide:NC_004718.3 | https://reactome.org/content/detail/R-RNO-9021225, https://reactome.org/content/detail/R-RNO-9021231, https://reactome.org/content/detail/R-COV-9685917 |
| 5 | "NCBI Entrez Gene" e.g. NCBI Entrez Gene:13229470 | https://reactome.org/content/detail/R-HCY-9638979 |
| 6 | "RNAcentral" e.g. RNAcentral:URS000022DD4A_9606, RNAcentral:URS000072540A_9606 | https://reactome.org/content/detail/R-HSA-9708409, https://reactome.org/content/detail/R-HSA-9708190 |
| -. | ReferenceDNASequence | |
| 1 | "ENSEMBL" e.g. ENSEMBL:ENST00000619177 | https://reactome.org/content/detail/R-HSA-6790030 |
| 2 | "EMBL" e.g. "EMBL:AY446894" | https://reactome.org/content/detail/R-HCY-9614372 |
| 3 | "NCBI Nucleotide" | |
| -. | ReferenceGeneProductSequence | |
| 1 | "NCBI_Protein" e.g. NCBI_Protein:74003783, NCBI_Protein:XP_422365 | https://reactome.org/content/detail/R-CFA-427880, https://reactome.org/content/detail/R-GGA-1169577 |
| 2 | "PRF" e.g. PRF:2003375B | https://reactome.org/content/detail/R-OCU-421933 |
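Some of the Reactome database names listed here contain spaces (for example "NCBI Nucleotide"), so the identifiers cannot be split on whitespace. Here is a minimal sketch of one way to separate the database name from the accession and group the examples by database. The function names `split_reactome_id` and `group_by_database` are hypothetical and are not part of Babel or Reactome.

```python
from collections import defaultdict


def split_reactome_id(raw: str) -> tuple[str, str]:
    """Split 'databaseName:accession' on the first colon only.

    Surrounding double quotes are stripped, and spaces inside the
    database name (e.g. 'NCBI Nucleotide') are preserved.
    """
    db, sep, accession = raw.strip().strip('"').partition(":")
    if not sep or not db.strip() or not accession.strip():
        raise ValueError(f"not a databaseName:accession string: {raw!r}")
    return db.strip(), accession.strip()


def group_by_database(raw_ids):
    """Group accessions by the database name they were reported under."""
    groups = defaultdict(list)
    for raw in raw_ids:
        db, accession = split_reactome_id(raw)
        groups[db].append(accession)
    return dict(groups)


# Examples copied from the list in this issue.
examples = [
    '"EMBL:AY446894"',
    "ENSEMBL:ENSOCUP00000014932",
    "NCBI Nucleotide:NC_004718.3",
    "NCBI Nucleotide:DQ897368",
    "RNAcentral:URS000022DD4A_9606",
]

for db, accessions in group_by_database(examples).items():
    print(db, accessions)
```

Grouping by the raw database name first makes it easier to see which names still need a prefix mapping before normalization can be attempted.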

EntityWithAccessionedSequence has four types: Protein, Gene and Transcript, DNA Sequence, and RNA Sequence. All of them normalize with UniProtKB, ENSEMBL:ENSG, ... prefixes, except RNA Sequence and a few DNA Sequence/Gene entries. We need to include the identifiers that currently do not normalize, such as those with the ENSEMBL:ENSOCUP..., ENSEMBL:ENST..., and RNAcentral:... prefixes.
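To illustrate why the ENSEMBL examples differ from the ENSEMBL:ENSG identifiers that already normalize, here is a rough sketch that reads the feature type out of an Ensembl stable ID. It assumes the usual Ensembl layout: `ENS`, an optional species code, a feature-type code, and 11 digits. The pattern and the `classify_ensembl` helper are hypothetical and are not taken from Babel.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: ENS + optional species code + feature code + 11 digits,
# optionally followed by ".version". The species code is matched lazily so
# that the feature code is taken from the letters right before the digits.
ENSEMBL_ID = re.compile(
    r"^ENS(?P<species>[A-Z]*?)(?P<feature>FM|GT|G|T|P|E|R)"
    r"(?P<number>\d{11})(?:\.(?P<version>\d+))?$"
)

FEATURE_TYPES = {
    "E": "exon",
    "FM": "protein family",
    "G": "gene",
    "GT": "gene tree",
    "P": "protein",
    "R": "regulatory feature",
    "T": "transcript",
}


class EnsemblId(NamedTuple):
    species_code: str  # empty string when no species code is present (human)
    feature: str


def classify_ensembl(accession: str) -> Optional[EnsemblId]:
    """Return the species code and feature type, or None if not Ensembl-like."""
    match = ENSEMBL_ID.match(accession)
    if match is None:
        return None
    return EnsemblId(match["species"], FEATURE_TYPES[match["feature"]])


# Accessions copied from the examples in this issue.
for accession in ["ENSOCUP00000014932", "ENSRNOP00000006671", "ENST00000619177"]:
    print(accession, classify_ensembl(accession))
```

Under this reading, the two identifiers reported under ReferenceRNASequence are protein IDs and the one under ReferenceDNASequence is a transcript ID, so none of them would be covered by a gene-only (ENSG) rule.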

cbizon commented 1 year ago

@wumirose can you provide a bit more context? Where did you find these identifiers and what is the effect of these not normalizing? That will help us prioritize this work.

wumirose commented 1 year ago

> @wumirose can you provide a bit more context? Where did you find these identifiers and what is the effect of these not normalizing? That will help us prioritize this work.

I added the sources and a few more details. I hope it helps.