biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

Update NCBI Protein #1061

Open IgorRodchenkov opened 3 months ago

IgorRodchenkov commented 3 months ago

Prefix

ncbiprotein

Explanation

I was digging into and cleaning up DrugBank data. It contains entries like this:

<external-identifier>
    <resource>GenBank Protein Database</resource>
    <identifier>33150626</identifier>
</external-identifier>

I had to guess which standard identifier collection (prefix) to use, and I found that both bioregistry.io/ncbiprotein:33150626 and bioregistry.io/genbank:33150626 resolve successfully to https://www.ncbi.nlm.nih.gov/protein/33150626.
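For reference, the lookup described above can be sketched in Python. This is only an illustration: the `CANDIDATE_PREFIXES` mapping and the `candidate_urls` helper are hypothetical names made up for this sketch, and the full DrugBank export may declare an XML namespace, in which case the tag lookups would need to include it.

```python
import xml.etree.ElementTree as ET

# The fragment quoted above, as it appears in the DrugBank data.
FRAGMENT = """
<external-identifier>
    <resource>GenBank Protein Database</resource>
    <identifier>33150626</identifier>
</external-identifier>
"""

# Hypothetical mapping from a DrugBank resource name to the prefixes
# that were tried; this is a guess, not an official mapping.
CANDIDATE_PREFIXES = {
    "GenBank Protein Database": ["ncbiprotein", "genbank"],
}


def candidate_urls(xml_text):
    """Build one bioregistry.io URL per candidate prefix for the fragment."""
    element = ET.fromstring(xml_text.strip())
    resource = element.findtext("resource")
    identifier = element.findtext("identifier")
    prefixes = CANDIDATE_PREFIXES.get(resource, [])
    return [f"https://bioregistry.io/{prefix}:{identifier}" for prefix in prefixes]


urls = candidate_urls(FRAGMENT)
for url in urls:
    print(url)
```

Both printed URLs are the ones mentioned above; the sketch only builds them and does not perform any network request.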

I think the CURIE "ncbiprotein:33150626" is wrong and should return a pattern mismatch error instead, no?
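To make the expected behavior concrete, here is a minimal sketch of a per-prefix pattern check. The prefixes `example_accession` and `example_gi` and their regular expressions are invented for illustration; they are assumptions, not the patterns actually registered in the Bioregistry for `ncbiprotein` or `genbank`.

```python
import re

# Hypothetical prefixes and patterns, for illustration only.
HYPOTHETICAL_PATTERNS = {
    # Accession-style identifiers: three letters, five digits, optional version.
    "example_accession": re.compile(r"[A-Z]{3}\d{5}(\.\d+)?"),
    # Purely numeric identifiers.
    "example_gi": re.compile(r"\d+"),
}


def check_curie(curie):
    """Return 'ok', 'pattern mismatch', or 'unknown prefix' for a CURIE."""
    prefix, _, identifier = curie.partition(":")
    pattern = HYPOTHETICAL_PATTERNS.get(prefix)
    if pattern is None:
        return "unknown prefix"
    # fullmatch requires the whole identifier to fit the pattern.
    if pattern.fullmatch(identifier) is None:
        return "pattern mismatch"
    return "ok"


print(check_curie("example_accession:33150626"))
print(check_curie("example_gi:33150626"))
```

Under these assumed patterns, a numeric identifier is rejected for the accession-style prefix and accepted for the numeric one, which is the kind of distinction the question above is asking about.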

Contributor ORCID

Igor Rodchenkov

cthoyt commented 3 months ago

Hi @IgorRodchenkov, thanks for looking into this. NCBI Protein is a special resource: from what I can tell, it resolves identifiers from a variety of other resource types. Can you give a bit more context on which DrugBank entry this comes from, and which protein it is expected to point to?