biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

Many INSDC bioproject IDs resolve to a useless page (DDBJ issue?) #118

Closed cmungall closed 3 years ago

cmungall commented 3 years ago

Related:

bioproject:PRJNA594403 should be a valid CURIE

But if I resolve https://bioregistry.io/bioproject:PRJNA594403

it goes to http://trace.ddbj.nig.ac.jp/BPSearch/bioproject?acc=PRJNA594403 which says "no project"

(same for identifiers.org or n2t)

I don't understand why it's trying to resolve to DDBJ; the project wasn't even registered there. See this handy guide to the hieroglyphics of the INSDC IDs: https://ena-docs.readthedocs.io/en/latest/submit/general-guide/accessions.html

I don't know what is up with DDBJ. Do we have a contact there?

I suggest resolving INSDC IDs using EBI or NCBI for now. E.g., this works:

https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA594403
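
For illustration, here is a minimal sketch of how the registering archive could be guessed from the accession itself. It assumes the project pattern `PRJ(E|D|N)[A-Z][0-9]+` from the ENA guide linked above, with the fourth letter read as the archive (E for ENA, D for DDBJ, N for NCBI). The function name and the mapping are hypothetical and not part of Bioregistry.

```python
import re
from typing import Optional

# Assumed pattern, taken from the ENA accession guide: PRJ(E|D|N)[A-Z][0-9]+
_BIOPROJECT_RE = re.compile(r"PRJ([EDN])[A-Z][0-9]+")

# Hypothetical mapping from the fourth letter to the registering archive.
_ARCHIVES = {"E": "ENA", "D": "DDBJ", "N": "NCBI"}


def guess_archive(accession: str) -> Optional[str]:
    """Guess which INSDC archive registered a BioProject accession.

    Returns None if the accession does not match the assumed pattern.
    """
    match = _BIOPROJECT_RE.fullmatch(accession)
    if match is None:
        return None
    return _ARCHIVES[match.group(1)]
```

Under these assumptions, `guess_archive("PRJNA594403")` points at NCBI, which matches the link that works.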

cthoyt commented 3 years ago

Hi Chris, thanks for the feedback. I just updated the default provider in 336ebd6 based on your suggestion. I'm not sure why it resolves this way. Unfortunately, the DDBJ page doesn't return a 404 status code, so the tests that check that all of the providers are working never failed on this one. Is there anything else I should do to resolve this one?
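
One way a provider check could catch this kind of "soft 404" is to look at the page body as well as the status code. The following is only a sketch, not the existing Bioregistry test: the function name and the marker list are hypothetical, and the only marker used is the "no project" text quoted above.

```python
from typing import Iterable

# Hypothetical marker phrases; "no project" is the text reported for the DDBJ page.
SOFT_404_MARKERS = ("no project",)


def is_soft_404(
    status_code: int,
    body: str,
    markers: Iterable[str] = SOFT_404_MARKERS,
) -> bool:
    """Return True if a response has a 2xx status but its body says "not found"."""
    if not 200 <= status_code < 300:
        # A real error status is already caught by an ordinary status check.
        return False
    text = body.lower()
    return any(marker.lower() in text for marker in markers)
```

With an HTTP client such as `requests`, this could be called as `is_soft_404(res.status_code, res.text)` after fetching a provider URL for an example identifier. The marker phrases would need to be maintained per provider, so this is a heuristic at best.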

cmungall commented 3 years ago

Thanks! There are broader questions re INSDC vs partner dbs, but these are covered in other tickets, so I will close this.