biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

Clarifying the distinction between URI expansion and PURL #666

Open matentzn opened 1 year ago

matentzn commented 1 year ago

It is absolutely critical for data integration to know the one form a CURIE should be expanded back to (as a URL).

For example, the chebi prefix should always be expanded to http://purl.obolibrary.org/obo/CHEBI_, which is what CHEBI.owl uses.
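To make the expectation concrete, here is a minimal sketch of a one-to-one CURIE expansion in Python. The `CANONICAL_URI_PREFIXES` map and the `expand`/`contract` helpers are hypothetical names made up for illustration, not part of Bioregistry; only the chebi URI prefix is taken from this thread.

```python
# Hypothetical prefix map: exactly one URI prefix per CURIE prefix.
# Only the chebi entry comes from this thread.
CANONICAL_URI_PREFIXES = {
    "chebi": "http://purl.obolibrary.org/obo/CHEBI_",
}


def expand(curie: str) -> str:
    """Expand a CURIE such as 'chebi:15377' to its single canonical URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    # Prefixes are compared case-insensitively in this sketch.
    uri_prefix = CANONICAL_URI_PREFIXES.get(prefix.lower())
    if uri_prefix is None:
        raise KeyError(f"no canonical URI prefix known for {prefix!r}")
    return uri_prefix + local_id


def contract(uri: str) -> str:
    """Inverse of expand(): turn a canonical URI back into a CURIE."""
    for prefix, uri_prefix in CANONICAL_URI_PREFIXES.items():
        if uri.startswith(uri_prefix):
            return f"{prefix}:{uri[len(uri_prefix):]}"
    raise ValueError(f"no known URI prefix matches {uri!r}")
```

With a single canonical prefix per entry, `expand` and `contract` round-trip, which is the property data integration depends on.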

It seems to me after listening to today's workshop that we should consider distinguishing between:

Maybe these really are different.

cmungall commented 1 year ago

🤘

balhoff commented 1 year ago

I agree—I came here to open an issue but found this existing one. I think Bioregistry should somewhere indicate what the canonical expanded URI form is for an identifier. For example, on the GO page (https://bioregistry.io/registry/go) there are a number of ways listed to turn a GO term CURIE into a URL, but nowhere does it state that the officially endorsed and preferred URI identifier form for a GO term is http://purl.obolibrary.org/obo/GO_nnnnnnn.

cthoyt commented 1 year ago

So maybe there are two axes we should annotate:

  1. Official suggestion from first party
  2. This is a PURL

I think it's totally reasonable to keep track of first-party annotations. In the future, I will also update the data model to include a tag indicating whether a URI format is appropriate for RDF usage.
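As a rough sketch of what annotating those two axes could look like, the Python below attaches a first-party flag and a PURL flag to each URI format string. The `URIFormat` class, its field names, the `pick_canonical` helper, the `$1` placeholder convention, and the example.org entry are all assumptions made for illustration; this is not the actual Bioregistry data model. Only the GO PURL form is taken from this thread.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class URIFormat:
    """One way to turn a local identifier into a URI (hypothetical model)."""

    template: str      # "$1" marks where the local identifier goes (assumed convention)
    first_party: bool  # axis 1: officially suggested by the first party
    is_purl: bool      # axis 2: the format is a PURL

    def expand(self, local_id: str) -> str:
        return self.template.replace("$1", local_id)


def pick_canonical(formats: Sequence[URIFormat]) -> Optional[URIFormat]:
    """Return a first-party format, preferring PURLs; None if there is none."""
    endorsed = [f for f in formats if f.first_party]
    if not endorsed:
        return None
    # max() keeps the first item among ties, so list order breaks ties.
    return max(endorsed, key=lambda f: f.is_purl)


GO_FORMATS = [
    # Made-up third-party resolver, present only to show a non-canonical format.
    URIFormat("https://example.org/go/$1", first_party=False, is_purl=False),
    # The form described above as officially endorsed for GO terms.
    URIFormat("http://purl.obolibrary.org/obo/GO_$1", first_party=True, is_purl=True),
]
```

Keeping the two flags separate means a format can be a PURL without being first-party endorsed, or the reverse, which is the distinction this issue is asking for.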

Can somebody take responsibility for writing a PR that explains what makes something a PURL and gives curation guidelines for looking at existing URI format strings and determining whether they're PURLs? This can go in https://github.com/biopragmatics/bioregistry/blob/main/src/bioregistry/app/templates/meta/summary.html as a first pass, but we could move it later.

matentzn commented 1 year ago

I've put it on my list to draft a blurb, but I am in a paper jam atm, so I'm not sure how quickly I will get to it.