biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

make namespace in pattern consistent for OBO #63

Closed: cmungall closed this issue 2 years ago

cmungall commented 3 years ago

See also #191

These have ns-in-pattern = False

http://bioregistry.io/registry/obi http://bioregistry.io/registry/ecto http://bioregistry.io/registry/fbbt ...

so the regex only covers the numeric part and the page shows:

Namespace in Pattern: False
Example Identifier: 0000001

These have ns-in-pattern = True

http://bioregistry.io/registry/go http://bioregistry.io/registry/cl http://bioregistry.io/registry/uberon

so the regex covers the complete CURIE, and the page (correctly) shows example CURIEs.

While technically it doesn't make a difference (other than, I guess, some performance impact when processing the regexes on large numbers of IDs), I think this should be normalized to True.
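To make the difference concrete, here is a minimal sketch of the two pattern styles and of what normalizing to True could look like. The function names and the seven-digit pattern are illustrative assumptions and not the Bioregistry's actual code or data; only the standard `re` module is used.

```python
import re


def is_valid_curie(curie: str, pattern: str, namespace_in_pattern: bool) -> bool:
    """Check a CURIE against a registry-style regex (hypothetical helper)."""
    _prefix, _, local_id = curie.partition(":")
    # If the pattern already includes the namespace, match the whole CURIE;
    # otherwise match only the part after the colon.
    target = curie if namespace_in_pattern else local_id
    return re.fullmatch(pattern, target) is not None


def normalize_pattern(prefix: str, pattern: str) -> str:
    """Rewrite a local-only pattern so that it covers the complete CURIE."""
    body = pattern
    if body.startswith("^"):
        body = body[1:]
    if body.endswith("$"):
        body = body[:-1]
    # Wrap the body in a non-capturing group so alternations stay intact.
    return f"^{re.escape(prefix)}:(?:{body})$"


# Illustrative pattern only: seven digits, as in the example identifier 0000001.
local_only = r"^\d{7}$"
full = normalize_pattern("OBI", local_only)

print(is_valid_curie("OBI:0000001", local_only, namespace_in_pattern=False))
print(is_valid_curie("OBI:0000001", full, namespace_in_pattern=True))
```

Both styles accept the same CURIE; the only change is whether the prefix is checked by the regex or stripped before matching.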

I think it's confusing for OBO entries to show example IDs that consist only of numbers, e.g.:

image

I know the legacy of this: the bioinformatics community has been terrible at IDs and inconsistent about what we mean by "ID", which ended up with the MGI:MGI issue. But at least within OBO we are clear and consistent: there is one canonical CURIE form and one canonical URI.

cthoyt commented 2 years ago

I've addressed this by making several tweaks to the interface.

  1. The namespace embedded in the LUI is now only discussed in the context of maintaining compatibility with MIRIAM (at least when it's self-consistent).
  2. The word "identifier" is no longer used alone. It's now listed as "local unique identifier", and I don't think anyone can argue against writing that without the prefix.
  3. There is now an example CURIE that shows what people are more likely expecting for OBO.
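As a rough sketch of how these three pieces relate, the snippet below keeps the local unique identifier free of the prefix and derives both the example CURIE and a MIRIAM-style identifier from it. The `Entry` class, its field names, the sample value `24867`, and the rule that the embedded namespace is the uppercased prefix are all assumptions made for illustration, not the Bioregistry's data model.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical registry entry, for illustration only."""

    prefix: str              # e.g. "chebi"
    example_lui: str         # local unique identifier, written without the prefix
    namespace_in_lui: bool   # MIRIAM-compatibility flag

    def example_curie(self) -> str:
        # The CURIE is always prefix + ":" + local unique identifier.
        return f"{self.prefix}:{self.example_lui}"

    def miriam_lui(self) -> str:
        # Assumption: when MIRIAM embeds the namespace in the LUI,
        # the namespace is the uppercased prefix followed by a colon.
        if self.namespace_in_lui:
            return f"{self.prefix.upper()}:{self.example_lui}"
        return self.example_lui


entry = Entry(prefix="chebi", example_lui="24867", namespace_in_lui=True)
print(entry.example_lui)      # shown as the "local unique identifier"
print(entry.example_curie())  # shown as the "example CURIE"
print(entry.miriam_lui())     # only relevant for MIRIAM compatibility
```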

Screenshot from https://bioregistry.io/registry/chebi:

Screen Shot 2022-01-05 at 12 47 34