Closed cthoyt closed 3 years ago
this is great - broader than OBO though (NCBIGene isn't in OBO!)
NCBIGene is a bit of an outlier here since there is potential confusion about this prefix as a whole (e.g. NCBI just calls it "gene")
Canonical examples: FlyBase, WormBase, ...
Is this any different from the preferred casing issue? We need a way to record that the preferred CURIE for SGD is SGD:Snnnn, not sgd:Snnnn. Seems trivial, but it is required if merges free of string manipulation are to be obtained.
I think this is probably the same as the preferred casing issue. I would like the Bioregistry's normalized prefix to always match the result of normalizing the preferred prefix.
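To make that invariant concrete, here is a minimal sketch of what such a check could look like. The `normalize` function below is an assumption for illustration (lowercase and drop a few separator characters), not necessarily the Bioregistry's actual normalization rule, and the records are hand-written examples rather than real registry entries.

```python
# Assumption: normalization means lowercasing and dropping common separators.
# This is a hypothetical stand-in, not the Bioregistry's real implementation.
SEPARATORS = "._-/ "


def normalize(prefix: str) -> str:
    """Return a lowercase prefix with separator characters removed."""
    return "".join(ch for ch in prefix.lower() if ch not in SEPARATORS)


def check_preferred(canonical: str, preferred: str) -> bool:
    """Check that normalizing the preferred prefix gives the canonical prefix."""
    return normalize(preferred) == canonical


# Hand-written illustrative records: canonical prefix -> preferred prefix.
records = {"ncbigene": "NCBIGene", "sgd": "SGD", "flybase": "FlyBase"}

# Collect canonical prefixes whose preferred form breaks the invariant.
violations = [c for c, p in records.items() if not check_preferred(c, p)]
```

With a check like this, a preferred prefix such as `EntrezGene` for `ncbigene` would be flagged, since it does not normalize back to the canonical prefix.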
The Bioregistry already supports the usage of different "profiles" depending on whether you're in the ontology world (e.g., you want to use OBO PURLs), the systems biology modelling world (e.g., you want to use Identifiers.org IRIs), etc.
However, the OBO Foundry had the good idea to store both a canonical prefix and a stylized prefix (i.e., the `preferredPrefix` field). It would be nice to add this to the Bioregistry as well to allow for writing them in a stylized way for certain downstream uses, especially to capture prefixes like `ncbigene`, which is often written as `NCBIGene` but does not itself appear in the OBO Foundry registry.
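As a rough sketch of how this could work, each record could carry an optional preferred prefix that falls back to the canonical one. The `Resource` class, the `preferred_prefix` field name, `REGISTRY`, and `restyle_curie` are all hypothetical names chosen for illustration, and the identifiers are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical registry record with an optional stylized prefix."""

    prefix: str  # canonical (normalized) prefix
    preferred_prefix: Optional[str] = None  # stylized form, if any

    def get_preferred_prefix(self) -> str:
        # Fall back to the canonical prefix when no stylized form is recorded.
        return self.preferred_prefix or self.prefix


# Illustrative registry; "example" is a made-up prefix with no stylized form.
REGISTRY = {
    "ncbigene": Resource("ncbigene", "NCBIGene"),
    "sgd": Resource("sgd", "SGD"),
    "example": Resource("example"),
}


def restyle_curie(curie: str) -> str:
    """Rewrite a CURIE so that it uses the preferred prefix."""
    prefix, identifier = curie.split(":", 1)
    resource = REGISTRY[prefix.lower()]
    return f"{resource.get_preferred_prefix()}:{identifier}"
```

Here `restyle_curie("ncbigene:12345")` would give `NCBIGene:12345`, while a prefix without a stylized form is left in its canonical spelling.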