Open cmungall opened 2 months ago
I have not absorbed your proposal quite yet, but
> bioregistry conflates URLs for humans with semantic URIs
While this is mostly true, it's not quite true conceptually:
"goche": {
  "contributor": {
    "email": "cthoyt@gmail.com",
    "github": "cthoyt",
    "name": "Charles Tapley Hoyt",
    "orcid": "0000-0003-4423-4370"
  },
  "description": "Represent chemical entities having particular CHEBI roles",
  "download_owl": "https://raw.githubusercontent.com/geneontology/go-ontology/master/src/ontology/imports/chebi_roles.owl",
  "example": "25512",
  "homepage": "https://github.com/geneontology/go-ontology",
  "name": "GO Chemicals",
  "pattern": "^\\d+$",
  "preferred_prefix": "GOCHE",
  "rdf_uri_format": "http://purl.obolibrary.org/obo/GOCHE_$1",
  "references": [
    "https://obo-communitygroup.slack.com/archives/C023P0Z304T/p1638472847049400",
    "https://github.com/geneontology/go-ontology/issues/19535"
  ],
  "repository": "https://github.com/geneontology/go-ontology",
  "synonyms": [
    "go.chebi",
    "go.chemical",
    "go.chemicals"
  ],
  "uri_format": "https://biopragmatics.github.io/providers/goche/$1"
},
Check rdf_uri_format.
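To make the distinction concrete, here is a minimal sketch that expands both templates from the goche record for its example identifier. It assumes `$1` is the placeholder for the local identifier, as the two format strings suggest; the `expand` helper is hypothetical and not part of any bioregistry API.

```python
# Values copied from the "goche" record quoted above.
record = {
    "example": "25512",
    "rdf_uri_format": "http://purl.obolibrary.org/obo/GOCHE_$1",
    "uri_format": "https://biopragmatics.github.io/providers/goche/$1",
}


def expand(template: str, local_id: str) -> str:
    """Hypothetical helper: substitute the local identifier for the $1 placeholder."""
    return template.replace("$1", local_id)


# The semantic URI (for RDF/OWL) and the human-facing URL differ for the same ID.
semantic_uri = expand(record["rdf_uri_format"], record["example"])
human_url = expand(record["uri_format"], record["example"])

print(semantic_uri)  # http://purl.obolibrary.org/obo/GOCHE_25512
print(human_url)     # https://biopragmatics.github.io/providers/goche/25512
```

So the record already keeps the two kinds of identifier in separate fields, at least for this entry.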
This does not entirely change the issue; it just adds an additional layer.
There is frequently a need to represent entities from a database as an ontology.
See:
There are a lot of factors to condense here, but some key points:
I propose that the bioregistry data model be extended to include inlined sub-records for ontology or KG translations of databases. These sub-records would carry additional metadata indicating the source (3rd party vs. official vs. quasi-official).
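As a rough illustration of what such a sub-record might look like, here is a sketch using Python dataclasses. Every class and field name here (`Source`, `Rendering`, `RegistryRecord`, `syntax`, `reminted_prefix`) is an assumption made for illustration, as is the `exampledb` prefix; none of it is taken from the actual bioregistry schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Source(Enum):
    """Who produced the rendering (the three statuses named in the proposal)."""
    OFFICIAL = "official"
    QUASI_OFFICIAL = "quasi-official"
    THIRD_PARTY = "3rd party"


@dataclass
class Rendering:
    """Hypothetical sub-record: one ontology or KG translation of a database."""
    syntax: str                      # e.g. "owl" or "kgx" (assumed vocabulary)
    source: Source
    download: Optional[str] = None   # where the rendering can be fetched, if known
    # Prefix under which IDs are re-minted; None means the database's own
    # prefix and URI expansion are reused unchanged.
    reminted_prefix: Optional[str] = None


@dataclass
class RegistryRecord:
    """Hypothetical, heavily trimmed stand-in for a bioregistry entry."""
    prefix: str
    renderings: list[Rendering] = field(default_factory=list)


record = RegistryRecord(
    prefix="exampledb",  # placeholder, not a real bioregistry prefix
    renderings=[
        Rendering(syntax="owl", source=Source.THIRD_PARTY, reminted_prefix="EXAMPLE"),
        Rendering(syntax="kgx", source=Source.OFFICIAL),
    ],
)

# Consumers could then filter renderings by provenance.
third_party = [r for r in record.renderings if r.source is Source.THIRD_PARTY]
```

Keeping the renderings inline on the database's own record, rather than as unrelated top-level entries, is what would let a consumer see at a glance which ontology views exist and how much to trust each.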
One case would be 3rd party ontology rendering with reminted prefixed IDs:
These renderings could even be first class entries as far as the bioregistry UI is concerned, e.g.
obo$NCBITaxon
(but obviously this wouldn't be used as a prefix)
Another would be 3rd party ontology renderings where the same prefixes and URL expansions are used:
Here there is no bespoke prefix map, so the standard RHEA ones would be used.
Perhaps controversially:
Here this would be a link between two existing, overlapping bioregistry entries.
This scheme could also be used for KG renderings of databases in formats that are more suited than OWL (e.g. kgx, rdfstar with owlstar semantics).
Note that for entries that are "born" ontologies we would not curate this info; this would be considered a reflexive relation.
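One way to read the reflexive rule is as a lookup with a fallback, sketched below with plain dicts. The `is_ontology` flag, the `renderings` key, and the `exampleonto` / `exampledb` prefixes are all hypothetical and only serve to illustrate the idea.

```python
def renderings_for(entry: dict) -> list[dict]:
    """Return the ontology renderings for a registry entry.

    A "born" ontology has nothing curated: it is treated as a rendering of
    itself (the reflexive case). Other entries return whatever was curated.
    """
    if entry.get("is_ontology", False):  # hypothetical flag
        return [{"rendering": entry["prefix"], "source": "reflexive"}]
    return entry.get("renderings", [])


born_ontology = {"prefix": "exampleonto", "is_ontology": True}
database = {
    "prefix": "exampledb",
    "renderings": [{"rendering": "exampledb.owl", "source": "3rd party"}],
}

print(renderings_for(born_ontology))
print(renderings_for(database))
```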