Closed joeflack4 closed 2 years ago
First, it's interesting to note you are hacking in an ICD10WHO prefix since that's not in the Bioregistry. There's an ongoing discussion about ICD prefixes in https://github.com/biopragmatics/bioregistry/issues/251 and a list of existing ICD prefixes at https://bioregistry.io/collection/0000004. I'm happy to accept suggestions to add additional prefixes if there's a compelling case why they are different from existing prefixes (though nobody has yet written their thoughts in a cohesive, actionable way).
>>> curie_from_iri(
...     'https://icd.who.int/browse10/2019/en#/R63.8',
...     prefix_map={'ICD10WHO': 'https://icd.who.int/browse10/2019/en#/'},
... )
'ICD10WHO:R63.8'
>>> curie_from_iri(
...     '<https://icd.who.int/browse10/2019/en#/R63.8>',
...     prefix_map={'ICD10WHO': 'https://icd.who.int/browse10/2019/en#/'},
... )
None
Valid IRIs don't have chevrons <> around them. Perhaps these are artifacts from directly reading an RDF document? You can simply strip your string with s.lstrip("<").rstrip(">") so that you can retrieve a valid IRI during pre-processing of your data. I think it's reasonable for bioregistry.curie_from_iri() to continue to accept only valid IRIs, so we're not going to do anything to address this within the Bioregistry package.
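The pre-processing step described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration: strip_chevrons and simple_curie_from_iri are hypothetical helper names, and simple_curie_from_iri is only a stand-in for a prefix-map lookup, not the actual implementation of bioregistry.curie_from_iri().

```python
from typing import Mapping, Optional


def strip_chevrons(iri: str) -> str:
    """Remove one enclosing <...> pair, as found around IRIs in RDF serializations."""
    iri = iri.strip()
    if iri.startswith("<") and iri.endswith(">"):
        return iri[1:-1]
    return iri


def simple_curie_from_iri(iri: str, prefix_map: Mapping[str, str]) -> Optional[str]:
    """Hypothetical stand-in for a prefix-map lookup (assumption, not Bioregistry code)."""
    # Try longer URI prefixes first so the most specific match wins.
    for prefix, uri_prefix in sorted(prefix_map.items(), key=lambda kv: -len(kv[1])):
        if iri.startswith(uri_prefix):
            return f"{prefix}:{iri[len(uri_prefix):]}"
    return None


prefix_map = {"ICD10WHO": "https://icd.who.int/browse10/2019/en#/"}
raw = "<https://icd.who.int/browse10/2019/en#/R63.8>"

print(simple_curie_from_iri(raw, prefix_map))                  # None: chevrons block the match
print(simple_curie_from_iri(strip_chevrons(raw), prefix_map))  # ICD10WHO:R63.8
```

Unlike s.lstrip("<").rstrip(">"), this helper removes only a single matched pair, so a string with just one stray chevron is left untouched for inspection.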
Perhaps the eventual solution is to refactor bioregistry to use oaklib for this.
The Bioregistry is a general tool, and oaklib is an ontology-specific tool (with many OBO-specific and even project-specific assumptions), so this doesn't make sense. That being said, there are a lot of tools built in to the Bioregistry to support ontology/OBO-specific use cases.
Further, the Bioregistry doesn't have any major dependencies for its normal functionality, and it is advantageous to keep it that way so it can be better integrated in other projects.
First, it's interesting to note you are hacking in an ICD10WHO prefix since that's not in the Bioregistry.
Nico recommended that I use bioregistry as more of a library in the short term, like OAK. I understand that this isn't the primary use case.
Regarding the ICD10WHO prefix itself, it tends to be the same as "ICD10". However, for Mondo work, we decided to add the WHO part for disambiguation, as we were having issues where sometimes ICD10 meant ICD10CM, and other times it meant the WHO variation (and perhaps other variations). This IMO is a mistake on WHO's end.
ongoing discussion about ICD prefixes in https://github.com/biopragmatics/bioregistry/issues/251 ... I'm happy to accept suggestions to add additional prefixes
I looked at the issue, and I remember seeing that before. Actually, it looks like the ICD10WHO prefix name is thoroughly discussed in there. As far as the URI goes, Mondo has moved from https://icd.who.int/browse10/2010/en#/ to https://icd.who.int/browse10/2019/en#/ (updated year). Actually @matentzn, unfortunately this does not seem very stable, as the year seems somewhat arbitrary. We don't maintain ICD10WHO of course, so I'm not sure if there is anything better that we can do other than periodically use the latest browser as our prefix URI.
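To illustrate the year instability, here is one possible way to tolerate it on the consuming side. This is only a sketch under assumptions: the URL pattern is generalized from the two URIs mentioned above (2010 and 2019), icd10who_curie is a hypothetical helper name, and nothing here reflects what Mondo or the Bioregistry actually do.

```python
import re
from typing import Optional

# Assumed URL shape: only the four-digit year segment varies between browser editions.
ICD10WHO_IRI = re.compile(
    r"^https://icd\.who\.int/browse10/(?P<year>\d{4})/en#/(?P<code>.+)$"
)


def icd10who_curie(iri: str) -> Optional[str]:
    """Return an ICD10WHO CURIE for any year-versioned browser IRI, else None."""
    match = ICD10WHO_IRI.match(iri)
    if match is None:
        return None
    return f"ICD10WHO:{match.group('code')}"


print(icd10who_curie("https://icd.who.int/browse10/2010/en#/R63.8"))
print(icd10who_curie("https://icd.who.int/browse10/2019/en#/R63.8"))
```

Both calls give the same CURIE, so a change of year in the browser URI would not change the identifiers derived from it. The trade-off is that this treats all editions as interchangeable, which may not hold if codes differ between years.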
Valid IRIs don't have chevrons <> around them. Perhaps these are artifacts from directly reading an RDF document? You can simply strip your string... ...That being said, there are a lot of tools built in to the Bioregistry to support ontology/OBO-specific use cases.
They are and I eventually did. I was recommended to use this for a library use case; I understand that that isn't the primary intended use case. But it looks like you are saying (RE: ontology/OBO) that ontology engineering library functions are an intended use case. If so, then working around these chevrons should be supported.
Bioregistry doesn't have any major dependencies
I'm surprised. But I hear you there. If there's not a major gain to including additional dependencies, might as well leave them out.
Overview
I know that we are moving to OAK, but I've been having a lot of trouble getting my OAK use cases to work so far, and @matentzn recommended that I try this bioregistry function for now.
Example
Possible solutions
Perhaps the eventual solution is to refactor bioregistry to use oaklib for this.