Closed matentzn closed 5 months ago
You might need to share more details about your environment, because I cannot replicate here.
I tried in a clean virtualenv with the latest oaklib 0.5.25, and it works just fine.
I also tried with a clean virtualenv set up with babelon 0.2.4 (in case the problem came from a Babelon-specific dependency); same result: no errors at all.
The joy of pip install -U. :/ Thanks for making me think in this direction (other dependencies). It was, indeed, an older 0.6.X curies version that caused the issue. Sorry about the noise.
Reopening as it was indeed an issue. This does not work:
from oaklib import get_adapter
example = """
format-version: 1.2
data-version: hp/releases/2024-02-25
default-namespace: human_phenotype
idspace: dc http://purl.org/dc/elements/1.1/
idspace: oboInOwl http://www.geneontology.org/formats/oboInOwl#
idspace: owl http://www.w3.org/2002/07/owl#
idspace: rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
idspace: rdfs http://www.w3.org/2000/01/rdf-schema#
idspace: terms http://purl.org/dc/terms/
idspace: xml http://www.w3.org/XML/1998/namespace
idspace: xsd http://www.w3.org/2001/XMLSchema#
ontology: hp.obo
[Term]
id: HP:0000001
name: All
"""
file_path = "example.obo"
# Open the file in write mode ('w'). This will create the file if it does not exist
# or overwrite it if it does.
with open(file_path, 'w') as file:
    # Write the string to the file
    file.write(example)
adapter = get_adapter("pronto:example.obo")
m = adapter.entity_metadata_map("HP:0000001")
print(m)
If you remove the line
idspace: oboInOwl http://www.geneontology.org/formats/oboInOwl#
it works. This suggests that we need to handle this case somehow before @balhoff's PR is merged.
As far as I understand, the problem is as follows:
1) The BasicOntologyInterface’s prefix_map() default implementation creates a default prefix map made of the “OBO context”. Presumably the OBO context map contains an entry oio -> http://www.geneontology.org/formats/oboInOwl#.
2) The ProntoImplementation’s __post_init__() method adds to that default prefix map the prefixes declared in the OBO file’s idspace tags:
for prefix, expansion in ontology.metadata.idspaces.items():
    self.prefix_map()[prefix] = expansion[0]
(The SimpleOboImplementation does the same thing.)
3) Now the prefix map contains both oio -> http://www.geneontology.org/formats/oboInOwl# (from the OBO context) and oboInOwl -> http://www.geneontology.org/formats/oboInOwl# (from the ontology’s own map).
4) The curies converter does not like that at all and errors out.
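The four steps above can be reproduced without oaklib or curies. The following is a minimal sketch, assuming an OBO context that contains only the oio entry. The names `OBO_CONTEXT`, `IDSPACES`, `merge_naively`, and `find_duplicate_uri_prefixes` are hypothetical, and the duplicate check is only a stand-in for what the curies converter does, not its actual code.

```python
from collections import defaultdict

# Assumed stand-in for the default "OBO context" prefix map (step 1).
OBO_CONTEXT = {"oio": "http://www.geneontology.org/formats/oboInOwl#"}

# Prefixes declared by the idspace tags of the example file (abridged).
IDSPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
}


def merge_naively(context, idspaces):
    """Copy every idspace declaration into the context map (step 2)."""
    merged = dict(context)
    for prefix, uri_prefix in idspaces.items():
        merged[prefix] = uri_prefix
    return merged


def find_duplicate_uri_prefixes(prefix_map):
    """Return the URI prefixes that more than one prefix name points to."""
    names_by_uri = defaultdict(list)
    for prefix, uri_prefix in prefix_map.items():
        names_by_uri[uri_prefix].append(prefix)
    return {uri: names for uri, names in names_by_uri.items() if len(names) > 1}


merged = merge_naively(OBO_CONTEXT, IDSPACES)        # step 3
duplicates = find_duplicate_uri_prefixes(merged)     # what step 4 trips over
print(duplicates)
```

The only URI prefix reported is the oboInOwl one, reached through both oio and oboInOwl, which matches the two records listed at the end of the traceback.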
I am not sure I understand why having two prefix names pointing to the same URL prefix must be an error. I understand that the other way round (the same prefix name pointing to two different URL prefixes) would obviously be wrong, but that cannot happen here, since existing prefix names in the OBO context would automatically be replaced by the declared prefix name. In this direction, though, I do not see why it has to be an error.
Anyway, if we do consider it wrong to have two prefix names pointing to the same URL prefix, both the Pronto and the SimpleOBO implementations must be amended, because the two-line snippet highlighted above is too naive: instead of simply adding the content of the idspace declaration to the existing prefix map, it must first check whether the prefix map already contains another prefix name pointing to the same URL prefix, and remove it.
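A minimal sketch of that amended merge, using plain dictionaries rather than the actual oaklib objects. `merge_idspaces` is a hypothetical helper, and the idspace values are plain strings here, whereas the snippet above reads `expansion[0]`.

```python
def merge_idspaces(prefix_map, idspaces):
    """Add idspace declarations to prefix_map in place.

    Before adding a declared prefix, drop any other prefix name that
    already points to the same URL prefix, so the declared name wins.
    """
    for prefix, uri_prefix in idspaces.items():
        stale = [
            name
            for name, existing in prefix_map.items()
            if existing == uri_prefix and name != prefix
        ]
        for name in stale:
            del prefix_map[name]
        prefix_map[prefix] = uri_prefix
    return prefix_map


# Assumed, abridged stand-in for the OBO context.
prefix_map = {
    "oio": "http://www.geneontology.org/formats/oboInOwl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
idspaces = {"oboInOwl": "http://www.geneontology.org/formats/oboInOwl#"}

result = merge_idspaces(prefix_map, idspaces)
print(result)
```

With this approach the name declared in the file replaces oio, so any code that still emits oio CURIEs would no longer expand them. Whether that trade-off is acceptable is a separate question.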
By the way, anyone could run into this problem at any time, independently of @balhoff’s PR. His PR merely makes it more likely to come across OBO files containing idspace tags, but anyone can already craft OBO files with such tags if they want.
Solution to this: In basic_ontology_interface.py, this line needs to be
self._converter = curies.Converter.from_prefix_map(self.prefix_map(), strict=False)
This asks the curies package to be less strict and allow duplicate prefixes. As you can see, it's an easy fix.
The questions are:
(strict=True as of now, obviously following the lead from the curies package)

cc: @cmungall
Another possible fix would be to fix the way the prefix map is constructed. If we built it the way we do in sssom-py (with ChainMap), it would allow creating a prefix map with precedence rules that would result in a consistent final product. I assume that having conflicting prefix maps (multiple prefixes for the same URI) could be confusing in day-to-day business.
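To illustrate the precedence idea, here is a minimal sketch built on the standard library's `collections.ChainMap`. It is not taken from sssom-py: `flatten_unique` and the two input maps are hypothetical. ChainMap by itself only resolves clashes on the prefix name, so the extra pass over URI prefixes is an assumption about what a "consistent final product" would require.

```python
from collections import ChainMap

OIO = "http://www.geneontology.org/formats/oboInOwl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Assumed, abridged inputs: the file's own idspaces and the OBO context.
ontology_idspaces = {"oboInOwl": OIO}
obo_context = {"oio": OIO, "rdfs": RDFS}

# Maps listed first take precedence when the same prefix name is looked up.
layered = ChainMap(ontology_idspaces, obo_context)


def flatten_unique(layered_map):
    """Flatten a ChainMap so each prefix and each URI prefix appears once.

    Maps are visited from highest to lowest priority; an entry is skipped
    if its prefix name or its URI prefix was already claimed.
    """
    flat = {}
    seen_uri_prefixes = set()
    for mapping in layered_map.maps:
        for prefix, uri_prefix in mapping.items():
            if prefix in flat or uri_prefix in seen_uri_prefixes:
                continue
            flat[prefix] = uri_prefix
            seen_uri_prefixes.add(uri_prefix)
    return flat


consistent = flatten_unique(layered)
print(consistent)
```

Here the ontology's own declaration wins, so oio is dropped and rdfs is carried over from the context. Swapping the order of the two maps in the ChainMap would keep oio instead.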
Version: oaklib 0.5.25
Replicates with both pronto and simpleobo adapters
Minimal test
Error: DuplicateURIPrefixes
```
DuplicateURIPrefixes                      Traceback (most recent call last)
Cell In[16], line 22
     19 file.write(example)
     21 adapter = get_adapter("simpleobo:example.obo")
---> 22 m = adapter.entity_metadata_map("HP:0000001")

File ~/.pyenv/versions/3.11.7/envs/babelon/lib/python3.11/site-packages/oaklib/implementations/simpleobo/simple_obo_implementation.py:620, in SimpleOboImplementation.entity_metadata_map(self, curie)
    618 m[DEPRECATED_PREDICATE].append(True)
    619 m[HAS_OBSOLESCENCE_REASON].append(TERMS_MERGED)
--> 620 self.add_missing_property_values(curie, m)
    621 return dict(m)

File ~/.pyenv/versions/3.11.7/envs/babelon/lib/python3.11/site-packages/oaklib/interfaces/basic_ontology_interface.py:1460, in BasicOntologyInterface.add_missing_property_values(self, curie, metadata_map)
   1458 if PREFIX_PREDICATE not in metadata_map:
   1459 metadata_map[PREFIX_PREDICATE] = [prefix]
-> 1460 uri = self.curie_to_uri(curie, False)
   1461 if uri:
   1462 if URL_PREDICATE not in metadata_map:

File ~/.pyenv/versions/3.11.7/envs/babelon/lib/python3.11/site-packages/oaklib/interfaces/basic_ontology_interface.py:240, in BasicOntologyInterface.curie_to_uri(self, curie, strict)
    238 raise ValueError(f"Invalid CURIE: {curie}")
    239 return None
--> 240 rv = self.converter.expand(curie)
    241 if rv is None and strict:
...
http://www.geneontology.org/formats/oboInOwl#:
prefix='oio' uri_prefix='http://www.geneontology.org/formats/oboInOwl#' prefix_synonyms=[] uri_prefix_synonyms=[] pattern=None
prefix='oboInOwl' uri_prefix='http://www.geneontology.org/formats/oboInOwl#' prefix_synonyms=[] uri_prefix_synonyms=[] pattern=None
```