biopragmatics / curies

🐸 Idiomatic conversion between URIs and compact URIs (CURIEs) in Python
https://curies.readthedocs.io
MIT License

Upgrading (non-bijective) prefix maps #99

Closed cthoyt closed 9 months ago

cthoyt commented 9 months ago

Motivated by https://github.com/mapping-commons/sssom-py/pull/485. Demo:

>>> from curies import Converter, upgrade_prefix_map
>>> pm = {"a": "https://example.com/a/", "b": "https://example.com/a/"}
>>> records = upgrade_prefix_map(pm)
>>> converter = Converter(records)

>>> converter.expand("a:1")
'https://example.com/a/1'

>>> converter.expand("b:1")
'https://example.com/a/1'

>>> converter.compress("https://example.com/a/1")
'a:1'

This function is for people who are not in a position to make the sustainable fix and want to automate the choice of preferred prefix. It uses a deterministic algorithm to choose among two or more CURIE prefixes that share the same URI prefix, and it generates an extended prefix map in which they have been collapsed into a single record. More specifically, the algorithm is based on a case-sensitive lexical sort of the prefixes: the first in the sort order becomes the primary prefix and the others become synonyms in the resulting record.
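
To make the selection rule concrete, here is a minimal standalone sketch of the collapsing step as described above. It is not the code in this PR: `collapse_prefix_map` is a hypothetical helper, and the plain-dict records with `prefix`, `prefix_synonyms`, and `uri_prefix` keys are an assumption chosen for illustration. It relies on Python's default string ordering, which is case-sensitive, so uppercase prefixes sort before lowercase ones.

```python
from collections import defaultdict


def collapse_prefix_map(prefix_map):
    """Collapse CURIE prefixes that share a URI prefix into one record each.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    # Group CURIE prefixes by the URI prefix they point to.
    by_uri_prefix = defaultdict(list)
    for prefix, uri_prefix in prefix_map.items():
        by_uri_prefix[uri_prefix].append(prefix)

    records = []
    for uri_prefix, prefixes in by_uri_prefix.items():
        # Case-sensitive lexical sort: the first prefix becomes primary,
        # the remaining ones become synonyms.
        primary, *synonyms = sorted(prefixes)
        records.append(
            {
                "prefix": primary,
                "prefix_synonyms": synonyms,
                "uri_prefix": uri_prefix,
            }
        )
    return records


pm = {
    "b": "https://example.com/a/",
    "a": "https://example.com/a/",
    "B": "https://example.com/a/",
    "c": "https://example.com/c/",
}
records = collapse_prefix_map(pm)
```

With this input, the three prefixes pointing at `https://example.com/a/` collapse into one record whose primary prefix is `B` (uppercase sorts first), with `a` and `b` as synonyms, while `c` keeps its own record with no synonyms. The result does not depend on the order of the input dictionary, which is what makes the choice deterministic.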


cc @joeflack4