mapping-commons / sssom-py

Python toolkit for SSSOM mapping format
https://mapping-commons.github.io/sssom-py/index.html#
MIT License

Extended prefix map support #496

Open joeflack4 opened 7 months ago

joeflack4 commented 7 months ago

Overview

There are namespace situations that simply cannot be handled by a bimap. For these situations (e.g. prefix or uri_prefix synonyms), we need extended prefix map (EPM) support.

Context

We've already run into problems caused by the lack of this support:

matentzn commented 7 months ago

We need a clear driving use case for this. I think the parse command absolutely needs to support EPMs, perhaps the merge command as well, but other than that, I am not convinced. For publishing SSSOM files, we certainly need the good old unambiguous bimap.

joeflack4 commented 7 months ago

Aren't synonyms a clear enough use case? What do people do when they have synonyms? Would we ask them to merge them in preprocessing?

matentzn commented 7 months ago

Can you give an example? Unfortunately the word synonym has too many meanings. Are you talking about the case where the same user wants to use ICD10:Q10 and ICD10CM:Q10, and both refer to the same URI?

joeflack4 commented 7 months ago

In the OP I mentioned accounting for either prefix or URI prefix synonyms. Your example is of a prefix synonym. But yes, that is the primary case I am concerned about and think will come up more often.
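
To make the prefix-synonym case concrete, here is a minimal sketch in plain Python. It does not use any sssom-py or curies API; the `example.org` URI prefix and the names `prefix_map`, `invert`, and `epm_record` are hypothetical and chosen only for illustration. It shows that inverting a prefix map with two prefixes for one URI prefix silently loses one of them, which is why such a map cannot be a bimap, and what an EPM-style record might keep instead.

```python
# Hypothetical prefix map: two prefixes are synonyms for one URI prefix.
prefix_map = {
    "ICD10": "http://example.org/icd10/",
    "ICD10CM": "http://example.org/icd10/",
}


def invert(pm):
    """Build the reverse (URI prefix -> prefix) half of a bimap."""
    return {uri_prefix: prefix for prefix, uri_prefix in pm.items()}


reverse_map = invert(prefix_map)

# A bimap must be one-to-one; here the inversion dropped one of the prefixes.
is_bijective = len(reverse_map) == len(prefix_map)
print(is_bijective)

# An EPM-style record (assumed shape) keeps one canonical pair and lists
# the alternatives explicitly instead of losing them.
epm_record = {
    "prefix": "ICD10",
    "uri_prefix": "http://example.org/icd10/",
    "prefix_synonyms": ["ICD10CM"],
    "uri_prefix_synonyms": [],
}
```

Which prefix should be canonical is a curation decision; the sketch picks `ICD10` arbitrarily.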

matentzn commented 7 months ago

My argument against this is that there are few reasonable scenarios where synonyms should be allowed. I would need to see a real case (other than merge and parse) for publishing a mapping set with synonyms. I can't think right now of any good reason to do it.

joeflack4 commented 7 months ago

Hmm I see. Well if that's the case, then I'm out of ideas and you could probably close this issue. I'm just surprised that we would advocate for EPMs generally, in OAK, etc, but can't find a use case for it in SSSOM!

matentzn commented 7 months ago

I am not trying to be dismissive or anything - if you have a concrete use case, describe it and we will talk about it. Any scenario that involves merging data from multiple sources needs EPMs, because there may be inconsistent use of IRIs and CURIE prefixes. Hence their need in sssom parse and oak/semsql (where you may interact with merged ontologies).
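
The merge scenario can be sketched as follows. This is a rough illustration in plain Python, not the sssom-py or curies implementation: the `EPM` structure, the `example.org` URI prefixes, and the helper names `build_lookups`, `standardize`, and `to_bimap` are all assumptions made for the example. The idea is that identifiers arriving from different sources (synonymous CURIE prefixes, IRIs with variant URI prefixes) are normalized to one canonical CURIE on the way in, while the published file still only needs the plain bimap.

```python
# Hypothetical EPM: one canonical prefix/URI prefix per record, plus synonyms.
EPM = [
    {
        "prefix": "ICD10",
        "uri_prefix": "http://example.org/icd10/",
        "prefix_synonyms": ["ICD10CM"],
        "uri_prefix_synonyms": ["https://example.org/icd10/"],
    },
]


def build_lookups(epm):
    """Index every prefix and URI prefix (canonical or synonym) by its record."""
    by_prefix, by_uri_prefix = {}, {}
    for record in epm:
        for prefix in [record["prefix"], *record["prefix_synonyms"]]:
            by_prefix[prefix] = record
        for uri_prefix in [record["uri_prefix"], *record["uri_prefix_synonyms"]]:
            by_uri_prefix[uri_prefix] = record
    return by_prefix, by_uri_prefix


def standardize(identifier, epm):
    """Return the canonical CURIE for a CURIE or IRI, or None if unknown."""
    by_prefix, by_uri_prefix = build_lookups(epm)
    # Try IRIs first, longest URI prefix first so nested namespaces resolve.
    for uri_prefix in sorted(by_uri_prefix, key=len, reverse=True):
        if identifier.startswith(uri_prefix):
            local_id = identifier[len(uri_prefix):]
            return f"{by_uri_prefix[uri_prefix]['prefix']}:{local_id}"
    # Otherwise treat the identifier as a CURIE and map its prefix.
    prefix, sep, local_id = identifier.partition(":")
    if sep and prefix in by_prefix:
        return f"{by_prefix[prefix]['prefix']}:{local_id}"
    return None


def to_bimap(epm):
    """Drop synonyms to get the unambiguous prefix map used for publishing."""
    return {record["prefix"]: record["uri_prefix"] for record in epm}


# Subjects as they might appear after merging three differently curated sources.
merged_subjects = ["ICD10:Q10", "ICD10CM:Q10", "https://example.org/icd10/Q10"]
canonical = {standardize(s, EPM) for s in merged_subjects}
print(canonical)
print(to_bimap(EPM))
```

Under these assumptions the EPM is only needed on the input side (parse, merge); the output side keeps the one-to-one prefix map that the earlier comments ask for.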