mapping-commons / semantic-mapping-vocabulary

https://mapping-commons.github.io/semantic-mapping-vocabulary/

Proposal for a new classification of mapping relations #3

Closed matentzn closed 1 year ago

matentzn commented 2 years ago

This is the initial draft for a better classification of semantic mapping relations. It is based on the SKOS classification of mapping relations, and extends it to allow for additional kinds of mapping relationships such as cross-species mappings and potentially other kinds of conflation relations (gene-reference, disease-phenotype protein, etc). It has often been noted that we are massively over-using skos:exactMatch for isomorphic concepts. A new vocabulary of mapping relations, conservatively evolved (to avoid proliferation), will not only allow us to cater for these use cases, but also avoid watering down exactMatch further. In order to group all kinds of isomorphic match properties (definition below) we introduce a new mapping relationship, semapv:isomorphicMatch, which groups relationships like skos:exactMatch / semapv:crossSpeciesExactMatch under one parent. semapv:isomorphicMatch is not meant to be used as a mapping predicate itself (at least not in the intended use case); it only serves to group other mapping predicates.

cmungall commented 2 years ago

Proposed skeleton of the mathematical mapping predicates:

skos:exactMatch should be 1:1, but there may be 1:1 matches that are not exact matches (or even in the skos hierarchy). E.g. cross-species exact/broad/narrow/related analog matches

matentzn commented 2 years ago

@cthoyt notes that, as it stands, semapv:crossSpeciesExactMatch is isomorphic and non-isomorphic at the same time. That is a bit of a weakness in this proposal, but maybe not. Not sure what's best here (in my eyes crossSpeciesExact is both isomorphic and skos:relatedMatch, which is non-isomorphic, but that could be considered very confusing).
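To make the "isomorphic and non-isomorphic at the same time" point concrete, here is a minimal sketch of the grouping as it is described in this thread. The `PARENTS` table and the `ancestors` helper are hypothetical names for illustration only; the table encodes the proposal as discussed here, not the published SKOS or semapv hierarchy.

```python
# Hypothetical parent table: only the predicate names come from this thread.
PARENTS = {
    "skos:exactMatch": {"semapv:isomorphicMatch"},
    # The contested case: grouped under both an isomorphic and a
    # non-isomorphic parent.
    "semapv:crossSpeciesExactMatch": {
        "semapv:isomorphicMatch",
        "skos:relatedMatch",
    },
    "skos:relatedMatch": set(),
    "semapv:isomorphicMatch": set(),
}


def ancestors(predicate):
    """Return every grouping predicate reachable from `predicate`."""
    seen = set()
    stack = [predicate]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Under these assumptions, `ancestors("semapv:crossSpeciesExactMatch")` contains both `semapv:isomorphicMatch` and `skos:relatedMatch`, which is exactly the double classification being questioned.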

sabrinatoro commented 2 years ago

@matentzn Could we look at concrete examples, please? There is probably already a document with these; if so, could you please share it? Thank you.

matentzn commented 2 years ago

There are no docs yet about this at all - what exactly would you like to see examples for? The cross species mappings are well described in the linked OMO issue, and the isomorphicMatch relationship should not actually be instantiated, so you can't really give a good example?

gouttegd commented 2 years ago

Here are some examples of how I would (tentatively) use those relations, the way I understand them, for my primary use case of mappings between Uberon/CL and FBbt (of note, I do indeed find it confusing that crossSpeciesExactMatch is both isomorphic and non-isomorphic).

I don’t have any example readily available of a narrow match. I don’t actually think there will ever be one in the case of FBbt-to-Uberon/CL mappings.

I note that there is another way to interpret the exact/broad/narrow distinctions: they could be said to apply to the taxonomic level at which we compare, rather than to the anatomical structure or cell type. For example, with that interpretation:

I don’t think this interpretation would be very useful. All mappings between Uberon/CL and any taxon-specific ontology such as FBbt would end up being broad matches, with no room left for nuance.

dosumis commented 1 year ago

I feel like these discussions are losing sight of the important aim of being able to generate OWL bridge files, e.g. between Uberon & FBbt; CL & FBbt. These have traditionally been interpreted as EquivalentTo X and in_taxon some Y. It's hard to see how we get from generic cross-species AP predicates to this, except by some form of hackery. One of the advantages of SSSOM is that we can use it to generate OWL. Would it be stretching the formalism too far to find a way to support compound predicates of EquivalentTo + Taxon ID that get translated to the desired OWL?
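The bridge pattern mentioned above (EquivalentTo X and in_taxon some Y) could be derived from cross-species mappings roughly as follows. This is only a sketch under stated assumptions, not an existing sssom-py feature: `bridge_axiom` and `bridge_axioms` are hypothetical helpers, the term CURIEs in the usage note are placeholders, and `RO:0002162` and `NCBITaxon:7227` are assumed here to stand for in_taxon and Drosophila melanogaster. The keys `subject_id`, `predicate_id` and `object_id` follow the SSSOM column names.

```python
def bridge_axiom(subject_id, object_id, taxon_id, in_taxon="RO:0002162"):
    """Render one EquivalentClasses axiom in OWL functional syntax (CURIEs)."""
    return (
        f"EquivalentClasses({subject_id} "
        f"ObjectIntersectionOf({object_id} "
        f"ObjectSomeValuesFrom({in_taxon} {taxon_id})))"
    )


def bridge_axioms(mappings, taxon_id,
                  predicate="semapv:crossSpeciesExactMatch"):
    """Translate only the mappings that use the given predicate."""
    return [
        bridge_axiom(m["subject_id"], m["object_id"], taxon_id)
        for m in mappings
        if m["predicate_id"] == predicate
    ]
```

For example, calling `bridge_axioms` on a list holding one `semapv:crossSpeciesExactMatch` row (placeholder subject `FBbt:0000001`, placeholder object `CL:0000001`) and one `skos:broadMatch` row yields a single axiom for the first row only. The predicate acts as the switch that decides which mappings become equivalence axioms, so the mapping set itself stays reusable for other purposes.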

gouttegd commented 1 year ago

> It's hard to see how we get from generic cross-species AP predicates to this, except by some form of hackery.

I was hoping that once an agreement has (finally) been reached as to what predicate to use, we could then add support (probably in sssom-py) for a “translation” mechanism to generate the actual OWL bridging axioms.

> Would it be stretching the formalism too far to find a way to support compound predicates of EquivalentTo + Taxon ID that get translated to the desired OWL?

I don’t know whether it would stretch the formalism. My concern with such an approach is that it would make it more difficult to re-use the mapping set for any use case other than generating the bridges, should anyone want to do that.

That being said, the use case we have right now is generating the bridges (I am not aware of anyone interested in using the FBbt/Uberon/CL mappings to do anything else), so I’d support any move going in that direction – if it reduces the usefulness and the re-usability of the SSSOM mapping set, that’s just too bad but I can live with it.

matentzn commented 1 year ago

Let's go ahead with this. @gouttegd whenever you get a chance, can you make a PR to add the relations? I would suggest creating a new file called semapv-properties.tsv, copied from https://github.com/mapping-commons/semantic-mapping-vocabulary/blob/main/semapv-terms.tsv, and adding all the relations suggested in this comment to it. Then simply extend the Makefile to compile the file into OWL and merge it into the main release file.

matentzn commented 1 year ago

This has nothing to do with the PR, but I believe that species-specific to species-neutral use cases should resort to the skos standard mapping vocabulary (skos:broadMatch instead of semapv:crossSpeciesBroadMatch).

gouttegd commented 1 year ago

You realise that “species-specific to species-neutral” is THE use case I have been requesting a specific relation for since the very beginning, right? It’s for the mappings between FBbt (Drosophila-specific) and Uberon/CL (species-neutral). I thought we agreed on that. I’ve stated since July last year (my comment above) that I would use the crossSpecies*Match for that use case, and nobody said anything against that.

My PR (#14) explicitly uses “species-specific to species-neutral” cases as examples for semapv:crossSpeciesExactMatch and semapv:crossSpeciesBroadMatch, and you have not commented on that in your review. So, which way is it, then?

matentzn commented 1 year ago

Let's at least decouple the PR from the debate on the best way to apply the relations. Both paths are justifiable. I would argue that a term that is truly taxon independent is a literal parent of their species specific counterparts, so there may not be a need to introduce special vocabulary for this. This is not the case for fly eye to human eye. Here we really need a special vocabulary.

Let's keep this a bit open for now. Both paths are possible, and I am happy to hear your arguments over a beer in Padua.

gouttegd commented 1 year ago

I don’t understand.

One of the main arguments against the introduction of new relations for cross-species mappings was the risk of “mapping predicate proliferation”, that is, creating a precedent that would trigger the creation of way too many mapping predicates.

And now you’re seemingly willing to go ahead with the creation of new mapping predicates before we even agree on what those predicates should be used for? So we could create the predicate and later finally decide that, well no, in the end we won’t use those, let’s use the pre-existing SKOS ones instead?

> I would argue that a term that is truly taxon independent is a literal parent of their species specific counterparts, so there may not be a need to introduce special vocabulary for this.

First, this may be true only for “exact” matches (e.g. FBbt’s Drosophila “muscle cell” to CL’s taxon-neutral “muscle cell”). It does not apply to not-so-clear-cut mappings (the ones for which I was planning to use semapv:crossSpeciesBroadMatch). For example, I will not state that FBbt’s “subperineurial glial sheath” is a subclass of Uberon’s taxon-neutral “blood brain barrier”. One benefit of the specific relations was precisely to add the possibility to create “nuanced” mappings, something that the current system (based on oboInOwl:hasDbXref) does not allow, and that would not be allowed under your proposal either.

Second, even if we decide that cross-species mappings should be limited to the exact cases (so, we do not try to represent “not-so-clear-cut” mappings), not using a specific relation for them would lead to the same issue that we currently have with oboInOwl:hasDbXref-based mappings. That is, the fact that a given mapping is a cross-species mapping is lost. All that is visible is the fact that a mapping is there; it’s left to the consumers to guess what kind of mapping it is, based on whatever contextual information is available (like, “oh, that’s a mapping between a FBbt term and a Uberon term? must be a cross-species mapping then”). I thought we agreed this was not great.

> Let's keep this a bit open for now.

Fine. But then FYI, I am considering switching the generation of the FBbt-to-Uberon/CL bridges to a (hopefully temporary) in-house, ad-hoc system that will bypass SSSOM completely. I can’t stand the current system anymore and I’ve been waiting for months to replace it with a generic, SSSOM-based mechanism ideally directly built-in within the ODK – something that was dependent on having agreed-upon relations to represent cross-species mappings. Clearly there is no agreement on how to represent those, and I am not sure I want to wait until there is one.

> I am happy to hear your arguments over a beer in Padua.

I think I’ve presented my arguments many times already. If you’re not convinced yet, not sure what else I can do — apart maybe from getting you so drunk that you’d accept anything. :p

matentzn commented 1 year ago

Alright, this is now done. Thanks for driving this, @gouttegd.