mapping-commons / semantic-mapping-vocabulary

https://mapping-commons.github.io/semantic-mapping-vocabulary/
Add semapv:MappingInversion #22

Closed matentzn closed 1 year ago

matentzn commented 1 year ago

Analogous to semapv:MappingChaining we introduce "mapping flipping-based matching process", which is defined as:

A matching process based on the reversing or flipping of the subject with the object of a mapping in accordance with the semantics of the mapping predicate.

matentzn commented 1 year ago

In #23, @cthoyt asked:

> Should there be different justifications for reflexive inversions (A skos:exactMatch B inverts to B skos:exactMatch A) and non-reflexive inversions (A skos:broadMatch B inverts to B skos:narrowMatch A)?

https://github.com/mapping-commons/semantic-mapping-vocabulary/pull/23#issuecomment-1451561581

My personal sense is that we should keep the number of different justifications low (that is, favor simplicity over expressiveness). We could, however, introduce subclasses of the current, more general class if we decide they are needed.
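To make the two kinds of inversion in the quoted question concrete, here is a minimal sketch. The lookup table and the function names are illustrative assumptions and are not part of semapv or SSSOM; the inverse relationships themselves follow the SKOS definitions of the mapping properties.

```python
# Illustrative table of inverse predicates (an assumption for this sketch,
# not something defined by semapv or SSSOM).
INVERSE_PREDICATE = {
    "skos:exactMatch": "skos:exactMatch",      # symmetric
    "skos:closeMatch": "skos:closeMatch",      # symmetric
    "skos:relatedMatch": "skos:relatedMatch",  # symmetric
    "skos:broadMatch": "skos:narrowMatch",     # inverse pair
    "skos:narrowMatch": "skos:broadMatch",     # inverse pair
}


def invert_triple(subject, predicate, obj):
    """Swap subject and object and replace the predicate by its inverse."""
    if predicate not in INVERSE_PREDICATE:
        raise ValueError(f"no known inverse for {predicate}")
    return obj, INVERSE_PREDICATE[predicate], subject


def is_symmetric(predicate):
    """True if inverting the mapping leaves the predicate unchanged."""
    return INVERSE_PREDICATE.get(predicate) == predicate


same_predicate = invert_triple("A", "skos:exactMatch", "B")
changed_predicate = invert_triple("A", "skos:broadMatch", "B")
```

In the first case only the positions of A and B change; in the second the predicate changes as well, which is the distinction the question is about.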

gouttegd commented 1 year ago

So is the idea that this new semapv:MappingInversion should be used as the "mapping justification" for any mapping that has been obtained by automatic inversion?

That is, if we have, say, a mapping where Subject is a skos:narrowMatch of Object and that is justified by a semapv:ManualCuration, the "inverse" mapping (Object is a skos:broadMatch of Subject) should have a justification of semapv:MappingInversion?

If so, I am concerned by the loss of information of the process, unless there is a way to add to the "reverse" mapping some kind of link to the "original" mapping with the original justification.

matentzn commented 1 year ago

@gouttegd we thought about this and in the inversion process of our toolkit we:

  1. Invert all metadata in the justification
  2. Set a "mapping_source" property that links to the mapping set the pre-inversion mapping comes from

Unfortunately the SSSOM model does not currently allow "mapping identifiers", as they add a layer of complex identifier management to the otherwise simple model. This inability to pinpoint exactly which mapping was flipped is annoying, see https://github.com/mapping-commons/sssom/issues/231, but I think stating clearly where the flipped mapping came from is an OK compromise. What do you think?
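As a rough illustration of the two steps listed above, the following sketch inverts a single mapping held as a plain dictionary. It is not the toolkit's actual code: the function name, the example identifiers, the example URL, and the reading of "invert all metadata" as swapping `subject_*` and `object_*` fields are all assumptions made for this example.

```python
# Hypothetical helper, not the actual toolkit implementation.
INVERSE_PREDICATE = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def invert_mapping(mapping, source_set_id):
    """Return an inverted copy of `mapping`, recording where it came from."""
    inverted = {}
    for key, value in mapping.items():
        # Step 1 (assumed reading): swap subject_* and object_* metadata.
        if key.startswith("subject_"):
            inverted["object_" + key[len("subject_"):]] = value
        elif key.startswith("object_"):
            inverted["subject_" + key[len("object_"):]] = value
        else:
            inverted[key] = value
    inverted["predicate_id"] = INVERSE_PREDICATE[mapping["predicate_id"]]
    inverted["mapping_justification"] = "semapv:MappingInversion"
    # Step 2: link back to the mapping set the original mapping belongs to.
    inverted["mapping_source"] = source_set_id
    return inverted


original = {
    "subject_id": "EX:a",           # made-up identifiers
    "subject_label": "term a",
    "predicate_id": "skos:narrowMatch",
    "object_id": "EX:b",
    "object_label": "term b",
    "mapping_justification": "semapv:ManualCuration",
}
flipped = invert_mapping(original, "https://example.org/sets/original")
```

Note that in this sketch the original justification (`semapv:ManualCuration`) is overwritten, so it can only be recovered by following `mapping_source` back to the original set.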

gouttegd commented 1 year ago

This mapping_source property would take the form of a URI pointing to the original mapping set?

OK, so let's imagine the following: Let's say several taxon-specific ontologies provide mapping sets with Uberon/CL (as FBbt is currently doing). They all follow the same convention of using their own terms as "subjects" and the Uberon/CL terms as "objects" (again, as FBbt is currently doing). In Uberon, we collect all those mappings and merge them into a single mapping set. Then, we invert that mapping set, because we want the Uberon terms to be the subjects. What should the mapping_source property be set to?

It can't be set to the original mapping sets, because there is more than one. And the merged mapping set is merely an intermediate in the construction of the final, inverted set; it is not intended to be published anywhere (so there would be no public URI for it).

I think changing the mapping justification to indicate the mappings have been inverted is a needless complication. Inversion does not change the meaning of the mappings, so I think it's more important to know the justification for the original mapping (which can be the same for the inverted mapping, since the inverted mapping is merely a different way of stating the same thing) than to know that the mapping had been inverted.

matentzn commented 1 year ago

I think the merge->invert situation is a bit of a corner case. Ideally, you can follow the "derived from" breadcrumbs to understand mapping set sources, and the mapping_source in this case would be the original mapping set (the species-specific part).

> It can't be set to the original mapping sets, because there is more than one.

Each individual mapping + justification comes from a specific mapping set. That would be documented. The merged set is documented only at the mapping set level, not at the mapping level.
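A small sketch of the merge-then-invert case may help here. It assumes mappings are plain dictionaries and that each input set has its own URI; the set URIs, identifiers, and function names are made up for illustration. The sketch leaves the justification untouched, since whether to change it is a separate question.

```python
INVERSE_PREDICATE = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def merge_sets(mapping_sets):
    """Merge {set URI: [mappings]} into one list, stamping each mapping
    with the set it came from unless it already carries a mapping_source."""
    merged = []
    for set_id, mappings in mapping_sets.items():
        for m in mappings:
            merged.append({**m, "mapping_source": m.get("mapping_source", set_id)})
    return merged


def flip(m):
    """Swap subject and object; every other field, including
    mapping_source, is carried over unchanged."""
    return {
        **m,
        "subject_id": m["object_id"],
        "object_id": m["subject_id"],
        "predicate_id": INVERSE_PREDICATE[m["predicate_id"]],
    }


# Two hypothetical taxon-specific sets, each mapping its own terms to a
# shared ontology.
sets = {
    "https://example.org/sets/fly": [
        {"subject_id": "EXFLY:1", "predicate_id": "skos:narrowMatch",
         "object_id": "EXUB:1", "mapping_justification": "semapv:ManualCuration"},
    ],
    "https://example.org/sets/fish": [
        {"subject_id": "EXFISH:7", "predicate_id": "skos:exactMatch",
         "object_id": "EXUB:1", "mapping_justification": "semapv:ManualCuration"},
    ],
}
inverted = [flip(m) for m in merge_sets(sets)]
```

Because the source is recorded per mapping, the unpublished merged set never needs a URI of its own: each inverted mapping still points at the taxon-specific set it originated from.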

> I think changing the mapping justification to indicate the mappings have been inverted is a needless complication.

This is only partially true: it holds in a world in which we assume that the semantics of the mapping property are perfect. All that said, I think we do not need to decide here whether a flip has to be documented using a specific justification. I think a mapping provider may choose to do exactly as you say and simply preserve the justification as it was, applying a flip based on the standard semantics. I would say we do not need to be prescriptive here: if someone wants to document a flip, they should be able to do so, but on a case-by-case basis, we may decide not to!

gouttegd commented 1 year ago

> I think the merge-invert situation is a bit of a corner case.

Someone's corner case is someone else's normal case. :) And how a standard deals with corner cases is a good indicator of that standard's robustness and usefulness.

> and the mapping_source in this case would be the original mapping set (the species specific part)

Right, my mistake here, I thought mapping_source was a property of the mapping set, I didn't realise it was a property of individual mappings.

> I think a mapping provider may choose to do exactly as you say, and simply preserve the justification as it was, applying a flip based on the standard semantics.

OK, good for me.

> I would say we do not need to be prescriptive here.

For the record, I think the opposite. :) I believe we should be prescriptive.

Letting people decide for themselves the meaning they assign to mapping properties (for example, to decide whether properties can be inverted without any loss of information -- and so whether mappings can be inverted without a need to change the justification) is IMHO a bad idea. In my experience (admittedly in a completely different field), giving too much flexibility in how a standard should be interpreted is a great way to reduce a standard's usefulness, in a very pernicious manner: superficially, it can lead to more people using the standard (giving an impression of success), but always with a slightly different meaning, so that in the end you do not have the interoperability you were looking for.

Standard mapping predicates (e.g. the SKOS ones) have a pretty well defined meaning, and people should be encouraged to only use them with their intended meaning. If they have a different meaning in mind, they should find more suitable predicates (creating ad-hoc predicates if needed, or asking for such predicates to be created).

matentzn commented 1 year ago

> I would say we do not need to be prescriptive here.

What I meant was: we do not have to be prescriptive about which justification is recorded. We should 100% be prescriptive in how the mapping predicates are interpreted!