Open cmungall opened 2 years ago
Great issue. If we allow the semantics to be specified on a mapping-by-mapping basis, we will hamper integration because of the potential for encoding conflicting semantics. I would suggest we develop a lightweight resource which obtains its semantics directly from RO, and records those for owl, rdfs and skos as they are permanent.
This would also help to build a simple resource of types: right now, we need to guess whether something is an AP (annotation property) or an OP (object property). Is there any way we can do that in LinkML, or should we define this as a separate document?
I think use case 1 is already covered by SSSOM: because it distinguishes subject from object, a developer can retrieve only the mappings where A is the subject.
I believe use cases 2 and 3 must ultimately rely on what is declared/formalized by the vocabulary defining the property. I would expect an external person getting an SSSOM file to resolve the predicate URI and see the definition. I would also find it useful to have a documentation page on the SSSOM website providing a reminder of the symmetry and transitivity characteristics of the properties. This could also be a place to write sentences like: "We recommend using this predicate when ...". Such a table would also be a good spot to recall the domain and range of the predicates (when they exist), so that a developer picking one up knows everything. Whether developers will explicitly take care of all these aspects is another story, perhaps beyond SSSOM's current objective.
Some parts of this are addressed by subject_type, predicate_type and object_type (optional parameters to say if an entity is a class, an annotation property etc). This allows us to make some nice assumptions during sssom TSV processing in cases where it is not straightforward to look up stuff (skos, owl spec).
Another part of this could be addressed by #79 - but if we do, it will be fairly minimal. Like @jonquet I favour "looking up" semantics, but not sure how practical that is in all cases.
the sssom schema officially allows predicate_id to be any CURIE, although there is a subset we recommend. Semantics of those predicates may be declared externally; for example, in
use case 1:
an application developer who wants to know what to display in search results on a search for A, where they have mappings A R1 B, and B R2 A, and they want to show "from A's perspective"
use case 2:
an application developer wants to know if two mappings A R1 B and B R2 A (with identical additional metadata) are redundant
use case 3:
an application developer has A R1 B, B R2 C, and wants to compute the relationship between A and C
in all cases, this could be approached by hardcoding but this is not ideal
A lightweight suggestion is to publish an additional yaml or owl file alongside the standard for application developers to use; it would contain the most common predicates and include basic logical characteristics such as symmetry, transitivity, inverses, chains. The usual caveats apply about these being for abox level inference (ie if a mapping is interpreted as all-some then the inverse inference doesn't hold). The file should be seen as a helper rather than a formal reasoning artefact
alternatively we could allow semantics to be specified redundantly on a mapping by mapping basis.
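To make the helper-file idea concrete, here is a minimal Python sketch of how an application might consume such a table for the three use cases. Everything in it is an assumption for illustration: the PredicateInfo structure, the PREDICATES table and the function names are hypothetical and not part of SSSOM or any existing library. The characteristics listed follow the SKOS, OWL and RDFS definitions; property chains are left out, and the abox-level caveat applies throughout.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# A mapping reduced to (subject_id, predicate_id, object_id); other metadata is ignored here.
Mapping = Tuple[str, str, str]


@dataclass(frozen=True)
class PredicateInfo:
    """Hypothetical record of the logical characteristics of one predicate."""
    symmetric: bool = False
    transitive: bool = False
    inverse: Optional[str] = None


# Hypothetical contents of the helper file, hardcoded for the sketch.
PREDICATES: Dict[str, PredicateInfo] = {
    "skos:exactMatch": PredicateInfo(symmetric=True, transitive=True),
    "skos:closeMatch": PredicateInfo(symmetric=True),
    "skos:broadMatch": PredicateInfo(inverse="skos:narrowMatch"),
    "skos:narrowMatch": PredicateInfo(inverse="skos:broadMatch"),
    "owl:equivalentClass": PredicateInfo(symmetric=True, transitive=True),
    "rdfs:subClassOf": PredicateInfo(transitive=True),
}


def flip(mapping: Mapping) -> Optional[Mapping]:
    """Return the mapping read in the other direction, or None if that is unknown."""
    subject, predicate, obj = mapping
    info = PREDICATES.get(predicate)
    if info is None:
        return None
    if info.symmetric:
        return (obj, predicate, subject)
    if info.inverse is not None:
        return (obj, info.inverse, subject)
    return None


def from_perspective(mapping: Mapping, entity: str) -> Optional[Mapping]:
    """Use case 1: express a mapping with `entity` as the subject, if possible."""
    if mapping[0] == entity:
        return mapping
    if mapping[2] == entity:
        return flip(mapping)
    return None


def is_redundant(m1: Mapping, m2: Mapping) -> bool:
    """Use case 2: True if m2 is just m1 read in the other direction."""
    flipped = flip(m1)
    return flipped is not None and flipped == m2


def compose(m1: Mapping, m2: Mapping) -> Optional[Mapping]:
    """Use case 3: infer A R C from A R B and B R C when R is transitive."""
    s1, p1, o1 = m1
    s2, p2, o2 = m2
    if o1 != s2 or p1 != p2:
        return None
    info = PREDICATES.get(p1)
    if info is not None and info.transitive:
        return (s1, p1, o2)
    return None
```

With this table, from_perspective(("B", "skos:narrowMatch", "A"), "A") would give ("A", "skos:broadMatch", "B"), while two skos:closeMatch mappings would not compose because closeMatch is not transitive. Predicates absent from the table yield None rather than a guess, which keeps the file a helper and not a reasoner.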