Open tw-osthus opened 6 months ago
@ElisaKendall What do you think?
@tw-osthus Hmmm ... I have a meeting with the Commons / MVF working group in a couple of hours - I could take a look first and then talk with them about relaxing some of the constraints, especially the hasURI, as one option. Would changing that from exact to someValuesFrom work? The mapping challenge is a new use case for MVF, and I think other vocabularies will have similar issues. But in thinking about it, perhaps owl:sameAs is the wrong relationship. It should probably be a mapping from one vocabulary to another. Then the unique URI would be correct.
I need to think about term to name vs. term to term -- that would be ok using the current approach to controlled vocabularies, but the semantics would need to be correct. Your judgement on this is likely right, but we should walk through an example next week.
@mereolog @tw-osthus As of pull request #584, these three restrictions are existential (someValuesFrom) rather than exactly 1. Some of these kinds of errors may still occur. In most cases, the right approach for aligning individuals will be a mapping rather than the use of owl:sameAs.
mvf-trm:Term has three string data properties that are restricted to a cardinality of exactly 1:
- mvf:hasTextualName
- mvf:hasURI
- cmns-dsg:hasDescription
Currently SPOR controlled vocabulary term, SPOR term name, SPOR list name, SPOR names, and SNOMED CT term (but not MedDRA term, see github-580) are subclasses of mvf-trm:Term.
We do not define the required data properties, so the hygiene tests will complain that the terms are incompletely defined. I had used mvf:hasTextualName instead of its super property cmns-txt:hasTextValue to refer to the label. When I do that and also make MedDRA term a subclass of mvf-trm:Term, our examples become inconsistent: the cardinality constraint is violated because there are owl:sameAs assertions between SPOR and MedDRA terms where SPOR just republishes MedDRA.
For example, in Amlodipine, Stable Angina Pectoris is referenced as
Here the SPOR c.t. term is mapped to the MedDRA LLT. It is more correct to map the SPOR term name to the MedDRA LLT.
If CTCAE terms, which are MedDRA terms and are already a subclass of mvf-trm:Term, were also republished as SPOR terms, this would have shown up earlier.
Either owl:sameAs is too strong for the mapping here and we have to use a weaker mapping such as skos:exactMatch, or we have to reconsider subclassing from the MVF term.
Besides, we have to assert the other required data properties as well if we want to avoid hygiene test warnings. Keep in mind that the description is language-dependent in SPOR, and that there is no single, globally unique official URI to use for mvf:hasURI.
In OWL, because of the open world assumption, we are OK as long as we do NOT assert a label (as has been the case in our example so far). Once one label has been asserted, any additional, different label, even one introduced indirectly via owl:sameAs, makes the model inconsistent.
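To make the failure mode concrete, here is a minimal Python sketch. It is not a reasoner: it only imitates the one inference that matters here, namely that owl:sameAs merges individuals, so their asserted labels pile up on one individual and an exactly 1 restriction on mvf:hasTextualName is then violated. The identifiers `spor:term_1` and `meddra:llt_1` and both label strings are hypothetical, not taken from our example files.

```python
from collections import defaultdict


def merge_same_as(same_as_pairs):
    """Return a find() function that maps each term to the representative
    of its owl:sameAs equivalence class (simple union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)
    return find


def cardinality_clashes(labels, same_as_pairs):
    """labels: iterable of (term, literal) pairs for one data property.
    Returns {representative: literals} for every merged individual that
    ends up with more than one distinct literal, i.e. where an
    'exactly 1' restriction on that property would be violated."""
    find = merge_same_as(same_as_pairs)
    values = defaultdict(set)
    for term, literal in labels:
        values[find(term)].add(literal)
    return {rep: lits for rep, lits in values.items() if len(lits) > 1}


# Hypothetical data: literals are (lexical form, language tag) tuples.
labels = [
    ("spor:term_1", ("Stable angina pectoris", "en")),
    ("meddra:llt_1", ("Angina pectoris stable", None)),
]
same_as = [("spor:term_1", "meddra:llt_1")]

clashes = cardinality_clashes(labels, same_as)
```

Without the owl:sameAs pair the function reports no clash, because each individual has exactly one asserted label. With the pair, the merged individual carries two different literals. Individuals with no asserted label are never reported, which mirrors the open world behaviour described above.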
This is not a stable ontology design. We cannot weaken the MVF constraints, so the consequence is that we should not use owl:sameAs to map between technically identical terms.
In the SPOR case, we may map the SPOR term name (English) to the MedDRA LLT with owl:sameAs, but we have to ensure that the literal string we use for mvf:hasTextualName is 100% identical.
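What "100% identical" would have to mean in practice can be sketched as a pre-check before asserting owl:sameAs. This is only an illustration under assumptions: the `Literal` type, the `same_literal` helper and the sample strings are hypothetical, and the comparison is deliberately strict (lexical form and language tag must match character by character).

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF string literal."""
    lexical: str
    lang: Optional[str] = None  # language tag such as "en", or None


def same_literal(a: Literal, b: Literal) -> bool:
    """Strict check: both the lexical form and the language tag must be
    identical. Any difference in case, whitespace, or tagging counts as
    a different literal."""
    return a.lexical == b.lexical and a.lang == b.lang


# Hypothetical labels for a SPOR term name and a MedDRA LLT.
spor_label = Literal("Stable angina pectoris", "en")
meddra_label_ok = Literal("Stable angina pectoris", "en")
meddra_label_case = Literal("Stable Angina Pectoris", "en")
meddra_label_untagged = Literal("Stable angina pectoris")

safe_to_assert_same_as = same_literal(spor_label, meddra_label_ok)
```

Only the first candidate passes. The differently capitalised label and the label without a language tag would each count as a second value for mvf:hasTextualName once the two individuals are merged.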