RDFBones / RDFBones-O

An RDF ontology for research data from physical anthropology and related fields of expertise.

Reuse of category labels in separate ontology extensions #143

Open cuboideum opened 3 years ago

cuboideum commented 3 years ago

Introduction

Categorical measurement data need a defined set of states that they can take. In OBI, these states are defined by labels, i.e. instances of the class 'categorical label' (obo:OBI_0000976). Restrictions on the measurement datum class define which label instances can be used. For example, the measurement datum class 'handedness categorical measurement datum' (obo:OBI_0000976) has the following restriction:

'has category label' some ({'right handed', ambidexterous, 'left handed'})

'right handed', 'ambidexterous' and 'left handed' are all instances of class 'categorical label'.
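To make the restriction concrete, here is a minimal Python sketch of the check it implies: a handedness datum qualifies if it carries at least one label from the enumerated set. The names `HANDEDNESS_LABELS` and `satisfies_restriction` are illustrative assumptions and not part of OBI. The set-membership test is a closed-world simplification; an OWL reasoner working under the open-world assumption would treat a missing label differently.

```python
# Hypothetical closed-world approximation of the OBI restriction
#   'has category label' some ({'right handed', ambidexterous, 'left handed'})
# The names below are illustrative and not taken from OBI.

HANDEDNESS_LABELS = frozenset({"right handed", "ambidexterous", "left handed"})


def satisfies_restriction(labels, allowed=HANDEDNESS_LABELS):
    """Return True if at least one of the datum's labels is in the allowed set.

    This mirrors the existential ('some') quantifier: one matching label is
    enough, and additional labels do not invalidate the datum.
    """
    return any(label in allowed for label in labels)


print(satisfies_restriction(["left handed"]))  # label from the enumeration
print(satisfies_restriction(["proximal"]))     # label outside the enumeration
```

Because the restriction is existential, a datum with one accepted label and one unrelated label would still pass this check.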

Problem Statement

Data standards in biological anthropology frequently contain categorical measurement data. Typically, their states ('labels') are codes consisting of a numerical ID, to be stored in database systems, and a label characterising their meaning. As an example, here are two coding schemes to localise observations on bone organs:

[Figure: RDFBones-ReuseOfCategoryLabels-Initial]

The two measurement data have different ranges of labels. The ranges overlap in four cases, in each of which a pair of labels has exactly the same meaning:

[Figure: RDFBones-ReuseOfCategoryLabels-sameAs]

Going by the OBI scheme, the (later) Phaleron measurement datum would reuse the label instances created by the (earlier) Standards measurement datum and add the other labels it requires to the overall pool of categorical labels. Both measurement data would have class restrictions defining which labels are accepted.

This, however, does not work with osteological categories defined as codes. Although the equivalent pairs have the same meaning, they differ in the numerical codes assigned to them. As the ontologies might be needed to decode existing data in numerical form, both assignments, label and numerical code, need to be unambiguous. Therefore, the labels are not exactly the same and cannot be marked as such using owl:sameAs as suggested above.
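The ambiguity can be illustrated with a small Python sketch. Declaring two individuals owl:sameAs means every statement about one also holds for the other, so the merged individual ends up with both numerical codes. The identifiers, the meaning "proximal end", and the codes 1 and 4 are hypothetical placeholders; the actual labels and codes appear only in the figures.

```python
# Hypothetical labels and codes; they do not reproduce the real schemes.
standards_label = {"meaning": "proximal end", "code": 1}
phaleron_label = {"meaning": "proximal end", "code": 4}


def merge_same_as(left, right):
    """Merge two individuals declared owl:sameAs.

    Every property value of either individual becomes a value of the
    single merged individual, so values are collected in sets.
    """
    merged = {}
    for individual in (left, right):
        for prop, value in individual.items():
            merged.setdefault(prop, set()).add(value)
    return merged


merged = merge_same_as(standards_label, phaleron_label)

# Properties with more than one value can no longer be decoded unambiguously.
ambiguous = {prop for prop, values in merged.items() if len(values) > 1}

print(merged["code"])
print(ambiguous)
```

After the merge, the meaning is still single-valued, but the code is not, which is exactly the property that must stay unambiguous for decoding existing data.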

Discussion

The problem was discussed at the work group meeting on 03 August 2021. Reusing labels as prescribed by OBI does not seem feasible. As a consequence, it was decided to model all sets of measurement data separately for now. Compatibility between labels should be indicated with a property other than owl:sameAs:

[Figure: RDFBones-ReuseOfCategoryLabels-Solution]

Which property to use, or whether a new one needs to be introduced, will be decided at a later stage.
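Whichever property is eventually chosen, the separate modelling can be sketched in Python as two independent code tables plus a symmetric compatibility link. Everything here is an assumption made for illustration: the identifiers, the codes, and the `COMPATIBLE` relation, which merely stands in for the yet-undecided property and is not a proposal for it.

```python
# All identifiers and codes are hypothetical placeholders.
STANDARDS = {1: "std:proximal_end", 2: "std:shaft"}
PHALERON = {4: "pha:proximal_end", 5: "pha:joint_surface"}

# Symmetric compatibility link standing in for the undecided property.
# Unlike owl:sameAs, it does not merge the two label individuals.
COMPATIBLE = {("std:proximal_end", "pha:proximal_end")}


def decode(scheme, code):
    """Resolve a numerical code to its label within a single scheme."""
    return scheme[code]  # raises KeyError for a code the scheme lacks


def compatible_label(label):
    """Return the label linked to the given one, or None if there is none."""
    for first, second in COMPATIBLE:
        if label == first:
            return second
        if label == second:
            return first
    return None


def translate_code(code, source, target):
    """Translate a code from the source scheme into the target scheme.

    Returns None when the source label has no compatible target label.
    """
    counterpart = compatible_label(decode(source, code))
    for target_code, target_label in target.items():
        if target_label == counterpart:
            return target_code
    return None


print(translate_code(1, STANDARDS, PHALERON))
print(translate_code(2, STANDARDS, PHALERON))
```

Each scheme keeps its own one-to-one assignment of code to label, so existing numerical data can still be decoded, while the link allows comparison across schemes where meanings coincide.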