cidgoh / geem

Genomic Epidemiology Entity Mart
Creative Commons Attribution 4.0 International

Explore the pairing of units, data types and formats #2

Open Public-Health-Bioinformatics opened 6 years ago

Public-Health-Bioinformatics commented 6 years ago

A flexible data entry/reporting system would allow a given date or numeric datum to be entered in any of several units.

Currently, to enable GEEM to display different units for a specification (or form) field, we set them up in the ontology via the OBO Foundry 'has measurement unit label' relation:

[entity] 'has measurement unit label' max 1 (day or week or month or year)

A second subClassOf axiom defines the datatype using GenEpiO's 'has primitive data type', which associates a datum with an XML Schema numeric, date or other type, along with some range constraints (which are translated from OWL into a JSON representation):

'has primitive data type' exactly 1 xsd:nonNegativeInteger[< "130"^^xsd:nonNegativeInteger]
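As a rough illustration of how such an axiom might be applied to a field value once it has been translated into JSON, here is a minimal sketch. The JSON keys (`datatype`, `maxExclusive`) and the function `check_value` are hypothetical; they are not GEEM's actual output format.

```python
import json

# Hypothetical JSON rendering of the axiom:
#   'has primitive data type' exactly 1
#       xsd:nonNegativeInteger[< "130"^^xsd:nonNegativeInteger]
constraint_json = '{"datatype": "xsd:nonNegativeInteger", "maxExclusive": 130}'


def check_value(raw, constraint):
    """Return True if the raw string satisfies the datatype and range facet."""
    if constraint["datatype"] != "xsd:nonNegativeInteger":
        raise ValueError("datatype not handled in this sketch")
    # Only plain ASCII digit strings pass, so "-1", "1.5" and "" are rejected.
    if not (raw.isascii() and raw.isdigit()):
        return False
    value = int(raw)
    # Apply the exclusive upper bound if one was declared.
    if "maxExclusive" in constraint and value >= constraint["maxExclusive"]:
        return False
    return True


constraint = json.loads(constraint_json)
print(check_value("129", constraint), check_value("130", constraint))
```

Note that the constraint carries no unit of its own, which is exactly where the questions below come from.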

A first question is whether such a specification implies that any value must be stored in the most granular unit. This is problematic for time ranges because the units are not exact multiples of one another: 1 month is a little over 4 weeks, so for some scales this approach would require averaging the lengths of months in a year, accounting for leap years, etc.
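To make the month/week mismatch concrete, the following sketch converts between time units using average Gregorian lengths (365.2425 days per year, and one twelfth of that per month). The averaging choice and the names `DAYS_PER_UNIT`, `to_days` and `convert` are assumptions for illustration only, not something GEEM defines.

```python
from fractions import Fraction

# Average Gregorian year: 365.2425 days (97 leap years every 400 years).
DAYS_PER_YEAR = Fraction(3652425, 10000)

# Assumed conversion factors; "month" is an average, not a calendar month.
DAYS_PER_UNIT = {
    "day": Fraction(1),
    "week": Fraction(7),
    "month": DAYS_PER_YEAR / 12,
    "year": DAYS_PER_YEAR,
}


def to_days(value, unit):
    """Express a value in days, the most granular unit in this sketch."""
    return value * DAYS_PER_UNIT[unit]


def convert(value, from_unit, to_unit):
    """Convert between units by going through days."""
    return to_days(value, from_unit) / DAYS_PER_UNIT[to_unit]


weeks_per_month = float(convert(1, "month", "week"))  # about 4.348
stored_days = round(to_days(1, "month"))              # 30 if stored as whole days
months_read_back = float(convert(stored_days, "day", "month"))  # slightly below 1
```

Storing "1 month" as a whole number of days and reading it back gives slightly less than one month, so a round trip through the most granular unit is lossy unless the originally entered unit is kept alongside the value.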

A second question is whether the range constraint should always be given in the smallest possible unit, so that it can be calculated automatically for the other scales.
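One way this could work is sketched below: the upper bound is declared once in days and the bound for each coarser unit is derived by dividing and rounding down. The bound of 47482 days (roughly 130 years), the average conversion factors, and the name `derived_max` are all hypothetical, and the bound is treated as inclusive here for simplicity.

```python
from fractions import Fraction
from math import floor

# Assumed average lengths in days; "month" and "year" are Gregorian averages.
DAYS_PER_UNIT = {
    "day": Fraction(1),
    "week": Fraction(7),
    "month": Fraction(3652425, 120000),  # 365.2425 / 12
    "year": Fraction(3652425, 10000),    # 365.2425
}

# Hypothetical inclusive upper bound, declared only in the smallest unit.
MAX_DAYS = 47482


def derived_max(max_in_days, unit):
    """Largest whole number of `unit` that does not exceed max_in_days."""
    return floor(Fraction(max_in_days) / DAYS_PER_UNIT[unit])


bounds = {unit: derived_max(MAX_DAYS, unit) for unit in DAYS_PER_UNIT}
print(bounds)
```

Rounding down keeps every derived bound inside the declared one, at the cost of the coarser units being marginally stricter than the day-level constraint.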

A test case is "Drug MIC", which has two units of measure: "mg/L" (with its variant "ug/mL", which however entails a different precision) and "mm" (millimetre).

'has measurement unit label' exactly 1 (millimeter or 'milligram per liter' or 'microgram per milliliter')

One possibility is to allow each datum to be accompanied by a 'has value specification' which contains both unit and numeric/string datatype constraints. This case may highlight an underlying difference between the types of measurement (diameter vs solution density) that needs to be separated out for easier data analysis.
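A minimal sketch of what such a value specification might look like as a data structure follows. The class `ValueSpec`, its fields, the grouping keys, and every numeric bound are placeholders invented for illustration; they are not taken from GenEpiO or OBI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueSpec:
    """Pairs one unit with its own datatype, range and precision (hypothetical)."""
    unit: str
    datatype: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    decimals: int = 0

    def accepts(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        # Reject values carrying more decimal places than the spec allows.
        return round(value, self.decimals) == value


# "Drug MIC" split by kind of measurement; all bounds are placeholder values.
DRUG_MIC_SPECS = {
    "concentration": [
        ValueSpec("milligram per liter", "xsd:decimal", 0, 1024, 3),
        ValueSpec("microgram per milliliter", "xsd:decimal", 0, 1024, 3),
    ],
    "diameter": [
        ValueSpec("millimeter", "xsd:nonNegativeInteger", 0, 100, 0),
    ],
}


def spec_for(unit):
    """Return (measurement kind, spec) for a unit label, or None if unknown."""
    for kind, specs in DRUG_MIC_SPECS.items():
        for spec in specs:
            if spec.unit == unit:
                return kind, spec
    return None


kind, mm_spec = spec_for("millimeter")
```

Grouping the specifications by kind of measurement means the chosen unit also tells an analyst which quantity was recorded, rather than leaving diameter and concentration values mixed in one field.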

ddooley commented 4 years ago

Circling back to this now, in light of the new OBI value specification data structure. The tricky part is how to specify numeric ranges when the user can select different units. Probably have to key the numeric range to just ONE unit, and calculate the range and precision for the other units based on that.
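A sketch of that idea, under stated assumptions: the range and step are declared once in a key unit, and the range and number of decimal places for any other selectable unit are derived from a conversion factor. The key unit, the bounds, the step, and the extra "g/L" unit are hypothetical and appear only to show a factor other than 1.

```python
from decimal import Decimal

# Hypothetical range and step, declared only in the key unit.
KEY_UNIT = "mg/L"
KEY_RANGE = {
    "min": Decimal("0.001"),
    "max": Decimal("1024"),
    "step": Decimal("0.001"),
}

# How many key units one of each selectable unit represents.
# 1 ug/mL equals 1 mg/L; "g/L" is included purely for illustration.
KEY_UNITS_PER_UNIT = {
    "mg/L": Decimal(1),
    "ug/mL": Decimal(1),
    "g/L": Decimal(1000),
}


def range_for(unit):
    """Scale the key-unit min, max and step into the selected unit."""
    factor = KEY_UNITS_PER_UNIT[unit]
    return {name: value / factor for name, value in KEY_RANGE.items()}


def decimals(step):
    """Number of decimal places needed to represent the given step."""
    return max(0, -step.normalize().as_tuple().exponent)


grams = range_for("g/L")
print(grams["max"], decimals(grams["step"]))
```

Using `Decimal` rather than floats keeps the derived step exact, so the displayed precision follows directly from the key unit's step.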