bio-ontology-research-group / unit-ontology

An ontology of units of measurements
Creative Commons Attribution 4.0 International

Refactor UO using ODK #34

Closed kaiiam closed 1 year ago

kaiiam commented 4 years ago

As discussed in #33 and in email correspondence with @leechuck @reality and @cmungall, UO could benefit from being restructured to a standard OBO directory structure and makefile workflow by running the ontology development kit to generate the missing files and merging them in. I will try to get to this in the next couple of weeks as a pull request.

@reality I don't want to disrupt the existing groovy workflow so once I've got a branch with the new files we'll have to make sure that gets in there, perhaps as hooks in the makefile.

kaiiam commented 3 years ago

@reality @leechuck see my proposal for revamping UO: 1) presentation, and 2) accompanying github UO_revamp repo.

leechuck commented 3 years ago

I like the proposal and think it will improve overall quality and interoperability of UO. The somewhat non-standard processing based on the groovy scripts would be good to keep as we do not currently know who is using which of the UO versions. @reality what do you think?

kaiiam commented 3 years ago

Thanks @leechuck

> The somewhat non-standard processing based on the groovy scripts would be good to keep as we do not currently know who is using which of the UO versions.

I think this is fine for maintaining the punning and additional files if that's desired. I don't think that we should use it to generate design patterns, e.g. the EQ axioms; I'd propose that be done in the python/robot workflow. I'm also unclear whether it's better to have terms like liter based unit with prefixed subclasses such as milliliter, or just to have liter be the superclass.

I've been experimenting with some potential design patterns that use EQs to get more of the term placement from inference. I'll make some slides to demonstrate it. Perhaps at some point we could schedule a meeting to discuss all this?

kaiiam commented 3 years ago

Hey @leechuck @reality @cmungall and @jamesaoverton as promised see this presentation about some proposed EQ patterns to help infer auto-generated combinatorial classes. Please let me know what you all think. @dr-shorthair if you wouldn't mind I'd love to get your comments as well.

My thinking is that if/once we were to settle on some EQ design pattern like this, then we can do the following:

1) Cross the big list of named units (liter, metre, ampere, etc.) with the metric prefixes to generate all prefix-base pairings: ml, ul, mA, uA, etc.

2) Program as many of the top levels as possible with EQ axioms e.g. molar volume unit = EQ: unit and 'has base unit' some (volume unit)/(mole)

3) Create a python script that, given input pairing instructions (e.g. cubic centimeter and per mole), compositionally builds all the columns for that row of a robot template, including a programmatic EQ such as the one proposed in the above slide deck. Then we reason the whole thing and let the unit combo terms get inferred to their appropriate superclass. Building such a workflow, with a script to auto-generate robot template inputs followed by reasoning, would allow newly requested combinatorial terms that fit one of the top level patterns to get inferred to the right place. This way we don't have to assert combos manually; instead we infer them.

4) Finally we could bring in the QUDT/OM combinatorial terms by text mining them into strings resembling the input for the script described in 3). Then set the script loose to generate those terms. Finally run my existing UO to QUDT/OM mapping scripts with all the new UO combo terms (mined from QUDT/OM) and auto map it all.
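A minimal sketch of steps 1 and 3, under stated assumptions: the unit and prefix lists are tiny illustrative subsets, the term ID is a placeholder, the function names are hypothetical, and the EQ string simply copies the pseudo-syntax from step 2 rather than any settled UO pattern. The second header line uses the robot template strings `ID`, `LABEL` and `EC %`.

```python
import csv
import io
from itertools import product

# Illustrative subsets only (assumption): the real lists would be much longer.
NAMED_UNITS = {"liter": "l", "metre": "m", "ampere": "A"}
PREFIXES = {"milli": "m", "micro": "u"}


def prefixed_units(named_units, prefixes):
    """Step 1: cross every named unit with every metric prefix."""
    return [
        {"label": prefix + unit, "symbol": prefix_symbol + unit_symbol, "base": unit}
        for (unit, unit_symbol), (prefix, prefix_symbol) in product(
            named_units.items(), prefixes.items()
        )
    ]


def combo_row(term_id, numerator, denominator):
    """Step 3: build one template row for a 'numerator per denominator' term."""
    label = f"{numerator} per {denominator}"
    # Pseudo-syntax copied from step 2; it would need translating into
    # whatever EQ pattern is finally agreed on.
    equivalent = f"unit and 'has base unit' some ({numerator})/({denominator})"
    return {"id": term_id, "label": label, "equivalent": equivalent}


def to_template_tsv(rows):
    """Serialize rows as TSV: a header line, a template-string line, then data."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "label", "equivalent"])
    writer.writerow(["ID", "LABEL", "EC %"])
    for row in rows:
        writer.writerow([row["id"], row["label"], row["equivalent"]])
    return buffer.getvalue()


pairs = prefixed_units(NAMED_UNITS, PREFIXES)
# "UO:placeholder_1" is not a real UO identifier.
template = to_template_tsv([combo_row("UO:placeholder_1", "cubic centimeter", "mole")])
```

Keeping row construction separate from serialization means the text-mined QUDT/OM strings from step 4 could be fed through the same `combo_row` function without touching the template writer.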

dr-shorthair commented 3 years ago

Consider using the UCUM codes for correlation with QUDT - they are included in https://github.com/qudt/qudt-public-repo/blob/master/vocab/unit/VOCAB_QUDT-UNITS-ALL-v2.1.ttl for pretty much all of the QUDT individuals that could have UCUM codes.

I have also prepared a comparable set for OM - see https://github.com/HajoRijgersberg/OM/pull/44
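As a rough sketch of what correlating on UCUM codes could look like: the records below are illustrative placeholders rather than identifiers checked against either vocabulary, the function names are hypothetical, and it assumes a UCUM code is available on the UO side as well. Matching is on the exact string, since case-sensitive UCUM codes differ in meaning by case (`mm` is millimetre, `Mm` is megametre).

```python
def index_by_ucum(records):
    """Map each UCUM code to the list of identifiers that carry it."""
    index = {}
    for identifier, ucum_code in records:
        index.setdefault(ucum_code, []).append(identifier)
    return index


def correlate(uo_records, qudt_records):
    """Pair UO and QUDT identifiers that share the same UCUM code."""
    qudt_index = index_by_ucum(qudt_records)
    matches = []
    for uo_id, ucum_code in uo_records:
        # Exact, case-sensitive comparison; no normalization is applied.
        for qudt_id in qudt_index.get(ucum_code, []):
            matches.append((uo_id, ucum_code, qudt_id))
    return matches


# Placeholder (identifier, UCUM code) records for illustration only.
uo = [("UO:example_meter", "m"), ("UO:example_millimeter", "mm"), ("UO:example_liter", "L")]
qudt = [("unit:M", "m"), ("unit:MilliM", "mm"), ("unit:L", "L"), ("unit:A", "A")]
matches = correlate(uo, qudt)
```

Units with no UCUM code on one side simply drop out of the result, so they would still need label-based or manual mapping.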

dr-shorthair commented 3 years ago

Regarding image

The usual way to read this is 'every individual from the class micrometer is also an individual from the class meter'. That is a bit surprising, at first glance at least. Is there a way to say it in natural language that does not appear to be incorrect?

kaiiam commented 3 years ago

@dr-shorthair thanks for the quick feedback

> Consider using the UCUM codes for correlation with QUDT

Yes absolutely, and good to see such a UCUM PR for OM!

> The usual way to read this is 'every individual from the class micrometer is also an individual from the class meter'.

Interesting, maybe it depends on whether one represents units as individuals, like OM/QUDT, or as classes, like UO/OBOE.

UO defines millimeter as "A length unit which is equal to one thousandth of a meter or 10^[-3] m." and QUDT as "... a unit of length in the metric system, equal to one thousandth of a metre ...". So following a genus-differentia definition structure for a super/subclass relation, a millimeter would be a meter which is subdivided by 1000. So I guess in natural language couldn't we say "every individual millimeter is a meter which has been subdivided by 1000"? Doesn't seem too weird to me, but maybe I'm missing something...

kaiiam commented 3 years ago

cc @HajoRijgersberg and xref https://github.com/kaiiam/UO_revamp/issues/1

HajoRijgersberg commented 3 years ago

Thanks for this interesting thread! Hope to study it soon! :)

reality commented 1 year ago

The above suggestions for axioms, mappings, etc. sound interesting and beneficial. First, however, I would like to solve the issue of refactoring the current state to an ODK build. This will provide a good basis for development and mean we can solve the currently blocked issues. Please feel free to submit the above to /(a )?separate issue(s?)/.

As the above commits show, I have created an initial ODK instantiation in the odk branch. I have dropped the current .owl product file into the -edit file of a fresh ODK, so we maintain the dynamically added classes/axioms, etc. Since the ontology won't change much, I figure we can probably manage it manually using the edit file. If not, then we can hook something into the build script, or maybe decompose the OWL file into a template (although this also introduces pain because you can no longer use Protege).

@leechuck could you please confirm you're okay with me moving ODK to main when ready? Also, please note anything I should be aware of.

noting also related issues [#63] [#61] [#51] [#47] [#46] [#29] [#17]

kaiiam commented 1 year ago

Cool, glad you got around to doing this @reality.

On a separate note, while making the https://units-of-measurement.org project we made some UO to UCUM mappings. If you guys want to incorporate those into UO, happy to share them. They'll need to be updated a bit to reflect more recent UO terms.

reality commented 1 year ago

Yes, that would be great - the more mappings, the better!

kaiiam commented 1 year ago

They are currently hosted in https://github.com/units-of-measurement/units-of-measurement/blob/main/units_of_measurement/resources/mappings.csv, but we might convert it into separate SSSOM files for each project.
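For illustration, splitting one combined mapping table into per-project SSSOM-style files might look something like the sketch below. The input column names (`uo_id`, `mapped_id`), the sample rows, and the choice of `skos:exactMatch` and `semapv:ManualMappingCuration` are all assumptions made for the example; the actual layout of mappings.csv is not shown in this thread. The four output columns are the ones SSSOM requires.

```python
import csv
import io
from collections import defaultdict

SSSOM_COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]

# Hypothetical input layout and placeholder identifiers, for illustration only.
SAMPLE_CSV = (
    "uo_id,mapped_id\n"
    "UO:example_meter,UCUM:m\n"
    "UO:example_millimeter,UCUM:mm\n"
    "UO:example_meter,QUDT:M\n"
)


def split_by_project(csv_text):
    """Group mappings by the target's CURIE prefix and emit one TSV per group."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        project = row["mapped_id"].split(":", 1)[0]
        groups[project].append(row)

    outputs = {}
    for project, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        for row in rows:
            writer.writerow(
                {
                    "subject_id": row["uo_id"],
                    # Placeholder predicate and justification (assumption).
                    "predicate_id": "skos:exactMatch",
                    "object_id": row["mapped_id"],
                    "mapping_justification": "semapv:ManualMappingCuration",
                }
            )
        outputs[project] = buffer.getvalue()
    return outputs


files = split_by_project(SAMPLE_CSV)
```

A real SSSOM file would also carry a metadata header (for example the CURIE prefix map), which this sketch leaves out.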

reality commented 1 year ago

This has now been completed; the live artefact is based on an ODK build.

kaiiam commented 1 year ago

Cross reference https://github.com/OBOFoundry/purl.obolibrary.org/issues/919 in case this is still an issue.