kaiiam / UO_revamp


unit classes #9

Open kaiiam opened 3 years ago

kaiiam commented 3 years ago

As discussed in our meeting yesterday the 26th (see notes here), it would be good to sort out which classes will be included in the new ontology. If I understood @dr-shorthair's suggestion, we could go for a minimalist approach (or starting place) with just three classes: 1) base SI units, 2) derived units and 3) conventional (non-SI) units.

  1. As I believe @HajoRijgersberg suggested in the notes, 1) could be broken down into the true SI base units (e.g., Newton or metre) and the prefixed units (e.g., kilometre).

  2. Derived units could either refer to all unit combinations, whether SI-only, non-SI-only, or mixed, or could be split into SI-only combos (e.g., kg.m-3) and mixed combos (e.g., [in_US].s-1). Do we want both? Or would a single class for any derived combo suffice? @HajoRijgersberg has also suggested additional grouping categories: unit multiplication, unit division, and unit exponentiation.

  3. Conventional units: there are lots of other non-SI systems (Avoirdupois, Imperial, Apothecaries' units, etc.). I'm happy lumping them all into the "other" bucket that @dr-shorthair suggested we call conventional. Is that reasonable? Alternatively, we could have grouping classes for each. This could also be broken down into non-SI units that can and can't be combined with metric prefixes; see https://github.com/kaiiam/UO_revamp/issues/7.

Additionally, other systems, e.g., OM and UO, have top-level unit classes or quantity-specific unit classes, such as length unit, mass unit, time unit, etc. Reusing these and connecting back to OM and UO/PATO's classes was an earlier topic of discussion, and whether that will be a future direction remains unresolved. Xref https://github.com/OBOFoundry/COB/issues/35.

@jamesaoverton @dr-shorthair @HajoRijgersberg @ddooley @graybeal @cmungall @zhengj2007 thoughts?

HajoRijgersberg commented 3 years ago

Thanx Kai, for all your effort! :) I think the approach with three classes is not adequate. Base and derived units depend on the system of units, so the SI has other base units and derived units than a cgs system. So, it should be organized differently. E.g. class 2) derived units will otherwise contain units that are also in 1) base SI units, since derived units in the one system may be base units in the other.
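
To make the system dependence concrete, here is a minimal Python sketch. The `BASE_UNITS` table and both helper functions are hypothetical illustrations, not part of any existing parser or ontology; the table only lists the seven SI base units and the three mechanical cgs base units.

```python
from typing import List

# Hypothetical, abbreviated table: base units per system of units.
BASE_UNITS = {
    "SI": {"metre", "kilogram", "second", "ampere", "kelvin", "mole", "candela"},
    "CGS": {"centimetre", "gram", "second"},
}

def is_base_unit(unit: str, system: str) -> bool:
    """Return True if `unit` is a base unit of the given system."""
    return unit in BASE_UNITS[system]

def systems_where_base(unit: str) -> List[str]:
    """List (alphabetically) the systems in which `unit` is a base unit."""
    return sorted(s for s, units in BASE_UNITS.items() if unit in units)
```

With this table, `is_base_unit("gram", "CGS")` is true while `is_base_unit("gram", "SI")` is false, so "base unit" only makes sense as a statement about a unit and a system together.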

Ad 1: prefixed units are not opposed to SI base units. Kilogram is an SI base unit and it is a prefixed unit. Newton is not an SI base unit, it is an SI derived unit (with a special name).

Ad 2: Not sure if a compound unit that is a mix of SI units and non-SI units can be a derived unit in the SI or any other system of units. Compound units and derived units are easily confused. Compound units are compositions of units, such as kg/in3 (crazy example); derived units are units composed of base units within a system of units, e.g. kg/m3 in the SI. Unit multiplication, unit division, and unit exponentiation are for defining the operands of a compound unit, so that unit conversion can be automated. I think it is too early now to tackle that.
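
As a rough illustration of why the operands matter for automated conversion, the sketch below stores a compound unit as (symbol, exponent) pairs and multiplies out a conversion factor. The `TO_SI` table, the `CompoundUnit` class and `convert` are hypothetical names made up for this example, and the sketch assumes the two units are dimensionally compatible rather than checking it.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical factors from each atomic unit to its coherent SI unit.
TO_SI: Dict[str, float] = {
    "kg": 1.0,     # kilogram
    "m": 1.0,      # metre
    "s": 1.0,      # second
    "in": 0.0254,  # international inch, exactly 0.0254 m
}

@dataclass(frozen=True)
class CompoundUnit:
    # Each operand is (atomic unit symbol, integer exponent).
    # Division is expressed as a negative exponent.
    operands: Tuple[Tuple[str, int], ...]

    def factor_to_si(self) -> float:
        """Multiply the per-operand factors raised to their exponents."""
        factor = 1.0
        for symbol, exponent in self.operands:
            factor *= TO_SI[symbol] ** exponent
        return factor

def convert(value: float, source: CompoundUnit, target: CompoundUnit) -> float:
    """Convert `value` from `source` to `target` (no dimension check)."""
    return value * source.factor_to_si() / target.factor_to_si()

kg_per_in3 = CompoundUnit((("kg", 1), ("in", -3)))
kg_per_m3 = CompoundUnit((("kg", 1), ("m", -3)))
```

Here `convert(1.0, kg_per_in3, kg_per_m3)` comes out at roughly 61023.74, because one cubic metre holds about 61023.74 cubic inches.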

Ad 3: I think every system of units should be kind of independent, not put together in one bucket.

Quantity-specific classes (such as length unit) are for indicating which units can be converted to one another.

So, I think you should reconsider and reformulate your comment... Sorry! But hope to help! :)

graybeal commented 3 years ago

Now I'm not sure if any of my ideas are useful.

I was thinking of it compositionally: prefixes (but maybe you won't have those components in the list); base units; everything else under either the term compound or the term derived (having never come across that rather subtle definition of a derived unit).

And I hadn't thought of slipping the word SI in there anywhere (sorry @HajoRijgersberg). As I mentioned earlier, I never imagined this could be the authoritative SI unit vocabulary; inclusion is most important to me, to maximize adoption and usability.

But yes, the original composition mostly works. Could 'basic' units be in two subclasses, 'with prefix' and 'without prefix'? Could instances be in multiple classes, so kilogram is a basic with prefix unit and also a 'base SI' unit?

Having classes to show commonality/convertibility is also useful (length, distance/time, etc.) but perhaps this is an extension/knowledge not needed for the first release?

kaiiam commented 3 years ago

> base units; everything else under either the term compound or the term derived

@graybeal I like your suggestion at least for a first pass

> I never imagined this could be the authoritative SI unit vocabulary; inclusion is most important to me, to maximize adoption and usability.

💯 agreement. We're not making the SI's system; we're making pragmatic FAIR units on the web.

> Could 'basic' units be in two subclasses, 'with prefix' and 'without prefix'? Could instances be in multiple classes, so kilogram is a basic with prefix unit and also a 'base SI' unit?

Good points

> Having classes to show commonality/convertibility is also useful (length, distance/time, etc.) but perhaps this is an extension/knowledge not needed for the first release?

Yes cool future direction but I don't see it in scope for V1.0

Here is a first-pass idea which would be aligned with the Python parser used to ingest the units. Yes, it's not perfect, but I want it to be practical and aligned with the parser categories.

base unit
   metric base unit (Newton, metre, gram* )
   conventional base unit ([in_us])
prefixed unit
   prefixed metric unit (millimole)
   prefixed conventional unit  (nanocurie)
derived unit
   metric derived unit (g.L-1)
   conventional derived unit ([in_i].s-1)
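
A minimal sketch of how these six labels could be assigned from a UCUM-style code. This is not the actual parser: the lookup tables, `classify_atom` and `classify` are hypothetical, the tables hold only the examples from the outline, and it assumes that any conventional component makes a derived unit conventional (as `[in_i].s-1` suggests).

```python
import re
from typing import Tuple

# Hypothetical, tiny lookup tables; a real parser would load far more.
PREFIXES = {"k", "m", "n"}  # kilo, milli, nano
FAMILIES = (
    ("metric", {"N", "m", "g", "mol", "L", "s"}),
    ("conventional", {"[in_us]", "[in_i]", "Ci"}),
)

def classify_atom(symbol: str) -> Tuple[str, str]:
    """Return (kind, family) for one unit symbol without an exponent."""
    for family, atoms in FAMILIES:
        if symbol in atoms:
            return "base", family
    # Not a known atom: try to peel off a one-letter prefix.
    prefix, rest = symbol[:1], symbol[1:]
    if prefix in PREFIXES:
        for family, atoms in FAMILIES:
            if rest in atoms:
                return "prefixed", family
    raise ValueError(f"unknown unit symbol: {symbol!r}")

def classify(code: str) -> str:
    """Map a code such as 'g.L-1' to one of the six labels in the outline."""
    results, has_exponent = [], False
    for component in code.split("."):
        match = re.fullmatch(r"(.+?)(-?\d+)?", component)
        if match is None:
            raise ValueError(f"empty component in {code!r}")
        has_exponent = has_exponent or match.group(2) is not None
        results.append(classify_atom(match.group(1)))
    families = {family for _, family in results}
    # Assumption: one conventional component makes the whole unit conventional.
    family = "conventional" if "conventional" in families else "metric"
    if len(results) > 1 or has_exponent:
        return f"{family} derived unit"
    if results[0][0] == "base":
        return f"{family} base unit"
    return f"prefixed {family} unit"
```

For example, `classify("mmol")` gives "prefixed metric unit" and `classify("[in_i].s-1")` gives "conventional derived unit". Note that a bare `m` is looked up as an atom first, so it resolves to metre rather than to the milli prefix.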

We could also have only the following

base unit
   metric base unit 
   conventional base unit 
prefixed unit
derived unit

and assign two terms for the other categories, e.g., instead of prefixed metric unit, just assign both prefixed unit and metric base unit.
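
A small sketch of what the two-term assignment could look like in practice. The `UNIT_CLASSES` table and `units_with` helper are hypothetical; the point is only that the finer categories can be recovered as intersections of the coarser ones.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical assignments using only the reduced class list.
UNIT_CLASSES: Dict[str, FrozenSet[str]] = {
    "mole": frozenset({"metric base unit"}),
    "millimole": frozenset({"prefixed unit", "metric base unit"}),
    "nanocurie": frozenset({"prefixed unit", "conventional base unit"}),
    "[in_us]": frozenset({"conventional base unit"}),
}

def units_with(*classes: str) -> Set[str]:
    """Return the units that are assigned every one of the given classes."""
    wanted = set(classes)
    return {unit for unit, assigned in UNIT_CLASSES.items() if wanted <= assigned}
```

Asking for `units_with("prefixed unit", "metric base unit")` returns only millimole, which is what a dedicated prefixed metric unit class would have contained.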

Yes, I understand these don't strictly follow the SI and instead follow the pragmatic parser rules. I know that in the SI there would be the 7 SI base units, the 22 derived units with special names, and the units accepted for use with the SI. If we feel the need to implement that, we could have separate classes for those and assign them as well. But I see that, like the length, distance/time, etc. unit classes, as a future direction.

HajoRijgersberg commented 3 years ago

> I never imagined this could be the authoritative SI unit vocabulary

John, I could react to that too. Let me know if you want that.

kaiiam commented 3 years ago

> I never imagined this could be the authoritative SI unit vocabulary

We're simply not that. We can add additional classes, like special SI named unit etc., to try and do our best to represent the SI, but I don't see those types of classes as being core to the system. I see them as extras.