ucum-org / ucum

https://ucum.org

Equivalence of "l" and "L" representations? #143

Open timbrisc opened 7 years ago

timbrisc commented 7 years ago

Issue migrated from trac ticket # 195

component: help | priority: major

2017-03-13 13:37:44: garret@globalmentor.com created the issue


I'm confused about how the UCUM defines the symbol for "liter". Yes I'm aware that historically the symbol l has been used, and that more recently L has been added by standards bodies (see e.g. BIPM SI 8th Edition) as an alternative to l. But I thought the UCUM was supposed to give us a single, unambiguous set of symbols for interchange.

But reading UCUM more closely, I see that it provides both case-sensitive and case-insensitive versions of symbols. Moreover I see that, for case-sensitivity, "liter" is defined twice, once with a case-sensitive symbol of l and another with a case-sensitive symbol of L. So the way I interpret this is that, if you're in a case-sensitive environment, there are actually two liter symbols, l and L, and they both mean the same thing (effectively making the symbol case-insensitive---sheesh!).

So if I'm interpreting this correctly, it means that if a program supports UCUM, and even if it does so in a case-sensitive manner, a UCUM program must always interpret l and L as synonyms, including derived units such as ml/mL. Is that a correct interpretation? Is UCUM forcing us to do equivalency lookups for certain symbols?

But then I communicated with Werner Keil (see https://github.com/unitsofmeasurement/uom-systems/issues/4#issuecomment-284885795) and he tells me that the UCUM considers l and L to be "two different distinct units" in case-sensitive mode.

Interpreting the UCUM correctly has direct consequences in its implementations. In Java I'm using the JSR 363 RI to serialize units, which produces l for liters. I would prefer a UCUM implementation, but that's all that's available right now. If I use the JSR 363 RI implementation, it will produce loads of data using l instead of L. When my code is finally upgraded to a UCUM implementation, will it consider the currently serialized l data to be equivalent to L, or will it consider the data to be distinct from data using L units?

What does the UCUM specification say should happen? In case-sensitive mode, are l and L (and consequently ml and mL, and all the other derived unit pairs) synonyms for the same unit, requiring that the application keep equivalency lookup tables; or are they distinct units (which is worse, because now we have two units in the same system for the same amount, but we can't directly compare them)?

timbrisc commented 7 years ago

2017-04-19 20:54:14: gschadow@pragmaticdata.com commented


l and L are synonyms, yes. The Europeans tend to write l and the Americans tend to write L. The standard we were based on allowed both, so we kept with that.

But semantically there is no difference. So saying that the UCUM considers l and L to be "two different distinct units" in case-sensitive mode is really not quite true. The semantic equality is l = L, including in all derived units: mg/l = mg/L, l/min = L/min, etc.

At the semantic level, 1 L and 1 l are both 10^-3 m3.

The equivalence of units is determined by implementing the semantics of the factor and vector of exponents. Then you know they are the same. Because remember that 1 cm3 = 1 mL as well.
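A minimal Python sketch of the comparison described above, assuming a tiny hand-written table rather than the real UCUM tables. The names `ATOMS`, `PREFIXES`, `Canonical`, `atom`, and `divide` are hypothetical, and only two base dimensions (length and mass) are modelled. Each unit is reduced to a factor plus a vector of exponents, and two units are equivalent when both parts match.

```python
from fractions import Fraction
from typing import NamedTuple, Tuple

# Hypothetical mini-table (not the real UCUM data):
# symbol -> (factor relative to base units, exponents for (length in m, mass in g))
ATOMS = {
    "m": (Fraction(1), (1, 0)),
    "g": (Fraction(1), (0, 1)),
    "l": (Fraction(1, 1000), (3, 0)),  # litre = 10^-3 m3
    "L": (Fraction(1, 1000), (3, 0)),  # same definition, different symbol
}

PREFIXES = {
    "": Fraction(1),
    "c": Fraction(1, 100),
    "m": Fraction(1, 1000),
}


class Canonical(NamedTuple):
    factor: Fraction
    dims: Tuple[int, ...]


def atom(prefix: str, symbol: str, power: int = 1) -> Canonical:
    """Reduce one prefixed symbol, raised to a power, to factor + exponents."""
    base_factor, base_dims = ATOMS[symbol]
    # The prefix binds to the symbol before the exponent: cm3 means (cm)^3.
    factor = (PREFIXES[prefix] * base_factor) ** power
    return Canonical(factor, tuple(d * power for d in base_dims))


def divide(a: Canonical, b: Canonical) -> Canonical:
    """Divide two reduced units: factors divide, exponents subtract."""
    return Canonical(a.factor / b.factor,
                     tuple(x - y for x, y in zip(a.dims, b.dims)))


# l and L reduce to the same factor and exponent vector.
print(atom("", "l") == atom("", "L"))
# 1 cm3 = 1 mL, as mentioned above.
print(atom("c", "m", 3) == atom("m", "L"))
# Derived units: mg/l = mg/L.
print(divide(atom("m", "g"), atom("", "l")) == divide(atom("m", "g"), atom("", "L")))
```

With this kind of reduction no per-symbol synonym list is needed: any two expressions with the same factor and exponents compare equal, whatever symbols were used to write them.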

timbrisc commented 7 years ago

2017-04-19 21:04:48: garret@globalmentor.com commented


I'm not sure what you mean by "synonyms". I understand that the units are equivalent, that is, if you use either you get the same quantity of something. But are they really the same unit?

At first I thought that UCUM made L and l the same unit. But then I took a look at some of the other UCUM units. In UCUM a "Maxwell" (Mx) is defined as 1 Wb, and a "weber" (Wb) is defined as 1 V.s. But we don't think that Mx and Wb are the same units. They are equivalent in that if I convert 1 Mx to Wb then I get 1 Wb and not 23 Wb. But they are distinct units.

Analogously I'm forced to read the UCUM as saying that if I (in case-sensitive mode) convert 1 L to l then I will get 1 l, but that L and l are nonetheless distinct units.

So would you say that Mx and Wb are "synonyms"? What I'm trying to get at is, are they the same unit? If someone records in a file 23 Mx and I parse that in, can I turn around and save back to the file 23 Wb? Didn't I lose some semantics by switching units? Or are you saying that Mx and Wb are the same unit, and UCUM doesn't care if a processor mixes and matches them on a whim? Because it seems that whatever answer you give for Mx and Wb, you would have to use the same logic for L and l, respectively (and vice-versa).

timbrisc commented 7 years ago

2017-04-19 21:14:18: garret@globalmentor.com commented


Also analogously, think of "stere" (st). This also has an equivalence to liters. In fact, we ran into a "bug" related to this in uom-systems Issue 96 [https://github.com/unitsofmeasurement/uom-systems/issues/96]: the implementation read in uL, but then it printed out nst.

Nanostere is apparently a "synonym" of uL according to your definition. But is it really OK to arbitrarily switch units like that? Surely you would agree that nst and uL are "distinct units", even though they are "semantically equivalent", don't you? And according to the UCUM definitions, it seems to me that I am forced to conclude that l and L are "distinct units" as well, even though they are "semantically equivalent".

I would be happy if you could show me where my logic went awry, as I don't really like the conclusion. But logically I don't see any other way, unless you try to convince me that Mx and Wb are the same units, and that uL and nst are the same units. (We all agree that they indicate the same quantity for some measurement value; the issue is whether they are distinct units.)

timbrisc commented 7 years ago

2017-04-19 21:25:07: gschadow@pragmaticdata.com commented


Of course a "stere" is nothing other than a cubic meter, or a thousand liters. There is no difference. If you can write 1 st = 1000 L in a physical equation and it is correct, then it is correct for UCUM to say that they are equivalent units.

Any further discussion needs to be based on the precise definition of the ideas we are using.

In UCUM we say that 1 mst is equal to 1 l, just as 1 L is equal to 1 l.

The symbolic expressions of these units are different, but the meaning is exactly the same and no further difference should be made. If you like to use "mst" be my guest, while I use "L" and Françoise Bourdon over there may use "l". Through UCUM knowing about the physics of units, we all understand each other.
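For reference, the arithmetic behind these statements can be written out as follows. It uses only 1 st = 1 m3 = 1000 L from the comment above and the standard metric prefixes (m = 10^-3, u = 10^-6, n = 10^-9).

```latex
\begin{align*}
1\ \mathrm{st}  &= 1\ \mathrm{m^3} = 1000\ \mathrm{L} \\
1\ \mathrm{mst} &= 10^{-3}\ \mathrm{m^3} = 1\ \mathrm{L} = 1\ \mathrm{l} \\
1\ \mathrm{nst} &= 10^{-9}\ \mathrm{m^3} = 10^{-6}\ \mathrm{L} = 1\ \mathrm{uL}
\end{align*}
```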

timbrisc commented 7 years ago

2017-04-19 21:42:01: garret@globalmentor.com commented


First of all thank you for the discussion. And I can understand where you are coming from: you're saying that analogously it doesn't matter if a user records 1234.56 and the system later shows 1.23456e+3 --- they are both the same value, and no one should ever care.

Yet from a practical standpoint it could be less than satisfactory for some users if a doctor entered 3 uL for some medication and later the system displayed 3 nst. This representation could be confusing to humans who, within a particular domain, are used to working with liters but not with steres. It seems to me that distinguishing between distinct units still has some utility. (But perhaps you would say that "display to humans" is outside the scope of the UCUM.)

"The Europeans tend to write l and the Americans tend to write L. The standard we were based on allowed both, so we kept with that."

I think that the problem here stems from the decision you indicated above. I had hoped that the UCUM would be a standard that provides a single, unambiguous syntactical representation for each unit. In most cases this is true, but it seems that for "liter" UCUM made a special case and tried to make this symbol case-insensitive even in case-sensitive mode.

As such, if a processor wants to remember the units entered by a user, for example, it will need special processing logic to remember that l and L are really different ways to write the same unit (even in case-sensitive mode), while Mx and Wb are really different units and shouldn't be mixed. That is, in order to follow user intention, if we want to give the value back to the user in the same units as the user gave them, the processor has to know that l and L are the same (even in case-sensitive mode), but that Mx and Wb are not. This extra logic is a special case for liter and does not apply to any other unit.
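One way a processor could keep the user's notation while still comparing quantities semantically is sketched below. This is an illustration under stated assumptions, not something the UCUM specification prescribes: `Measurement`, `LITRES_PER_UNIT`, `in_litres`, and `same_quantity` are hypothetical names, and the table covers only a few volume units. The idea is to store the unit string exactly as entered and consult the semantic value only when comparing.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical lookup for a handful of volume units: litres per one unit.
LITRES_PER_UNIT = {
    "l": Fraction(1),
    "L": Fraction(1),
    "ml": Fraction(1, 1000),
    "mL": Fraction(1, 1000),
    "uL": Fraction(1, 10**6),
    "nst": Fraction(1, 10**6),  # 1 nst = 10^-9 m3 = 10^-6 L
}


@dataclass(frozen=True)
class Measurement:
    value: Fraction
    unit: str  # kept exactly as the user entered it

    def in_litres(self) -> Fraction:
        """Semantic value, used only for comparison and conversion."""
        return self.value * LITRES_PER_UNIT[self.unit]

    def same_quantity(self, other: "Measurement") -> bool:
        """True when both measurements denote the same physical amount."""
        return self.in_litres() == other.in_litres()

    def __str__(self) -> str:
        # Display always uses the stored symbol, never a substituted one.
        return f"{self.value} {self.unit}"


entered = Measurement(Fraction(3), "uL")
other = Measurement(Fraction(3), "nst")

print(entered.same_quantity(other))  # semantic comparison
print(entered == other)              # field-wise comparison, unit string included
print(entered)                       # shown as entered
```

In this sketch the dataclass `==` compares the stored symbol as well as the value, while `same_quantity` compares only the physical amount, so the two notions discussed in this thread stay separate.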

timbrisc commented 3 years ago

2021-01-15 19:53:56: mitchbre@regenstrief.org changed status from new to assigned

timbrisc commented 3 years ago

2021-01-15 19:53:56: mitchbre@regenstrief.org set owner to Simon Cox

timbrisc commented 3 years ago

2021-01-16 06:39:32: simon.cox@csiro.au commented


Yes, AFAIK Litre (Liter) is a special case in UCUM, in which two different case-sensitive codes denote units with identical definitions. This is a historical quirk. There are plenty of historical quirks in units, and on the whole UCUM has done a brilliant job cleaning up. In another life maybe a different decision could have been made, but this has been in UCUM for many years and I think we must live with it now. The only cost is that one of the symbols (either 'l' or 'L') is now sterilized from being used for something else.

Propose to close with status 'wontfix'.