labrad / pylabrad

python interface for labrad
52 stars 29 forks source link

Units with offsets are problematic #141

Open maffoo opened 9 years ago

maffoo commented 9 years ago

In thinking about #140, I was reminded of the horror show that is the implementation of units with offsets, e.g. degrees Fahrenheit and degrees Celsius. Supporting these leads to extra complexity in the library, and there are a lot of things that are just broken. For example, if I want to deal with absolute temperatures, then I need to take the offsets into account when converting between, say, K and degC; but if we're talking about changes in temperature, then I should use only the multiplicative factor to convert between the units, because the offset drops out when I take a difference between two celsius temperatures. I don't really see a good way to fix this, and my gut response would be to just drop degF and degC, but I'm curious whether other people see this as a problem and what we should do.

See also #74 which deals with logarithmic units like dBm. Right now the units library treats these units as just names essentially, and so will happily operate on them in ways that are unexpected.

ejeffrey commented 9 years ago

I am happy to get rid of C and F and provide conversion functions. We could make that a bit nicer by:

1) Eliminating C and F units 2) Having Unit('C') and Unit('F') create special objects that convert to kelvins, so 0*U.('C') constructs the object 273.16 K 3) having ['C'] and ['F'] continue to work on temperatures in Kelvin. 4) In line with (1), disallow inUnitsOf('C') and other operations that would try construct a Value with C or F units.

This may be too much magic, and we should just create explicit helper functions to convert between offset temperature scales and C.

On a related note, I have been thinking that we might get a performance boost by refactoring units into an SI-only system and a non-SI layer on top of that. This way, as long as you stick with SI, unit arithmetic only has to add and subtract the exponent of 10. All the extra conditional logic and weird factors would only show up when you use non-SI units.

On Tue, Jul 21, 2015 at 9:49 AM, Matthew Neeley notifications@github.com wrote:

In thinking about #140 https://github.com/labrad/pylabrad/issues/140, I was reminded of the horror show that is the implementation of units with offsets, e.g. degrees Fahrenheit and degrees Celsius. Supporting these leads to extra complexity in the library, and there are a lot of things that are just broken. For example, if I want to deal with absolute temperatures, then I need to take the offsets into account when converting between, say, K and degC; but if we're talking about changes in temperature, then I should use only the multiplicative factor to convert between the units, because the offset drops out when I take a difference between two celsius temperatures. I don't really see a good way to fix this, and my gut response would be to just drop degF and degC, but I'm curious whether other people see this as a problem and what we should do.

See also #74 https://github.com/labrad/pylabrad/issues/74 which deals with logarithmic units like dBm. Right now the units library treats these units as just names essentially, and so will happily operate on them in ways that are unexpected.

— Reply to this email directly or view it on GitHub https://github.com/labrad/pylabrad/issues/141.

patzinak commented 8 years ago

I wasn't sure whether this is worth of opening a separate issue but I have been several times confused by the principles behind the units compatibility, especially during a transition from an obsolete to the current module version.

Are there any fundamental reasons apart from the implementation problem that the following statements should not return True.

from labrad import units
(units.W).isCompatible(units.dBm)
(units.Unit('')).isCompatible(units.dB)
(units.Unit('')).isCompatible(units.rad)

The first case is the least questionable: this is the case when converting to SI units internally is a reasonable option, in a similar way as all temperature units could be converted to kelvins before doing any operations on them.

The second case is tricky because we tend to think about dB in additive way, i.e. 10 dB + 20 dB is 30 dB but there might be a need to evaluate statements like this 1 * units.W / 1 * units.mW > 10 * units.dB which is simply an invalid statement in the current implementation.

The third case is the most questionable, since someone always can argue that you should never use radians for anything which is not angle or phase.

ejeffrey commented 8 years ago

3 is pretty questionable. On the one hand, radians are technically dimensionless, on the other hand, if that is the case, why bother having the radian unit at all? I would want to see an evaluation of potential errors that keeping them separate can fix, as well as use cases that make sense that break when '' and 'rad' are not compatible.

1 & 2) I have been thinking about ways to do dBm and dB that would basically make them "display only" units. dBm would be stored internally as watts, and dB would be dimensionless. However, this gets pretty tricky to behave in a way that makes sense. In particular, it would probably require that the * and + operators work on the linear scale, not the log scale, so you would have to use * for "adding dB". This isn't a terribly big deal, but it is different from what people expect, and again: why bother using log units then? So for now, dB and dBm are basically stand-alone units: they are not convertible to or from any other unit.

One thing we might do for both dB/dBm and F/C is to just get rid of them as first class units and replace them with special purpose input/output converters that convert to the appropriate SI units. That would remove a lot of the ambiguity in the use of these units at the expense of requiring special casing to use them.

If you have any thoughts on the matter, please feel free to weigh in, one reason this hasn't gotten fixed is that we don't know what people would like.

Evan

On Mon, Jan 11, 2016 at 1:03 AM, patzinak notifications@github.com wrote:

I wasn't sure whether this is worth of opening a separate issue but I have been several times confused by the principles behind the units compatibility, especially during a transition from an obsolete to the current module version.

Are there any fundamental reasons apart from the implementation problem that the following statements should not return True.

from labrad import units (units.W).isCompatible(units.dBm) (units.Unit('').isCompatible(units.dB) (units.Unit('').isCompatible(units.rad)

The first case is the least questionable: this is the case when converting to SI units internally is a reasonable option, in a similar way as all temperature units could be converted to kelvins before doing any operations on them.

The second case is tricky because we tend to think about dB in additive way, i.e. 10 dB + 20 dB is 30 dB but there might be a need to evaluate statements like this `1 * units.W / 1 * units.mW > 10 * units.dB' which is simply an invalid statement in the current implementation.

The third case is the most questionable, since someone always can argue that you should never use radians for anything which is not angle or phase.

— Reply to this email directly or view it on GitHub https://github.com/labrad/pylabrad/issues/141#issuecomment-170474395.

patzinak commented 8 years ago

As for (3), I kind of prefer seeing units.rad being a separate unit from units.Unit(''). When properly used, this should certainly help to avoid lots of unintended situations but, of course, creates a problem when someone evaluates the sum of a phase and a product of time and frequency. There are two equally good or bad workarounds: either do not use radians or do explicit unit conversion, which both look really trivial. I am not aware of any other problematic situations.

As for (1) and (2), this all depends on how hard to refactor the current implementation. If this not is easy, I would be totally fine with separate methods/utilities to covert problematic units/values from/to SI units/values. This is not a pythonic approach but I would rather see the code throwing exceptions than allowing the things that do not make much sense. Specifically, for units.dB, we can have it as it is, and have a method or something that allows to convert it to a dimensionless quantity, i.e. something like (10 *units.dB).dimless(). units.Unit('') should also have the dimless method implemented in this case. Again, while in general, the clean and pure user interface is a must, given the available resources, I would sacrifice the interface and its flexibility in order to explicitly prohibit making any hard-to-trace but conceptually stupid mistakes. In my experience, we can quickly implement a workaround to a problem in a module, once we know exactly what is the problem but it takes tons of time to trace down the issue if an exception is not thrown or seen. [Another totally unrelated example here is the gpib.py module and the GPIB server. It took us way more time to trace down the issues there then to implement fixes and workarounds, and so far the things work very well for us.]

ejeffrey commented 8 years ago

The real problem with dB/dBm and degC/degF is not so much refactoring to support "correct" behavior, but defining correct behavior. It is really hard to come up with a consistent set of rules for how those units interact with other units. The problem with dB[m] is that you probably want to it act logarithmic sometimes. Basically it comes down to this: are you willing to accept "10 dB + 10 dB == 13 dB; 10 dB * 10 dB == 20 dB", or do you want "10 dB + 10 dB == 20 dB"? I don't think we can make the latter work if you also want "20 dBm/10 Hz == 10 dBm/Hz".

The problem with temperature scales is that sometimes 'degC' means an actual temperature, and sometimes it means a temperature difference. So if I have a specific heat of 4.18 Joule/g*K, and I multiply by "10 degC", normally what I mean is that the temperature difference is 10 C/ 10 K, and the answer is 41.8 Joule/g. But for something like black body radiation, you would actually need to use the absolute temperature. Now you can force temperature differences to always be in kelvin, and when you do that, you end up making any non-trival arithmetic convert to kelvin anyway, so 'degC' and 'degF' are really just for IO.

On Tue, Jan 12, 2016 at 7:06 PM, patzinak notifications@github.com wrote:

As for (3), I kind of prefer seeing units.rad being a separate unit from units.Unit(''). When properly used, this should certainly help to avoid lots of unintended situations but, of course, creates a problem when someone evaluates the sum of a phase and a product of time and frequency. There are two equally good or bad workarounds: either do not use radians or do explicit unit conversion, which both look really trivial. I am not aware of any other problematic situations.

As for (1) and (2), this all depends on how hard to refactor the current implementation. If this not is easy, I would be totally fine with separate methods/utilities to covert problematic units/values from/to SI units/values. This is not a pythonic approach but I would rather see the code throwing exceptions than allowing the things that do not make much sense. Specifically, for units.dB, we can have it as it is, and have a method or something that allows to convert it to a dimensionless quantity, i.e. something like (10 *units.dB).dimless(). units.Unit('') should also have the dimless method implemented in this case. Again, while in general, the clean and pure user interface is a must, given the available resources, I would sacrifice the interface and its flexibility in order to explicitly prohibit making any hard-to-trace but conceptually stupid mistakes. In my experience, we can quickly implement a workaround to a problem in a module, once we know exactly what is the problem but it takes tons of time to trace down the issue if an exception is not thrown or seen. [Another totally unrelated example here is the gpib.py module and the GPIB server. It took us way more time to trace down the issues there then to implement fixes and workarounds, and so far the things work very well for us.]

— Reply to this email directly or view it on GitHub https://github.com/labrad/pylabrad/issues/141#issuecomment-171146504.

patzinak commented 8 years ago

This actually answers my original question about the principles. I understand now that the answer is that we don't have/haven't yet come up with a good set of rules not only for units compatibilities but for many other things related to them such as operations on the units and unit conversions.

Not pretending to give a fully consistent picture, let me formulate my thoughts more clearly if this could be anyhow helpful. My impression that the discussion is biased towards physical scales and operations on units but not physical quantities and meaning.

  1. In any questionable situation, it would be great to have an Exception thrown. As time goes on, some exceptions may be removed when a good idea is implemented but preferably not the other way around. Again, it is easy to implement a workaround in a particular applications because particular applications usually have a more limited and a better defined scope.
  2. Units are used to describe some physically meaningful quantities. The choice of specific units is only a matter of preferences. Thus, it is a totally valid option to have 1 * dBm to return Value(1, mW) or Value(0.001, W). 20 dBm / 10 Hz is equal to Value(100, mW) / Value(10, Hz) and will return Value(0.01, W/Hz). In other words, as this is done with most non-SI units, convert everything to SI units first. This should be a pretty trivial case because dBm is a simple, physically meaningful unit, unlike dB. While 'degC/degF' is basically one issue, 'dB/dBm' are two different issues.
  3. For the units with offsets, it is hard to agree whether 10 * degC == 10 * K. This used to be evaluated to True, now this is evaluated to False. This depends whether you imply the actual temperature or, let's say, just an increase in temperature. There is no way to tell what you mean by just looking at this expression without any context. To make the matter more complicated: why not to throw an error? Because of all potential confusions, I would support the idea of removing these units when simple conversions functions/methods are implemented, so that the user applications do not need to redefine conversion constants each time. The user, however, is explicitly forced to do a meaningful unit conversion to a meaningful physical quantity and will pass only SI values all over. If someone for any reasons wants to print specific heat in units of J/g*degF this could be achieved by very trivial string and/or value manipulations, and if really necessary a function/method that does this can be implemented. Basically, why not to adhere to "Explicit is better than implicit".
  4. dB is the most tricky case and in my opinion this case is actually somewhat similar to 'rad'... Decibels are used to compare physical quantities, they are not physical quantities per se. The same way, one can define unit times, i.e. evaluating 10 m / 10 cm to 10 times instead of just 10. I actually like the current, naive implementation of the decibels and I don't think we should look for any extra generalizations of the related operations. There is no real need to bother generalizing expression such as 10 dB + 10 dBm (note, dB and dBm are mixed on purpose here). The reason is that if 10 dBm / 1 dBm returns 10 * dB, then 1 V^2 / .1 V^2 should also return 10 dB and not just 10. What would be great is to have an option to convert dB to dimensionless units. Here is the suggestion: keep the compatibilities defined the way they are defined now, and do not generalize method inUnitsOf. However, add a method isGenerallyCompatible and forceToUnits (of course, some better names should be chosen if possible). These methods will treat '', rad and dB as '', i.e. 9 rad dB^2 / V will be converted to (3 * dB)^2/V or roughly Value(4, 'V^-1'). These methods will work as isCompatible and inUnitsOf in all other trivial cases.
patzinak commented 8 years ago

Another option: introduce two more units like DdegC and DdegF (D stands for Delta or difference) and forceful, slightly untrivial conversion methods. This way if someone writing a bake-out controller, this person can write something if temperature < 200 * degC: increase_temperature(1 * DdegC); wait(1 * min). 10 * degC == 10 * K is False but 10 * DdegC == 10 K is True. This is a mess but this could be a better option than the current implementation. Removing these units. however, sounds like a way much better idea.