Open clbarnes opened 2 years ago
I think I am not following. In Pint, two quantities are equal if they have the same magnitude and unit (after converting one to the other's units). So it is not possible to have Quantity(1, "micron") == Quantity(1, "micrometer") unless Unit("micron") == Unit("micrometer"). And in fact this is what happens.
>>> import pint
>>> ureg = pint.UnitRegistry()
>>> ureg.Unit("micrometer") == ureg.Unit("micrometer")
True
>>> ureg.Unit("micrometer") != ureg.Unit("micrometer")
False
Is there a typo in your snippet? For my point, at least, I think the second comparison should be between micron and micrometer.
In a clean virtual environment with a brand-new pip install pint, I get
In [1]: import pint
In [2]: ureg = pint.UnitRegistry()
In [3]: ureg.Unit("micron") == ureg.Unit("micrometer")
Out[3]: False
In [4]: ureg.Quantity("1 micrometer") == ureg.Quantity("1 micron")
Out[4]: True
I think this is because, as you say, when quantities are compared, one is converted to the other before checking and then the magnitudes are compared. The definitions file tells you how to convert one unit into another. However, when there's no magnitude, no conversion takes place, and so the two units are simply compared (and the names are different, so it's false). If they're defined as an alias, they are equal, so my proposal is that trivial conversions like micron/micrometer automatically define aliases as well.
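The asymmetry described above can be reproduced with a toy model. This is only a sketch and not pint's implementation: FACTORS, ToyUnit, and ToyQuantity are hypothetical names, and the only assumption is that units compare by name while quantities compare after conversion.

```python
# Hypothetical conversion table: unit name -> factor to meters.
FACTORS = {"meter": 1.0, "micrometer": 1e-6, "micron": 1e-6}


class ToyUnit:
    """A unit that is compared by name only; no conversion takes place."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, ToyUnit):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)


class ToyQuantity:
    """A quantity that converts the other operand before comparing."""

    def __init__(self, magnitude, unit):
        self.magnitude = magnitude
        self.unit = unit

    def __eq__(self, other):
        if not isinstance(other, ToyQuantity):
            return NotImplemented
        # Convert the other magnitude into this quantity's unit first.
        factor = FACTORS[other.unit.name] / FACTORS[self.unit.name]
        return self.magnitude == other.magnitude * factor


print(ToyUnit("micron") == ToyUnit("micrometer"))  # False
print(ToyQuantity(1, ToyUnit("micron")) == ToyQuantity(1, ToyUnit("micrometer")))  # True
```

The names differ, so the unit comparison fails, while the quantity comparison succeeds because the conversion factor between the two units is exactly 1.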
You are right, sorry about that.
But I still do not follow. What do you mean by defining them as aliases? They are aliases. The definition file defines units, not quantities.
For the record, in the past Units were NOT exposed to the user and were just dicts. When the Unit class was introduced, the previous behavior was kept. I wonder if there is any penalty or problem in changing to canonical names upon comparison (or on definition, while keeping what the user has chosen as well).
As I understand it, the definition file, in this context, has two modes of operation: conversions ("how to turn a quantity with one unit into a quantity with another unit") and aliases ("which sets of units mean the same thing"). Aliases imply a conversion (I think), but trivial conversions do not, at present, imply an alias.
Again, as I understand it:
# tell pint how to convert microns into half-microns
half_micron = 0.5 * micron
# tell pint how to convert microns into micrometers
micron = micrometer
# which is implicitly the same as
micron = 1 micrometer
# tell pint that microns and micrometers are the same thing
@alias micron = micrometer
Saying that a micron is equal to 1 micrometer is not the same thing, in pint's eyes, as saying that micron is an alias for micrometer. My suggestion is that micron = 1 micrometer is parsed and implicitly creates the alias as well.
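The suggestion can be sketched with a toy parser. This is not pint's definition parser: parse_definition, canonical, conversions, and aliases are hypothetical names, and the sketch only understands lines of the form "name = reference" or "name = factor * reference".

```python
def parse_definition(line, conversions, aliases):
    """Record `name = [factor *] reference`; add an alias when the factor is 1."""
    name, rhs = (part.strip() for part in line.split("=", 1))
    if "*" in rhs:
        factor_text, reference = (part.strip() for part in rhs.split("*", 1))
        factor = float(factor_text)
    else:
        factor, reference = 1.0, rhs
    conversions[name] = (factor, reference)
    if factor == 1.0:
        # Trivial conversion: treat both names as the same unit.
        aliases[name] = reference


def canonical(name, aliases):
    """Follow alias links until reaching a name that has no alias."""
    while name in aliases:
        name = aliases[name]
    return name


conversions, aliases = {}, {}
parse_definition("micron = micrometer", conversions, aliases)
parse_definition("half_micron = 0.5 * micron", conversions, aliases)

print(canonical("micron", aliases) == canonical("micrometer", aliases))   # True
print(canonical("half_micron", aliases) == canonical("micron", aliases))  # False
```

Only the trivial definition produces an alias; the non-trivial one still records a conversion but keeps its own name.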
You can check this:
>>> import pint
>>> ureg1 = pint.UnitRegistry()
>>> ureg1.Unit("micron") == ureg1.Unit("micrometer")
False
>>> ureg2 = pint.UnitRegistry()
>>> ureg2.define("@alias micron = micrometer")
>>> ureg2.Unit("micron") == ureg2.Unit("micrometer")
True
You need to define 2 uregs for this to work, I suspect because the new definition doesn't remove the cached comparison (maybe it should).
The definition file only contains one kind of thing: unit definitions. For each unit you can define the name, the conversion to other units, the symbol, and aliases. That is all.
For example
>>> ureg1.Unit("meter") == ureg1.Unit("metre")
True
>>> ureg1.Unit("meter") == ureg1.Unit("m")
True
because the definition file states:
meter = [length] = m = metre
So if you change the definition file to
micron = 1e-6 * meter = µ = micrometer
you should get
>>> ureg.Unit("micron") == ureg.Unit("micrometer")
True
(I am guessing here, haven't tested)
What you are seeing is a bug. I think it is due to the fact that we do not put the conversion in a canonical form.
We should fix it, thanks for spotting this.
This is already in the default definition file: https://github.com/hgrecco/pint/blob/master/pint/default_en.txt#L147 and unfortunately, that last test does not pass unless an explicit @alias is given (documented here: https://pint.readthedocs.io/en/stable/defining.html). If it's just a bug, then great! No design decisions to be made :)
No. The current definition that you are linking states:
micron = micrometer = µ
Have you tried replacing it with my suggestion?
Careful, micrometers are 1e-6 meters :)
Yes! An ugly error on my side. But I still wonder if this approach fixes the bug.
(I edited the comment to avoid having something THAT wrong around :-D)
Indeed, changing the definition to micron = 1e-6 * meter = µ = micrometer fixes the problem. But I think it would be better to always convert units in the converters to base units to avoid these problems.
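One way to picture "always convert to base units" is a resolver that follows each definition down to a base unit and compares the results. This is a sketch under stated assumptions, not pint's converter code: DEFINITIONS and to_base are hypothetical, and only simple multiplicative definitions are modelled.

```python
# Hypothetical definitions: name -> (factor, reference); base units map to None.
DEFINITIONS = {
    "meter": None,
    "micrometer": (1e-6, "meter"),
    "micron": (1.0, "micrometer"),
}


def to_base(name):
    """Return (factor, base_unit_name) by following references to a base unit."""
    factor = 1.0
    while DEFINITIONS[name] is not None:
        step, name = DEFINITIONS[name]
        factor *= step
    return factor, name


# Both names resolve to the same factor and base unit, so they compare equal.
print(to_base("micron") == to_base("micrometer"))  # True
print(to_base("micron") == to_base("meter"))       # False
```

With this form, it no longer matters whether micron was written in terms of micrometer or directly in terms of meter. A real implementation would have to decide how to handle floating-point rounding when factors are multiplied along longer chains.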
PR raised!
I think it would be better to always convert units in the converters to base units to avoid these problems.
Would canonicalising units in __hash__ also be on the table?
Yes, but in another PR. We need to see what its impact is. What do you think, @jules-ch?
Example: microns and micrometers are the same thing, i.e. Quantity(1, "micron") == Quantity(1, "micrometer"), because in the default registry configuration there's the line micron = micrometer. However, Unit("micron") != Unit("micrometer"), i.e. they are not aliases, unless you set it in your local registry.
Would it be possible to have unit definitions for X and Y where N X = N Y automatically be added as an alias as well?
Additionally, while aliases affect __eq__, they don't affect __hash__, which means they are treated differently in sets and dicts. Would it be too breaking a change to implement this, possibly by determining a normative name for all unit aliases (in the same way that abbreviations map to the full unit name) to hash against?
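The hashing idea can be sketched as follows. This is an illustration only, not pint's Unit class: ALIASES and CanonicalUnit are hypothetical, and the alias table is assumed to map every alternative name to one normative name.

```python
# Hypothetical alias table: alternative name -> normative name.
ALIASES = {"micron": "micrometer", "metre": "meter"}


class CanonicalUnit:
    """A unit whose equality and hash both use the normative name."""

    def __init__(self, name):
        self.name = name  # keep what the user wrote, e.g. for display
        self._canonical = ALIASES.get(name, name)

    def __eq__(self, other):
        if not isinstance(other, CanonicalUnit):
            return NotImplemented
        return self._canonical == other._canonical

    def __hash__(self):
        # Hash the same value that __eq__ compares, so equal units collide.
        return hash(self._canonical)

    def __repr__(self):
        return f"CanonicalUnit({self.name!r})"


units = {CanonicalUnit("micron"), CanonicalUnit("micrometer")}
print(len(units))  # 1

lookup = {CanonicalUnit("micrometer"): "length"}
print(lookup[CanonicalUnit("micron")])  # length
```

Because __eq__ and __hash__ agree on the normative name, aliased units collapse to one entry in a set and can be used interchangeably as dict keys, while the user's original spelling is still available on the instance.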