Open tadeu opened 7 years ago
In a different thread I suggested labelling units not only by their dimensions but also by their role (I do not have a better name). Its purpose is to provide a way to distinguish between quantities with the same units but different "extended dimensionality" (see #505).
I think something like this could help also in your case if we allow roles to be provided on the fly
>>> q1 = ureg.Quantity(1.0, 'm:salt^3/m:water^3')
>>> q1
<Quantity(1.0, 'm^3/m^3')>
>>> q2 = ureg.Quantity(2.0, 'm:salt^3/m:water^3')
>>> q2
<Quantity(2.0, 'm^3/m^3')>
>>> q1 / q2
<Quantity(0.5, dimensionless)>
Nobody has yet opened a discussion about the API and how it should be implemented, but I think it is something worth pursuing.
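To make the on-the-fly syntax above concrete, here is a minimal sketch of how a string such as 'm:salt^3/m:water^3' might be split into role-tagged terms. It does not use pint at all: the function name parse_roles, the token grammar, and the {(unit, role): exponent} mapping are assumptions made for illustration, and parentheses are not handled.

```python
import re
from collections import Counter

# Hypothetical token grammar: unit[:role][^power], terms joined by '*' or '/'.
TOKEN = re.compile(
    r"(?P<unit>[A-Za-z_]+)(?::(?P<role>[A-Za-z_]+))?(?:\^(?P<power>-?\d+))?"
)


def parse_roles(expression):
    """Return a {(unit, role): exponent} mapping for a flat unit expression."""
    result = Counter()
    sign = 1
    # Keep the operators when splitting so we know when to negate the exponent.
    for part in re.split(r"([*/])", expression.replace(" ", "")):
        if part == "*":
            sign = 1
        elif part == "/":
            sign = -1
        elif part:
            match = TOKEN.fullmatch(part)
            if match is None:
                raise ValueError(f"cannot parse {part!r}")
            power = int(match.group("power") or 1)
            result[(match.group("unit"), match.group("role"))] += sign * power
    # Terms that cancel completely are dropped, as in an ordinary reduction.
    return {key: exp for key, exp in result.items() if exp != 0}


print(parse_roles("m:salt^3/m:water^3"))
print(parse_roles("m^3/m^3"))
```

Because the role is part of the key, the salt and water volumes stay separate, while the untagged m^3/m^3 still cancels to an empty (dimensionless) mapping.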
I am also interested in this. I like the pint project a lot and it helps me quite a bit in my job (civil engineer). This feature would be a great addition.
If I can find the time to learn the pint code base well enough, would taking a shot at implementing this be welcomed? Or is a discussion better first? Fair warning: I'm not a professional dev and have never tackled something like this; I could easily be out of my depth, but a guy has to start somewhere.
As for the name, "role" seems OK. Some other options might be: guise, mien, or mode. I like mode, but for abbreviated attribute access the letter "m" conflicts with the m in "magnitude". I do think I like role better than guise.
A commit for this would be most welcome. I can guide you through it.
Is the early API above, e.g. unit:[role]{**power}, still being considered, or is the API still up for discussion?
My thoughts
# building up to the ratio of g/m^3 K+ to g/m^3 Na+, e.g. relative concentrations in seawater
>>> q1 = ureg.Quantity(1.0, "role.K : g / role.water : m^3") # in-str annotation to be escaped
>>> q2 = ureg.Quantity(10.0, "g{} / m^3{}").role("Na", "water") # method, curly brace escape
>>> q_return = q1 / q2
# parse role directly into unit string for __repr__
<Quantity(0.1, "g K / m^3 water / g Na / m^3 water")>
>>> q3 = ureg.Quantity(1.0, "role:K: g / role:water: m^3") # different escape
>>> q4 = ureg.Quantity(10.0, "g / m^3").role("Na", "water") # method, inferred escape/insertion
>>> q_return = q3 / q4
# separate role into an extra return string in Quantity
<Quantity(0.1, "g/m^3 / g/m^3", "K/water, Na/water")>
I am agnostic on the return types shown - whatever makes more sense with pint's internals
q2 I think is most intuitive - in essence, all we are doing is annotating our units
1) It parallels Python's string formatting. Anyone familiar with Python should immediately map the role arguments onto the curly-brace positions on reading. I admit, though, I have no idea if an escape is required beforehand, which would defeat the intuitiveness. Can Python's str.format() be overridden?
2) .role() would escape the braces before normal unit parsing while saving the information for later __repr__, with the curly braces themselves serving as escape characters (possible??) and signalling to the eye that these units have a role annotation.
3) Escaping the curly braces for the role information allows each unit to be reduced as normal if .to_reduced_units() is chained.
4) Adding .role() as a method of Quantity makes introspection easier. It also makes line breaks for deeply nested or long unit strings clearer.
5) .role() could be added later, e.g. after some calculations or other input, and infer curly brace position via unit dimensionality and position in *args (see q4 and the example below).
6) Finally, .role() without arguments would simply retain units, e.g. slope = inch/inch in @Ricyteach's original SO question (reproduced in #389); this would then override the "dimensionless" unit.
>>> q5 = ureg.Quantity(1.0, "inch")
>>> q6 = ureg.Quantity(1.0, "inch")
>>> q7 = q5.role() / q6.role()
<Quantity(1.0, "inch / inch")>
There is an important point to consider, however: a lot of people will only want to give a role to one of their units, e.g. (again, @Ricyteach's SO question)
>>> q8 = ureg.Quantity(1.0, "kip * ft / ft{}").role("member length")
<Quantity(1.0, "kip * ft / ft member length")>
Should .role() enforce position here, i.e. "kip{} * ft{} / ft{}".role(_, _, "member length"), raising a ValueError otherwise? This ties into 5) above: should curly braces be inferred on each unit, and explicitly skipped over in .role() via underscores to indicate position?
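As a rough illustration of the strict positional variant, the sketch below fills each {} marker with the role at the same position and raises a ValueError when the counts disagree. It works on plain strings only; apply_roles is a hypothetical helper, not part of pint, and None stands in for the underscore placeholder.

```python
def apply_roles(unit_string, *roles):
    """Fill each '{}' marker in unit_string with the role at the same position.

    A role of None leaves that unit unannotated (the '_' placeholder above).
    """
    slots = unit_string.count("{}")
    if slots != len(roles):
        raise ValueError(f"expected {slots} roles, got {len(roles)}")
    pieces = unit_string.split("{}")
    out = [pieces[0]]
    for role, tail in zip(roles, pieces[1:]):
        if role is not None:
            out.append(f" {role}")
        out.append(tail)
    return "".join(out)


print(apply_roles("kip{} * ft{} / ft{}", None, None, "member length"))
print(apply_roles("g{} / m^3{}", "Na", "water"))
```

Enforcing the count keeps the mapping between markers and roles unambiguous, at the cost of more typing when only one unit needs a role.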
This has a lot of usefulness, especially for plotting, reporting, and once the pandas integration is stable. I would be willing to contribute as well. I have only spent a little time looking at the codebase, but I believe a .role() method could avoid having to make changes to the registry classes.
Thanks for the insight and the great ideas. The API is totally open for discussion, and we do not have a PR yet. The aspect of your proposal that worries me is its reliance on ordering. When Pint parses and operates on units, they are reordered, so g{} * m{} could become m{} * g{}. We need a solution for this, but the rest looks good.
OH is that so? Is there a defined pattern to it, eg. precedence of certain dimensions over others? Where should I look for this reordering - registry.py? util.py?
Products of units are stored in a dict, and therefore the order is not guaranteed in all supported Python versions.
FWIW: I have made no progress on this idea, and would be very happy to see someone else take a crack at it. I'd be willing to try it out when finished though.
Ok, but where exactly is this done? One place, many places? It seems that most of the parsing occurs in util.py, but I am unsure.
collections.OrderedDict was added in 2.7 and handles LIFO or FIFO return via a popitem() method:
https://docs.python.org/2/library/collections.html#collections.OrderedDict
As long as its use is limited to ordinary dict usage plus popitem(), it should remain consistent between 2.7 and 3.X.
OK, as an initial point, replacing dict with collections.OrderedDict as the base class of util.udict passes all existing tests without further modifications. Extending this to the dict() instances created by registry methods (i.e. replacing them with udict()) also passes. So order can be retained, at least I think.
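For readers who have not looked at util.py, the idea can be sketched in isolation. The class below is an illustrative stand-in, not pint's actual code: it assumes a udict-like mapping whose missing keys read as exponent 0, and simply swaps the base class to OrderedDict.

```python
from collections import OrderedDict


class OrderedUdict(OrderedDict):
    """Ordered mapping that treats a missing unit as having exponent 0."""

    def __missing__(self, key):
        # Returning without storing keeps lookups from adding spurious keys.
        return 0


units = OrderedUdict()
units["gram"] += 1   # reads 0 via __missing__, then stores 1
units["meter"] -= 3  # reads 0 via __missing__, then stores -3

print(list(units.items()))
print(units.popitem(last=False))  # FIFO pop returns the first inserted item
```

Insertion order is what a positional .role() would rely on, so any operation that rebuilds the mapping from an unordered source would still need checking.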
At this point, my crude understanding of pint's internals is that the markers for a .role() method should be stripped out in util.string_preprocessor while generating an ordered mapping of their position. Because the UnitsContainer mapping now retains order (I think), these orders should match up with an ordered mapping of the annotations in .role(). I don't really understand what is going on in build_eval_tree, but it doesn't appear to destroy existing order? Or am I totally mistaken and that method is handling unit-aware math, i.e. it shouldn't impact the proposed .role() as that info is stripped out before any math takes place?
I'm in the process of writing a test for this at the string_preprocessor and Quantity level and hope to have something (working or not) by next week. For now I will only be working on cases where the units wouldn't reduce anyway; I'm not sure yet where/how to flag for no-reduction when roles are attached.
The topic of corporate sustainability is heating up (due to Global Climate Change). Many companies report production intensity in terms of tonnes of CO2 emitted per unit of production. When the unit of production is tonnes of Steel, they expect an intensity metric of t CO2 / t Steel, which Pint reduces to CO2 / Steel. It would be great to be able to preserve intensity as the former.
What's the status of this?
I have another great example, also from the civil engineering world: the air infiltration parameter for windows, measured in m ** 3 / (m ** 2 * hr), which makes sense: how much air gets through over time for a particular window (glass). However, when simplified this becomes m / s, which is not so relevant.
I am new to pint (and it looks amazing!) so I don't know if I could contribute to this topic yet, but would appreciate the option for that for sure!
I'm interested in this feature as well. It's useful in chemical engineering, where mass and molar yields are used but are not equal (mole of product / mole of input != mass of product / mass of input), so preserving the starting units is important.
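A quick numeric illustration of why the two yields differ, using made-up numbers (the molar masses and amounts below are hypothetical, not taken from any real process):

```python
# Hypothetical reaction: 1 mol of input A gives 0.8 mol of product B.
molar_mass_input = 100.0    # g/mol, made-up value
molar_mass_product = 50.0   # g/mol, made-up value
mol_input = 1.0
mol_product = 0.8

# Both ratios are "dimensionless", yet they measure different things.
molar_yield = mol_product / mol_input
mass_yield = (mol_product * molar_mass_product) / (mol_input * molar_mass_input)

print(f"molar yield: {molar_yield:.2f} mol B / mol A")
print(f"mass yield:  {mass_yield:.2f} g B / g A")
```

Once both values are reduced to plain dimensionless numbers, nothing in the quantity itself says which of the two it is.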
While there's been some discussion around the api of assigning and displaying "roles", I'm curious about some of the expected behavior when there are roles within units (I'm going to stick with the name "roles" since I don't have a better one).
Q: are units that have a role "isolated" from all other units? i.e. they can only be simplified with units of the same role? And only added with quantities that match all units and roles?
Basic usage, where same units with different roles do not simplify:
>>> u1 = ureg.Unit("g Na")
<Unit("gram Na")>
>>> u2 = ureg.Unit("g water")
<Unit("gram water")>
>>> u1 / u2
<Unit('gram Na / gram water')>
With that framework, it seems like units without roles should get treated as if they have their own role (a None role, let's say) and can only be simplified with other units of None role?
>>> u1 = ureg.Unit("g Na")
<Unit("gram Na")>
>>> u2 = ureg.Unit("g")
<Unit('gram')>
>>> u3 = u1 / u2
<Unit('gram Na / gram')>
>>> u3 / ureg.Unit("g")
<Unit('gram Na / gram ** 2')>
If I add a quantity with a role to one without a role, should it take the role of the first quantity or throw an error?
>>> q1 = ureg.Quantity("3.0 gram Na")
3.0 <Unit("gram Na")>
>>> q2 = ureg.Quantity("5.0 gram")
5.0 <Unit('gram')>
>>> q1 + q2
DimensionalityError: Cannot convert role "None" (gram) to role "Na" (gram)
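The "isolated roles" behaviour sketched in these examples can be modelled with a plain mapping keyed by (unit, role), where None is the role of an unannotated unit. This is only a toy model to pin down the semantics under discussion; divide, check_addable and RoleMismatchError are hypothetical names and nothing here reflects pint's internals.

```python
class RoleMismatchError(TypeError):
    """Raised when two operands differ in units or roles (hypothetical)."""


def divide(numerator, denominator):
    """Divide two {(unit, role): exponent} mappings.

    Only entries with the same unit AND the same role cancel.
    """
    result = dict(numerator)
    for key, exp in denominator.items():
        result[key] = result.get(key, 0) - exp
        if result[key] == 0:
            del result[key]
    return result


def check_addable(left, right):
    """Addition requires every unit and every role to match."""
    if left != right:
        raise RoleMismatchError(f"cannot add {left} and {right}")


gram_na = {("gram", "Na"): 1}
gram = {("gram", None): 1}

u3 = divide(gram_na, gram)    # 'gram Na / gram': nothing cancels
print(divide(u3, gram))       # 'gram Na / gram ** 2'
print(divide(gram_na, gram_na))  # same unit and role: cancels to {}
```

Treating None as just another role keeps the rule uniform: a plain gram never cancels against a gram of Na, and adding the two raises.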
Q: How are conversions handled when roles are present?
Should it be required to specify the role for any conversion, so it will only look at that subset of the units?
>>> q1 = ureg.Quantity("1.0 g Na / g water")
1.0 <Unit("gram Na / gram water")>
# gram Na -> ounce Na, water units ignored because it's a different role
>>> q1.to("ounce", role="Na")
0.035274 <Unit("ounce Na / gram water")>
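Here is a small sketch of the role-targeted conversion shown above, again on a toy {(unit, role): exponent} mapping rather than on pint objects. The function convert_role and the two-entry mass table are assumptions for illustration; only the gram/ounce factor (1 avoirdupois ounce = 28.349523125 g) is a real value.

```python
# Grams per unit; just enough of a table to illustrate the idea.
GRAMS_PER = {"gram": 1.0, "ounce": 28.349523125}


def convert_role(magnitude, units, target, role):
    """Convert mass units carrying `role` to `target`; other roles are untouched."""
    new_units = {}
    converted = False
    for (unit, unit_role), exp in units.items():
        if unit_role == role and unit in GRAMS_PER:
            # Scale the magnitude by the conversion factor raised to the exponent.
            magnitude *= (GRAMS_PER[unit] / GRAMS_PER[target]) ** exp
            key = (target, unit_role)
            converted = True
        else:
            key = (unit, unit_role)
        new_units[key] = new_units.get(key, 0) + exp
    if not converted:
        raise ValueError(f"no convertible unit with role {role!r}")
    return magnitude, new_units


q1_units = {("gram", "Na"): 1, ("gram", "water"): -1}
magnitude, units = convert_role(1.0, q1_units, "ounce", role="Na")
print(round(magnitude, 6), units)
```

The water term is left alone even though it is also a mass, because the lookup is restricted to the requested role.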
Could it be possible to not specify a role? In that case, does it try to convert any roles it can? (couldn't think of a realistic situation, so using contrived units)
>>> q1 = ureg.Quantity("1.0 g Na * g water")
1.0 <Unit("gram Na * gram water")>
# Both Na and water roles have a dimensionality match to "ounce", so both are converted
>>> q1.to("ounce")
0.0012 <Unit("ounce Na * ounce water")>
>>> q2 = ureg.Quantity("1.0 g Na * m**3 water")
1.0 <Unit("gram Na * meter ** 3 water")>
# Na role has dimensionality match so converted - water role doesn't, so ignored
>>> q2.to("ounce")
0.035274 <Unit("ounce Na * meter ** 3 water")>
Or does not passing a role mean it defaults to the None role only (equivalent to passing role=None), so that a target role is required?
>>> q1 = ureg.Quantity("1.0 g Na * g water")
1.0 <Unit("gram Na * gram water")>
>>> q1.to("ounce")
DimensionalityError: No units with role of "None" (ounce) found
>>> q2 = ureg.Quantity("1.0 g Na * g")
1.0 <Unit("gram Na * gram")>
# Only "gram" has role of None
>>> q2.to("ounce")
0.035274 <Unit("gram Na * ounce")>
This will result in behavior that some might find unexpected (again, there's probably a more realistic example...)
>>> car_weight = ureg.Quantity("3000 lb car")
3000 <Unit("pound car")>
>>> acceleration = ureg.Quantity("15.0 miles per hour per sec")
15.0 <Unit('mile / hour / second')>
>>> force = car_weight * acceleration
45000.0 <Unit('pound car * mile / hour / second')>
# role of None has units of mile / hour/ second
>>> force.to("newton")
DimensionalityError: Cannot convert from 'mile / hour / second' ([length] / [time] ** 2) to 'newton' ([length] * [mass] / [time] ** 2)
Q: Can unit definitions have roles? How would that work with conversions?
The conversion examples above take units within a role, convert them, and give the resulting units the same role. The conversion itself does not consider roles, since it needs to account for any incoming role (e.g. water, salt, etc.) and none of the definitions currently have roles.
Issue https://github.com/hgrecco/pint/issues/505 mentions adding roles to definitions themselves, to distinguish dimensionless quantities that should be treated differently.
I think this could work if roles in definitions were only allowed on base units because, from what I can tell, you can't create a ureg.Unit with base units.
Thus, during a conversion the "outer" roles are removed, conversion happens that can take into account the base-unit roles, and the resulting units (without a role) get the "outer" role re-applied.
With @hgrecco's example:
radian = [:angle] = rad
bit = [:information]
count = [:]
>>> q1 = ureg.Quantity("1.0 bit / second")
1.0 <Unit('bit / second')>
>>> q1.to('count / second')
DimensionalityError: Cannot convert role "information" (bit) to role None (count)
Hopefully my explanations are reasonably clear. Looking forward to some feedback on these ideas. Thanks!
One issue I see here is
>>> q = ureg.Quantity(10.0, 'm^3/m^3')
>>> q
will reduce to
<Quantity(10.0, 'dimensionless')>
so q.role("circumference", "radius") does not work.
For the same reason this example won't work
"kip{} * ft{} / ft{}".role(_, _, "member length")
We'd need a non-reducing UnitsContainer for this to work, i.e. the role discussion in this issue is independent of the original issue. I think a NonReducingUnitsContainer could work, storing units and exponents in tuples and converting to dict when needed.
>>> q = ureg.Quantity(10.0, 'm^3/m^3')
>>> q.units
<NonReducingUnitsContainer [{'meter': 3}, {'meter': -3}]>
I think the NonReducingUnitsContainer could be used only in Quantity.to and Quantity.__init__ at first.
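A minimal sketch of what such a container might look like, assuming it keeps each (unit, exponent) term as a separate tuple and only collapses them on request. The class and method names are hypothetical and it is not wired into pint in any way.

```python
class NonReducingUnits:
    """Hypothetical container that keeps each (unit, exponent) term separate."""

    def __init__(self, *terms):
        self.terms = tuple(terms)

    def __truediv__(self, other):
        # Append the inverted terms instead of merging exponents.
        inverted = tuple((unit, -exp) for unit, exp in other.terms)
        return NonReducingUnits(*self.terms, *inverted)

    def reduced(self):
        """Collapse to an ordinary {unit: exponent} dict, cancelling matches."""
        out = {}
        for unit, exp in self.terms:
            out[unit] = out.get(unit, 0) + exp
        return {unit: exp for unit, exp in out.items() if exp != 0}

    def __repr__(self):
        return f"<NonReducingUnits {list(self.terms)}>"


volume = NonReducingUnits(("meter", 3))
ratio = volume / volume
print(ratio)            # both meter terms are kept
print(ratio.reduced())  # {} once reduction is explicitly requested
```

Keeping the terms in a tuple also preserves their order, which is what a positional role assignment would need.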
This use case is an example for why "quantity kinds" are considered in unit-of-measurement models. Work has been started in #1967 to add quantity kinds to pint. It seems that the concept of "quantity kind" is called "role" here?
Yes, this issue is several years old. Newer issues have used kind as the term instead.
First of all thanks for the excellent library!
In engineering it is common to have measurement units like m^3/m^3, which represents a volume fraction or volume ratio, such as "water volume/total volume". Although it is the same as dimensionless, it is important to keep this representation so as to distinguish it from other dimensionless ratios such as kg/kg or (mg/l)/(mg/l).
What happens today is:
What is the desired behaviour:
In other words, we don't want to "reduce" the unit automatically.
Note that this already works for units such as cm^3/m^3. In this case, the unit is preserved even though it could be reduced to only a 1e-6 factor: