IAMconsortium / units

Common unit definitions for integrated assessment research
GNU General Public License v3.0

pint can't handle 'million', 'billion', ... #14

Closed danielhuppmann closed 4 years ago

danielhuppmann commented 4 years ago

In integrated-assessment models, we have variables like 'Population' which are given in million or billion (people being implicitly understood).

It would be practical to be able to do the following:

pint.UnitRegistry()('1000 million').to('billion')

Expected output: 1 billion

danielhuppmann commented 4 years ago

a related issue: for currency, one commonly talks about million USD_2010 rather than MUSD_2010... and mUSD_2010 (which is more intuitive for an economist than MUSD) would be interpreted by pint as milli-USD_2010...

khaeru commented 4 years ago

Pint seems to have a preprocessors argument when creating a UnitRegistry (reference); that could be a way to handle things like this. I've never used it, so someone would need to figure out how it works.
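As a rough illustration of what such a preprocessor might look like: pint applies each callable in `preprocessors` to the input string before parsing, so a plain string-to-string function is enough. The sketch below is an assumption, not something taken from this repo; the name `expand_scale_words` and the `SCALE_WORDS` table are hypothetical, and the function is only exercised on strings here, not against a real registry.

```python
import re

# Hypothetical mapping of scale words to numeric literals (illustrative only).
SCALE_WORDS = {"thousand": "1e3", "million": "1e6", "billion": "1e9"}

# Match the scale words only as whole words, ignoring case.
_SCALE_RE = re.compile(r"\b(" + "|".join(SCALE_WORDS) + r")\b", re.IGNORECASE)


def expand_scale_words(expr: str) -> str:
    """Replace whole-word scale names in *expr* with numeric literals."""
    return _SCALE_RE.sub(lambda m: SCALE_WORDS[m.group(1).lower()], expr)


print(expand_scale_words("1000 million"))      # 1000 1e6
print(expand_scale_words("million USD_2010"))  # 1e6 USD_2010

# Assumed usage, untested here:
#   ureg = pint.UnitRegistry(preprocessors=[expand_scale_words])
```

Note that a rewrite like this only turns the word into a number at input time. It does not by itself address expressing a result in 'billion'.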


General thoughts: in a sense, this is an instance of a broader class of problem.

Another instance: some transport code I've seen recently uses litres / 100 km as a 'unit', and e.g. 23 as a magnitude. In this case, the m is sort of "double-prefixed": first k meaning 1000, and then the literal 100. So the 'units' are litres / 10⁵ m; but there's no way to express this with only SI prefixes.

Pint distinguishes between Quantities that (per SI) have a magnitude and units; and Units, which cannot have a 'magnitude' or other scaling factor:

>>> import pint
>>> pint.Unit('litres / (100 km)')
[…]
ValueError: Unit expression cannot have a scaling factor.

>>> pint.Quantity('litres / (100 km)')
0.01 <Unit('liter / kilometer')>

>>> q = pint.Quantity('23 litres / (100 km)')
>>> q
0.23 <Unit('liter / kilometer')>

>>> pint.Quantity(23, 'litres / (100 km)')  # 2-arg form; 2nd is parsed as unit
[…]
ValueError: Unit expression cannot have a scaling factor.

In general, to avoid over-complicating things, code in this repo and client packages should:

  1. At the time of input, handle particular discipline-specific expressions in order to produce ordinary pint objects.
  2. For calculations, let pint's robust internal logic take care of unit conversions. This might produce outputs in a variety of units (e.g. averaging quantities in EJ, toe, and kWa) but will be correct in both magnitude and dimensionality.
  3. At the time of output, coerce output into the desired form.

As an example of (3), if I wanted to output q in the 'units' litres / 100 km:

>>> output_units = pint.Quantity('litres / (100 km)')  # use Quantity instead of Unit
>>> magnitude = q / output_units
>>> magnitude
23.0 <Unit('dimensionless')>

The magnitude and some string form of the output_units can be used to generate the preferred output. This is the pattern used in, e.g. the MESSAGE reporting code.
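To make steps (1) and (3) concrete without depending on pint, here is a minimal pure-Python sketch of the same round trip for the litres / 100 km case. Everything in it is hypothetical: `DisplayUnit`, `split_display_unit`, `to_canonical` and `to_display` are illustrative names, and the regular expression only covers the simple "numerator / factor denominator" shape. It is not how this repo or the MESSAGE reporting code does it.

```python
import re
from typing import NamedTuple


class DisplayUnit(NamedTuple):
    factor: float  # literal scaling factor found in the denominator
    units: str     # plain unit expression with the factor removed


# Matches e.g. "litres / 100 km" or "litres / (100 km)".
_PATTERN = re.compile(
    r"^\s*(?P<num>[^/]+?)\s*/\s*\(?\s*"
    r"(?P<factor>\d+(?:\.\d+)?)\s+(?P<den>[^)]+?)\s*\)?\s*$"
)


def split_display_unit(expr: str) -> DisplayUnit:
    """Step 1: separate a literal factor from the unit expression."""
    m = _PATTERN.match(expr)
    if m is None:
        # No literal factor found; treat the expression as ordinary units.
        return DisplayUnit(1.0, expr.strip())
    return DisplayUnit(float(m["factor"]), f"{m['num']} / {m['den']}")


def to_canonical(value: float, display: DisplayUnit) -> float:
    """Convert a magnitude given per (factor * denominator) to per denominator."""
    return value / display.factor


def to_display(value: float, display: DisplayUnit) -> float:
    """Step 3: convert a canonical magnitude back to the display form."""
    return value * display.factor


display = split_display_unit("litres / 100 km")
print(display.units)              # litres / km
print(to_canonical(23, display))  # 0.23
```

The string in `display.units` is what would be handed to a units library for step (2); the factor is kept aside and only reapplied when writing output.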

danielhuppmann commented 4 years ago

quick thought - adding a line to definitions.txt

100_km = 100 * km

would solve that problem, right?

khaeru commented 4 years ago

It would, yes, but I'm choosing to look at that transport case as input that needs to be sanitized, rather than another new unit to define.

danielhuppmann commented 4 years ago

#18 closes the original issue I encountered - @khaeru, I suggest that you start another issue for the litres / 100 km question...