Question: base units with different dimensions

cdeimert commented 2 years ago

Hi, I'm wondering if it's possible to define base units with different dimensions than the standard SI ones? Maybe I've misunderstood, but it seems like the @system definition approach complains if you try to define something like "eV" as a base unit (I got the error 'The new base must be a root dimension if not discarded unit is specified.')

In some contexts of my work, it would be more convenient to use hbar, eV, and nm as base units in place of m, kg, s. You can always map any combination of (m, kg, s) into a combination of (hbar, eV, nm), so it would be nice if there was some simple function to do that.

To give a more concrete example, if you use Planck units as currently defined in the default unit definitions, then Q_(4e8, 'm/s').to_base_units() gets displayed as 1.334... planck_length/planck_time. Personally, I think it could be much nicer if the base units in the Planck unit system were actually 'c, gravitational_constant, hbar, kC, k', so that `Q(4e8, 'm/s').to_baseunits()would display as1.334... c, andQ(1, 'planck_length').to_base_units()would display as1 hbar * 0.5 G 0.5 / c 1.5`.

I'm sure I can write some conversion function to do this manually, but I wonder if there's already a convenient way to do it from within pint? If not using base units, even something like my_quantity.to({'hbar', 'eV', 'nm'}) would be useful, where the argument is a set rather than a dictionary, and the to() function determines the appropriate exponents (if there's a valid conversion). Maybe that exists and I missed it?

Regardless, thanks for all the work you've put into the package!

jules-ch commented 2 years ago

Systems are there to map units to base dimensions.

in SI:

length -> meter time -> seconds ...

What you are displaying wannot be achieved in practice since you can have combination of tour constants that can be reduced to the same results. That's why we are using base units. c, hbar, G are constants that can not be mapped to base dimensions.

cdeimert commented 2 years ago

As long as the constants/units are appropriately chosen, I think there should be a unique mapping to express any quantity in terms of those constants/units. It won't work for any arbitrary choice, of course -- they need to be complete (same number as the number of dimensions you're replacing) and independent.

For example, say I want to replace the SI base units (meter, second) with (meter, c) instead -- c being the speed of light. If I want to express a quantity with dimensions [length]ⁿ[time]^p in terms of (meter, c) then the only way to do it is as a multiple of m^n+p c^-p. So there's a unique mapping. You could just as well use [speed] as a base dimension instead of [time], and you could use "c" as your unit for speed.

I understand if it's not worth doing in the implementation, but I do think it should in principle be possible. I'm pretty sure you could cast it as a matrix problem, so that the invertibility of the matrix determines whether you've chosen a "good" set of constants/units or not.

cdeimert commented 2 years ago

In case this is useful to anyone else, I wrote a function for this. I don't know that it's super efficient or handles all the edge cases appropriately, but it seems to work well enough for my purposes. It sets up a matrix equation to figure out the appropriate units to convert to, and then it uses numpy's pseudo-inverse function pinv to solve it.

The tricky part is dealing with cases where the matrix ends up being non-square but there's nevertheless a unique solution (E.g., converting 1 m to {ft, hr}). Essentially, in these cases if you do the row reduction by hand, you'll end up with rows/columns of zeros that can be eliminated to turn it into a square matrix. Numerically that's a bit of a pain, so pseudo-inverse has the advantage of just always giving you a solution.

If there's a unique solution, it should give the correct one
If there are many solutions, it will give an arbitrary one (e.g., if you ask to convert 1 m² to {m, ft}, it returns in m*ft)
If there are no solutions, pinv gives an incorrect answer, but the Quantity.to() call at the end will throw a DimensionalityError

So as long as you've chosen your set of units well, it should work fine.

import numpy as np

from pint import Quantity, UnitRegistry
from pint._typing import UnitLike
from pint.util import to_units_container, UnitsContainer

ureg = UnitRegistry()
Q_ = ureg.Quantity

def to_arbitrary_base(
    q: Quantity, new_base: set[UnitLike], decimal_cutoff=10
) -> Quantity:
    '''Converts a Quantity, q, into a new set of base units, 'base_units'.

    Uses numpy's pseudo-inverse matrix function pinv. To avoid floating point
    errors, unit powers are rounded to [decimal_cutoff] decimal points.

    If there is no solution, will throw DimensionalityError.

    If the problem is overspecified (too many base units given), the solution
    is not unique. In this case, will return *a* correct conversion, as chosen
    by the matrix pseudo-inverse method.
    '''

    ### Set up matrix equation
    # Convert to root units and get UnitContainers for the input quantity
    # and for each of the supplied new_base units
    q_uc = to_units_container(q.to_root_units())
    S_uc = {us: to_units_container(Q_(us).to_root_units()) for us in new_base}

    # Find all of the root units that appear in either of the above
    root_unit_set = set(q_uc.keys())
    for col in S_uc.values():
        root_unit_set = root_unit_set | set(col.keys())

    # Convert to lists to make sure the order is preserved
    root_units_list = list(root_unit_set)
    new_base_list = list(new_base)

    # Express q's root unit powers as a vector
    qvec = np.array([[q_uc[u]] for u in root_units_list])

    # Express root unit powers of new_base as a matrix
    Smat = np.array(
        [[S_uc[uo][ub] for ub in root_units_list] for uo in new_base_list]
    ).transpose()

    ### Solve matrix equation using pseudo-inverse
    Smat_pinv = np.linalg.pinv(Smat)
    Qvec = np.matmul(Smat_pinv, qvec)

    # Round based on specified decimal_cutoff
    Qvec = np.squeeze(np.round(Qvec, decimals=decimal_cutoff))

    ### Convert result to units string
    Q_units = '*'.join(
        f"{u} ** {Qv}" for u, Qv in zip(new_base_list, Qvec) if Qv != 0
    )

    ### Finally do the conversion (throws DimensionalityError if impossible)
    return q.to(Q_units)

if __name__ == "__main__":
    print(to_arbitrary_base(Q_(3e8, 'm/s'), {'c', 'hbar', 'eV'}))

Some example conversions (I defined G = gravitational_constant for convenience): ---- Convert to Planck units (c, G, hbar, k_C, k) ---- 1 m --> 6.1871×10³⁴ G^0.5 ħ^0.5/c^1.5 1 kg --> 4.5947×10⁷ c^0.5 ħ^0.5/G^0.5 1 s --> 1.8549×10⁴³ G^0.5 ħ^0.5/c^2.5 1 K --> 7.0582×10^-33 c^2.5 ħ^0.5/(G^0.5 k) 3×10⁸ m/s --> 1.0007 c 1×10^-34 kg m²/s --> 0.94825 ħ 1 planck_length --> 1 G^0.5 ħ^0.5/c^1.5 ---- Convert to: (nanometer, eV, hbar, K, e) ---- 1 m --> 1×10⁹ nm 1 J --> 6.2415×10¹⁸ eV 1 J s --> 9.4825×10³³ ħ 1 Ω --> 0.00024341 ħ/e² ---- Convert to: (eV, hbar) ---- 1 m --> raised DimensionalityError 1 J --> 6.2415×10¹⁸ eV ---- Convert to: m, ft ---- 1 m² --> 3.2808 ft m

juhuebner commented 2 years ago

Very nice implementation. 👍 I changed it a little such that

it can handle underspecified sets like converting boltzmann_constant with {"eV",}
it uses the internal pint routines for linear algebra in order to maintain arbitrary precision (using Fraction)

There might still be some redudancy in the sets, though. @cdeimert What about submitting a PR with your routine included in the Quantity class? This would ease the exchange of code as well.

from pint import Quantity, UnitRegistry
from pint._typing import UnitLike
from pint.util import to_units_container, UnitsContainer, column_echelon_form, transpose

ureg = UnitRegistry()
Q_ = ureg.Quantity

def to_arbitrary_base(
    q: Quantity, new_base: set[UnitLike]) -> Quantity:
    '''Converts a Quantity, q, into a new set of base units, 'base_units'.

    Uses numpy's pseudo-inverse matrix function pinv. To avoid floating point
    errors, unit powers are rounded to [decimal_cutoff] decimal points.

    If there is no solution, will throw DimensionalityError.

    If the problem is overspecified (too many base units given), the solution
    is not unique. In this case, will return *a* correct conversion, as chosen
    by the matrix pseudo-inverse method.
    '''
    # Make a 'copy'
    new_base_set = set(new_base)

    ### Set up matrix equation
    # Convert to root units and get UnitContainers for the input quantity
    # and for each of the supplied new_base units
    q_uc = to_units_container(q.to_root_units())
    S_uc = {us: to_units_container(Q_(us).to_root_units()) for us in new_base_set}

    # Remove 'needless' units from new_base
    for key, val in S_uc.copy().items():
        if not all([unit in q_uc for unit in val]):
            S_uc.pop(key)
            new_base_set.remove(key)

    # Find out 'missing' units in the new_base units.
    missing_units_in_new_base = set()
    for val in S_uc.values():
        missing_units_in_new_base.update(val)
    missing_units_in_new_base.symmetric_difference_update(q_uc.keys())

    # Update new base with missing unit.
    for key in missing_units_in_new_base:
        S_uc[key] = UnitsContainer(**{key : 1})

    # Get the root units set
    root_unit_set = set(q_uc.keys())

    # Convert to lists to make sure the order is preserved
    root_units_list = list(root_unit_set)
    new_base_list = list(new_base_set | missing_units_in_new_base)

    # Express q's root unit powers as a vector
    qvec = [q_uc[u] for u in root_units_list]

    # Express root unit powers of new_base as a matrix
    Smat = [[S_uc[uo][ub] for ub in root_units_list] for uo in new_base_list]

    ### Linear algebra section for solving matrix equation
    Smat.append(qvec)

    cef_Smat, _, _ = column_echelon_form(Smat) 

    Qvec = transpose(cef_Smat)[-1]

    ### Convert result to units string
    Q_units = '*'.join(
        f"{u} ** {Qv}" for u, Qv in zip(new_base_list, Qvec) if Qv != 0
    )

    ### Finally do the conversion (throws DimensionalityError if impossible)
    return q.to(Q_units)

# to_arbitrary_base(Q_(1, ureg.boltzmann_constant), {"eV", "nm",})
# ---- Convert to: (nanometer, eV, hbar, K, e) ----
system_set = {"nm", "eV", "hbar", "K", "e"}
test_quantities = (1 * ureg.m, 1 * ureg.J, 1 * ureg.J * ureg.s, 1 * ureg.ohm,  1 * ureg.boltzmann_constant, 1 * ureg.electron_mass)

for quan in test_quantities:
    print(f"{quan:~} -> {to_arbitrary_base(quan, system_set):.3e~}")

juhuebner commented 2 years ago

Ah. It currently works in SI units only ..., which does not meet aim of this issue ☹️

hgrecco / pint

Question: base units with different dimensions #1556