hgrecco / pint-pandas

Pandas support for pint

Pint-Pandas support for uncertain Quantities #139

Open MichaelTiemannOSC opened 2 years ago

MichaelTiemannOSC commented 2 years ago

Pint-Pandas implements PintType as an ExtensionDtype and PintArray as an ExtensionArray, bringing Pint's Kung-Fu to Pandas!

This issue is about extending PintArrays to handle Quantities whose magnitudes carry uncertainties. Pint already supports uncertainties through Measurement, which is derived from Quantity. But as far as I can tell, all of the existing Extension magic is built around magnitude, and none of it around a Measurement's value and error. So Measurements work in the Pint world, but not in the Pint-Pandas world.

Looking at the problem from two perspectives (adapting Pint-Pandas to work with Measurements, or enhancing Pint-Pandas to deal with more general Quantity types), I chose the second: extending the range of allowable data types for the magnitude of a Quantity.

I have written a test case, run the pre-commit scripts, and invite your commentary. I know I need to write at least one more test case (which deals with Pint talking to itself in the print->read->eval process). But that test case might more properly belong in Pint. We'll see.

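import numpy as np
from pint_pandas import PintArray, PintType

ureg = PintType.ureg


def test_issue_139():
    from pint.compat import HAS_UNCERTAINTIES

    assert HAS_UNCERTAINTIES
    from uncertainties import ufloat
    from uncertainties import unumpy as unp

    # Plain float magnitudes, including NaN
    q1 = 1.234
    q2 = 5.678
    q_nan = np.nan

    # Uncertain magnitudes, with NaN as nominal value, as error, or as both
    u1 = ufloat(1, 0.2)
    u2 = ufloat(3, 0.4)
    u_nan = ufloat(np.nan, 0.0)
    u_plus_or_minus_nan = ufloat(0.0, np.nan)
    u_nan_plus_or_minus_nan = ufloat(np.nan, np.nan)

    # Mix floats and ufloats in a single PintArray, then convert m -> cm
    a_m = PintArray(
        [q1, u1, q2, u2, q_nan, u_nan, u_plus_or_minus_nan, u_nan_plus_or_minus_nan],
        ureg.m,
    )
    a_cm = a_m.astype("pint[cm]")
    # The first four entries are NaN-free and must compare equal across units
    assert np.all(a_m[0:4] == a_cm[0:4])
    # NaN never equals itself, so the rest are checked for matching NaN-ness
    for x, y in zip(a_m[4:], a_cm[4:]):
        assert unp.isnan(x) == unp.isnan(y)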
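
The test compares the finite entries with `==` but falls back to `isnan` for the rest. A minimal, standard-library-only sketch of why: NaN never compares equal to itself, while a linear unit conversion leaves NaN-ness unchanged. The helpers `to_cm` and `same_nan_pattern` are hypothetical illustrations, not part of Pint, Pint-Pandas, or uncertainties, and a plain `(nominal, std_dev)` tuple stands in for a `ufloat`.

```python
import math

M_TO_CM = 100.0  # linear conversion factor from metres to centimetres


def to_cm(nominal_m, std_dev_m):
    """Scale a (nominal, std_dev) pair from metres to centimetres.

    Hypothetical helper: a linear conversion scales the nominal value and
    the error by the same factor.
    """
    return nominal_m * M_TO_CM, std_dev_m * M_TO_CM


def same_nan_pattern(a, b):
    """Return True when NaN occupies the same positions in both pairs."""
    return all(math.isnan(x) == math.isnan(y) for x, y in zip(a, b))


finite = (1.0, 0.2)          # stands in for ufloat(1, 0.2)
with_nan = (math.nan, 0.0)   # stands in for ufloat(np.nan, 0.0)

# NaN is not equal to anything, including itself, so `==` cannot be used
# to check that a NaN entry survived the conversion.
nan_equals_itself = with_nan[0] == with_nan[0]

finite_cm = to_cm(*finite)
nan_cm = to_cm(*with_nan)

# Finite entries can be compared numerically after conversion ...
finite_ok = math.isclose(finite_cm[0], 100.0) and math.isclose(finite_cm[1], 20.0)
# ... while NaN entries are compared by where the NaNs sit.
nan_ok = same_nan_pattern(with_nan, nan_cm)
```

This sketch only mirrors the structure of the assertions in the test. How `unp.isnan` treats a NaN error as opposed to a NaN nominal value is up to the uncertainties package and is not modelled here.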