hgrecco / pint-pandas

Pandas support for pint

pd.Series.round does not work when pd.Series contains PintArray #144

Open MichaelTiemannOSC opened 1 year ago

MichaelTiemannOSC commented 1 year ago

It is convenient to use pd.Series.round, but it fails when the Series is backed by a PintArray. Here is a test case:

import pandas as pd
import pint
import pint_pandas

aa = pd.Series([1.2345678, 2.3456789])
print(aa)
print(aa.round(2))

bb = pd.Series([1.2345678, 2.3456789], dtype='pint[m]')
print(bb)
print(bb.round(2))

Here is the program output (with a UnitStripped warning message removed):

0    1.234568
1    2.345679
dtype: float64
0    1.23
1    2.35
dtype: float64

0    1.2345678
1    2.3456789
dtype: pint[meter]
Traceback (most recent call last):
  File "/Users/michael/Documents/GitHub/ITR-MichaelTiemannOSC/examples/pint-round.py", line 11, in <module>
    print(bb.round(2))
  File "/Users/michael/Documents/GitHub/pandas/pandas/core/series.py", line 2602, in round
    result = self._values.round(decimals)
AttributeError: 'PintArray' object has no attribute 'round'

Is there a reasonable way to define which methods are intended to work on dimensionless units and to delegate them to the array of magnitudes? PintArray already delegates a number of reduce functions. Perhaps it could delegate operations like rounding as well.
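To make the delegation idea concrete, here is a minimal sketch in plain Python. ToyUnitArray is a hypothetical stand-in invented for illustration; it is not PintArray and does not use the pint-pandas API. It only shows the shape of the pattern: apply the operation to the bare magnitudes, then reattach the original units.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyUnitArray:
    """Hypothetical stand-in for a unit-aware array (not pint-pandas code)."""

    magnitudes: List[float]
    units: str

    def round(self, decimals: int = 0) -> "ToyUnitArray":
        # Delegate the numeric work to the plain magnitudes...
        rounded = [round(m, decimals) for m in self.magnitudes]
        # ...then rewrap the result with the units left unchanged.
        return ToyUnitArray(rounded, self.units)


bb = ToyUnitArray([1.2345678, 2.3456789], "meter")
rounded = bb.round(2)
print(rounded)
```

Rounding is a natural candidate for this kind of delegation because it changes the magnitudes without changing the units.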

andrewgsavage commented 1 year ago

round is a bit odd because pandas expects PintArray.round to exist, yet round isn't specified in the ExtensionArray interface: https://pandas.pydata.org/docs/dev/reference/api/pandas.api.extensions.ExtensionArray.html It might work if __array_ufunc__ is implemented, but I'm not sure about that. I'll open a pandas issue if it doesn't.

I did find https://github.com/pandas-dev/pandas/issues/26730 from a long time ago, which sheds a little light (pandas can't expect ExtensionArray implementers to implement every possible pandas method).

I agree, it would be great to have a tracker list of what works and what doesn't, especially now that the scalar issue has been fixed. I'd welcome a tracker issue.

Looking through the pandas Series docs, https://pandas.pydata.org/docs/dev/reference/series.html , there are a lot of methods! I think most are already tested in the extensiontests.

Actually, I expect most functions listed there to work: the indexing, binary ops, construction, missing, and reshaping operations are checked in the pandas extensiontests. The methods in the computations section aren't tested, so they would be worth checking: https://pandas.pydata.org/docs/dev/reference/series.html#computations-descriptive-stats
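One possible way to build such a works/doesn't-work list is a small probe that calls each method and records whether it raises. The sketch below is only an illustration: probe_methods is a hypothetical helper, not part of pandas or pint-pandas, and it is demonstrated on a plain list so that it runs without extra packages. In practice one would presumably pass a Series with dtype 'pint[m]' and method names from the computations section, such as round, abs, or clip.

```python
from typing import Any, Dict, Iterable, Tuple


def probe_methods(obj: Any, calls: Iterable[Tuple[str, tuple]]) -> Dict[str, str]:
    """Call each named method on obj and record whether it works or raises."""
    report: Dict[str, str] = {}
    for name, args in calls:
        try:
            # A missing attribute and a failing call both count as failures.
            getattr(obj, name)(*args)
        except Exception as exc:
            report[name] = f"fails: {type(exc).__name__}"
        else:
            report[name] = "works"
    return report


# Demo on a plain list: list.count exists, list.round does not.
demo = probe_methods([3, 1, 2], [("count", (1,)), ("round", (2,))])
for method, status in demo.items():
    print(f"{method}: {status}")
```

The missing round on the list fails with the same kind of AttributeError as the traceback in the original report, which is the sort of entry a tracker issue would collect.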

andrewgsavage commented 8 months ago

https://github.com/pandas-dev/pandas/pull/54582 is taking forever to get anywhere!