hgrecco / pint-pandas

Pandas support for pint

Add native ufunc support for PintArrays #222

Closed Nick-Hemenway closed 1 month ago

Nick-Hemenway commented 3 months ago

Currently, to use much of NumPy's functionality, a user must first convert their series to a standard pint Quantity array. It would be great if operating on unit-aware pandas series with NumPy were as seamless as working with regular pint (not pint-pandas) and NumPy.

Example:

import numpy as np
import pandas as pd
from pint_pandas import PintArray

df = pd.DataFrame({'angle_col': PintArray([1,2,3], 'degrees'), 'area_col': PintArray([4,9,16], 'inches**2')})

# desired behavior (this currently throws an error)
df1 = df.assign(
    cos_col = np.cos(df.angle_col),
    length_col = np.sqrt(df.area_col),
)

# currently required behavior for this to work
df1 = df.assign(
    cos_col = np.cos(df.angle_col.pint.quantity),
    length_col = np.sqrt(df.area_col.pint.quantity),
)

The currently required form is verbose: it adds extra typing and mental overhead when working with large datasets that have many columns.

I think this functionality might be related to Issue #65

andrewgsavage commented 3 months ago

Yeah, that would be very nice to have working.

For now you can use df['cos_col'] = df.angle_col.apply(lambda x: np.cos(x))

andrewgsavage commented 1 month ago

This works now. When I run it I get:

(<Quantity([1 2 3], 'degree')>,)
(<Quantity([ 4  9 16], 'inch ** 2')>,)

angle_col        pint[degree]
area_col      pint[inch ** 2]
cos_col                pint[]
length_col         pint[inch]
dtype: object
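
For background on what "native ufunc support" means here: NumPy hands calls such as np.sqrt(obj) to obj.__array_ufunc__ when the object defines that method. The sketch below illustrates that protocol with a toy container. It is not pint-pandas code and does not show how PintArray implements it; the class name UnitArray and its string-based unit rules are hypothetical and exist only for illustration.

```python
import numpy as np


class UnitArray:
    """Hypothetical toy container: a NumPy array plus a unit label (a plain string)."""

    def __init__(self, values, unit):
        self.values = np.asarray(values, dtype=float)
        self.unit = unit

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls such as np.sqrt(x) are handled in this sketch.
        if method != "__call__" or kwargs:
            return NotImplemented
        # Unwrap any UnitArray inputs to raw ndarrays.
        raw = [x.values if isinstance(x, UnitArray) else x for x in inputs]
        if ufunc is np.sqrt:
            # Toy unit rule: only units written as "<base>**2" are supported.
            if not self.unit.endswith("**2"):
                return NotImplemented
            return UnitArray(np.sqrt(raw[0]), self.unit[: -len("**2")])
        if ufunc is np.cos:
            # Convert degrees to radians first; the result is dimensionless.
            if self.unit != "degrees":
                return NotImplemented
            return UnitArray(np.cos(np.deg2rad(raw[0])), "")
        # Any other ufunc is unsupported, so NumPy raises TypeError.
        return NotImplemented


area = UnitArray([4, 9, 16], "inches**2")
length = np.sqrt(area)  # dispatches to UnitArray.__array_ufunc__

angle = UnitArray([0, 60, 90], "degrees")
cosine = np.cos(angle)  # also dispatched; result carries an empty unit
```

A real unit-aware array would delegate the unit arithmetic to pint rather than parse strings, but the dispatch hook is the same idea: the ufunc call reaches the container, which decides how to compute the values and what unit the result carries.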