hgrecco / pint-pandas

Pandas support for pint

Support for pandas query #162

Open swifmaneum opened 1 year ago

swifmaneum commented 1 year ago

Hello,

I wanted to ask whether the pandas query function is supported in pint-pandas, or whether there are plans to support it.

I haven't got this to work and I couldn't find any information in the docs or any issue regarding this topic.

What I've tried and what I could imagine the support to look like:

import pandas as pd
from pint import Quantity as Q

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})
df['power'] = df['torque'] * df['angular_velocity']

df.query(f"torque >= {Q(2, 'pint[lbf ft]')}")

On the other hand I'd expect df.query(f"torque >= 2") to fail.

MichaelTiemannOSC commented 1 year ago

I made it work this way:

import pandas as pd
# must import pint_pandas to make "pint[]" understood
import pint_pandas
from pint import Quantity as Q

# eval cannot deal with class constructors as functions, so make it a function
def q_for_eval(magnitude, unit=None):
    return Q(magnitude, unit)

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})
df['power'] = df['torque'] * df['angular_velocity']

# Cannot pass "pint[x]" to Q--it is for PintArrays only
# To access local namespace, prefix function name with @
df.query(f"torque >= @q_for_eval(2, 'lbf ft')")

Pint Pandas gives us the answer:

>>> df.query(f"torque >= @q_for_eval(2, 'lbf ft')")
  torque angular_velocity power
1    2.0              2.0   4.0
2    2.0              2.0   4.0
3    3.0              3.0   9.0

swifmaneum commented 1 year ago

Thanks @MichaelTiemannOSC, prefixing with @ works for me:

quantity = Q(2, 'lbf ft')
print(df.query(f"torque >= @quantity"))

I'm still wondering whether df.query(f"torque >= {Q(2, 'lbf ft')}") should actually work, or whether the error is expected behaviour.

mutricyl commented 4 weeks ago

@quantity runs the query with the actual value of the quantity variable, while the f-string inserts the string form (str) of quantity into the query text. So df.query(f"torque >= {Q(2, 'lbf ft')}") is equivalent to df.query("torque >= 2 foot * force_pound"), which does not work: foot and force_pound are not interpreted as pint units, and they might be confused with df columns.

Details are in this closed pandas issue.