hgrecco / pint-pandas

Pandas support for pint

dimensionless series confuses pint-pandas #143

Closed MichaelTiemannOSC closed 1 year ago

MichaelTiemannOSC commented 1 year ago

The following sample code shows that pint-pandas knows how to multiply a dimensionless thing by a thing with units to get a correctly scaled result with units. But not always.

import pandas as pd
import pint
from pint import Quantity as Q_
import pint_pandas

df_pp = pd.DataFrame({2019: {('Global', 'Steel'): Q_(0.0, 'dimensionless'), ('Europe', 'Steel'): Q_(0.0, 'dimensionless')},
                      2020: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},
                      2021: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},
                      2022: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},
                      2023: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},
                      2024: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},
                      2025: {('Global', 'Steel'): Q_(0.00306, 'dimensionless'), ('Europe', 'Steel'): Q_(0.00841, 'dimensionless')},})
df_partial_pp = df_pp.add(1.0).cumprod(axis=1)

print(df_partial_pp)

base_year_production = Q_(123.45, 't')

co_cumprod = df_partial_pp.loc["Global", "Steel"] * base_year_production

print(co_cumprod)

co_cumprod_values = df_partial_pp.loc["Global", "Steel"].values * base_year_production

print(co_cumprod_values)

In the first case, where we multiply a dimensionless Series by a production quantity in tonnes (t), we get a dimensionless result. In the second case, where we multiply the Series' .values array instead, we get the correctly quantified amount in tonnes.

andrewgsavage commented 1 year ago

You are not using pint-pandas here; the DataFrame is object dtype, so each cell holds a plain Quantity object rather than belonging to a pint-typed column.

df_pp.dtypes

2019    object
2020    object
2021    object
2022    object
2023    object
2024    object
2025    object
dtype: object

MichaelTiemannOSC commented 1 year ago

Now that df_pp.astype('pint[]') works, the fix for my problem is obvious.