hgrecco / pint-pandas

Pandas support for pint

Pr/137 compat check #245

Closed MichaelTiemannOSC closed 1 month ago

MichaelTiemannOSC commented 1 month ago

PR #137 adds _get_common_dtype to PintType so that PintType operations can be performed on a mix of PintType and numeric values (with the latter being promoted to the PintType for the purposes of the operation). However, when multiple PintType elements are present, all of them must be unit-compatible; otherwise the operation would attempt to combine PintType elements whose units cannot be converted to one another.
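For readers unfamiliar with the idea, here is a minimal sketch of the kind of check described above. It is not the code from this PR: `UnitDtype`, `NUMERIC`, and `common_dtype` are hypothetical stand-ins written with the standard library only, and dimensionality is reduced to a plain string. The one thing borrowed from pandas is the contract that `_get_common_dtype` returns `None` when no common dtype exists.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


@dataclass(frozen=True)
class UnitDtype:
    """Hypothetical stand-in for PintType: a unit name plus its dimensionality."""
    units: str
    dimensionality: str


NUMERIC = "float64"  # hypothetical marker for any plain numeric dtype


def common_dtype(dtypes: Sequence[Union[UnitDtype, str]]) -> Optional[UnitDtype]:
    """Return a unit dtype all inputs can be promoted to, or None if there is none."""
    unit_dtypes = [d for d in dtypes if isinstance(d, UnitDtype)]
    if not unit_dtypes:
        return None  # nothing unit-aware to promote to
    first = unit_dtypes[0]
    # Every unit-aware dtype must share the first one's dimensionality.
    if any(d.dimensionality != first.dimensionality for d in unit_dtypes):
        return None  # e.g. length mixed with mass: no common dtype
    return first  # numeric inputs are promoted to this dtype


km = UnitDtype("km", "[length]")
m = UnitDtype("m", "[length]")
kg = UnitDtype("kg", "[mass]")

print(common_dtype([km, NUMERIC]))  # numeric promoted to km
print(common_dtype([km, m]))        # compatible units: km
print(common_dtype([km, kg]))       # incompatible units: None
```

Returning `None` rather than raising leaves the caller free to pick a fallback, which is the design choice the rest of this thread turns on.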

MichaelTiemannOSC commented 1 month ago

It appears that the new compatibility check prevents the eval function from operating as expected under Pandas 3.0.0.

@mutricyl

MichaelTiemannOSC commented 1 week ago

This code used to work, but now fails:

import pandas as pd
import numpy as np
import pint_pandas

km = pd.Series([1.0, 2.0, np.nan], dtype="pint[km]")
kg = pd.Series([1.0, 2.0, np.nan], dtype="pint[kg]")

xx = pd.DataFrame({"a": km, "b": km})
yy = pd.DataFrame({"a": kg, "b": kg})

zz = pd.concat([xx, yy], axis=0).reset_index()

The [ed2b198](https://github.com/hgrecco/pint-pandas/pull/245/commits/ed2b198a5c0c624d91f971a00f9a49ecc10ea6ba) change is the culprit, I believe.

Note that if we allow PintType to set kind to 'O' (commented out at line 52 of pint_array.py), then PintType behaves well enough for pd.concat to do its job.

mutricyl commented 1 week ago

@MichaelTiemannOSC I do not quite understand how your code used to work, since it tries to mix km and kg in the same column, which is obviously not possible. Is there a copy/paste issue in your code?

MichaelTiemannOSC commented 1 week ago

A pandas DataFrame can collect heterogeneous types. The resulting columns won't be proper PintArrays, but they won't fail, either. They are just a series of Quantities with dtype object.
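To illustrate the object-dtype behaviour described here without depending on pint, the sketch below uses a hypothetical `Qty` dataclass as a stand-in for a pint Quantity. It only shows that pandas will happily concatenate object-dtype Series holding values with different units; it says nothing about how pint-pandas itself behaves.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class Qty:
    """Hypothetical stand-in for a pint Quantity; not part of pint."""
    magnitude: float
    units: str


# Two object-dtype Series whose elements carry different units.
km = pd.Series([Qty(1.0, "km"), Qty(2.0, "km")], dtype=object)
kg = pd.Series([Qty(1.0, "kg"), Qty(2.0, "kg")], dtype=object)

# Row-wise concatenation succeeds: the result is a plain object column.
mixed = pd.concat([km, kg], axis=0).reset_index(drop=True)
print(mixed.dtype)
print([q.units for q in mixed])
```

The mixed column loses vectorised unit handling, since each element is just a Python object, but nothing raises.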