Open DamianBarabonkovQC opened 2 years ago
Is there anything that needs to be addressed regarding this in kartothek?
I have a hacky patch in filter_array_like that looks like:
    with np.errstate(invalid="ignore"):
        if op == "==":
            if pd.isnull(value):
                np.logical_and(pd.isnull(array_like), mask, out=out)
            else:
                res_eq = array_like == value
                np.logical_and(res_eq.fillna(False), mask, out=out)
It basically fills any NA in the comparison result with False before handing it to np.logical_and.
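The idea behind the patch can be sketched in isolation. This is only a minimal illustration, not kartothek's actual code: `filter_eq` is a hypothetical stand-in for the `==` branch of filter_array_like, and the `hasattr` guard is an assumption added so that plain NumPy inputs (which have no `fillna`) still work.

```python
import numpy as np
import pandas as pd


def filter_eq(array_like, value, mask):
    """Hypothetical stand-in for the `==` branch of filter_array_like."""
    out = np.zeros(len(array_like), dtype=bool)
    with np.errstate(invalid="ignore"):
        if pd.isnull(value):
            # Filtering for nulls: pd.isnull already returns a plain bool array.
            matches = pd.isnull(array_like)
        else:
            matches = array_like == value
            # Nullable pandas arrays keep pd.NA in the comparison result;
            # treat those entries as "no match" before handing over to NumPy.
            if hasattr(matches, "fillna"):
                matches = matches.fillna(False)
        np.logical_and(np.asarray(matches, dtype=bool), mask, out=out)
    return out


values = pd.array([True, False, None], dtype="boolean")
mask = np.ones(len(values), dtype=bool)

result_true = filter_eq(values, True, mask)
result_null = filter_eq(values, None, mask)
```

With this sketch, the missing entry is excluded when filtering for `True` and is the only entry selected when filtering for a null value.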
Problem description
In older versions of pandas (before commit https://github.com/pandas-dev/pandas/commit/b2d54d9c16990bd8eaeacd4de24fc33cfdbfb43b), when filter_array_like saw a pd.NA in a pandas BooleanArray, it treated it as False. In newer versions (after that commit), the pd.NA is kept as pd.NA, which causes an error when the result is cast to a NumPy array. This relates to the pandas issue https://github.com/pandas-dev/pandas/issues/45249, which reflects a behavioral change and not a bug; the old behavior of treating pd.NA as False was actually the bug.
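The NA propagation described above can be observed with a small pandas-only snippet. This is a sketch for illustration rather than the reproduction from the original report; the variable names are made up, and it assumes a pandas version with nullable boolean arrays.

```python
import pandas as pd

flags = pd.array([True, False, None], dtype="boolean")

# Elementwise comparison on a nullable array: the missing entry stays pd.NA.
compared = flags == True

# Positions where the comparison result is missing.
na_positions = pd.isna(compared).tolist()

# Converting to a plain NumPy bool array fails while NA values are present.
try:
    compared.to_numpy(dtype=bool)
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Supplying an explicit replacement for NA makes the conversion well defined.
filled = compared.to_numpy(dtype=bool, na_value=False).tolist()
```

Passing `na_value=False` to `to_numpy` is an alternative to calling `fillna(False)` first; either way the decision to treat missing values as non-matching is made explicitly.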
Example code (ideally copy-pastable)
Used versions