Open shobsi opened 2 months ago
Thanks for the report. This is due to the use of `lib.fast_zip` in `MultiIndex._values`. That uses `PyArray_GETITEM`, which returns a Python int instead of the NumPy scalar. In addition, in `MultiIndex._values` we cast any ExtensionDtype values to object dtype. Patching these two so that the OP example works, I'm only seeing a handful of test failures, mostly in join.
Without a MultiIndex, `df.index._values` returns an IntegerArray that, when indexed, does give NumPy scalars. I'm wondering if we want to make this change to MultiIndex for the sake of consistency.
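To make the comparison above concrete, here is a minimal sketch that inspects the scalar types behind a flat index and a MultiIndex. The data and variable names are hypothetical, `_values` is a private attribute, and the printed type names may differ between pandas versions, so no expected output is claimed.

```python
import pandas as pd

# Hypothetical data: a nullable Int64 array used as an index level.
arr = pd.array([1, 2], dtype="Int64")

single = pd.Index(arr)                                 # flat index
multi = pd.MultiIndex.from_arrays([arr, ["x", "y"]])   # two-level index

# Container and scalar type behind the flat index.
print(type(single._values).__name__, type(single._values[0]).__name__)

# For a MultiIndex, _values is an array of tuples; inspect the first
# element of the first tuple to see which scalar type survives.
print(type(multi._values).__name__, type(multi._values[0][0]).__name__)
```

If the two scalar type names differ, that is the inconsistency discussed in this comment.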
cc @jbrockmendel
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
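The reporter's original snippet is not included in this text. As a stand-in only, the sketch below compares the index label types a function sees when it receives a row directly versus through `DataFrame.apply` with `axis=1`. The data, the nullable `Int64` level, and the helper `label_types` are all assumptions made for illustration, not the original example.

```python
import pandas as pd

# Hypothetical frame with a two-level index; level "a" uses nullable Int64.
df = pd.DataFrame(
    {
        "a": pd.array([1, 2], dtype="Int64"),
        "b": [10, 20],
        "c": [0.5, 1.5],
    }
).set_index(["a", "b"])


def label_types(row):
    # row.name is the row's index label, a tuple for a MultiIndex.
    return ", ".join(type(v).__name__ for v in row.name)


direct = label_types(df.iloc[0])                    # row passed directly
via_apply = df.apply(label_types, axis=1).iloc[0]   # row passed by apply

print("direct:   ", direct)
print("via apply:", via_apply)
```

Whether the two lines match depends on the pandas version in use; a mismatch corresponds to the behavior reported here.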
Issue Description
I ran into this issue where I have a Series-processing UDF that works fine if I pass a DataFrame row to it directly, but breaks when I apply it with `DataFrame.apply` and `axis=1`. It seems the NumPy value in the index is lost in the latter case.
Expected Behavior
A user-defined function should have the same behavior whether a DataFrame row is passed to it directly or via `DataFrame.apply`.
Installed Versions